🧠

Typical machine learning systems do so by treating the prediction problem as a classification problem and computing scores for each outcome using a giant so-called softmax layer, which transforms raw scores into a probability distribution over words. (설명1) With this technique, the uncertainty of the prediction is represented by a probability distribution over all possible outcomes, provided that there is a finite number of possible outcomes. (설명2) In CV, on the other hand predicting “missing” frames in a video missing patches in an image, or missing segment in a speech signal involves a prediction of high-dimensional continuous objects rather than discrete outcomes. (설명3) It is not possible to explicitly represent all the possible video frames and associate a prediction score to them. In fact, we may never have techniques to represent suitable probability distributions over high-dimensional continuous spaces, such as the set of all possible video frames. (설명4)

출처

수집시간

2021/09/26 01:19

연결완료

1 more property

설명

아니나다를까 바로 뒤에 finite number 이라고 한번 더 이야기해준다.

이와 달리 비전 태스크는 보통 연속적이다. (즉, 다음에 나올 수 있는 애들이 무한하다.)

설명1에서 이야기했듯, 앞으로도 이렇게 굉장히 고차원에서의 연속적인 확률분포를 "giant so-called softmax layer, which transforms raw scores into a probability distribution" 에서 모델링할 수는 없을거다!