Unlikely-Video-663 t1_izktnhx wrote
You might be able to recast the problem by assuming the labels are actually drawn from some distribution, putting a simple likelihood function over it, and then learning the parameters of that distribution. This is not fully theoretically sound -- you won't capture any epistemic uncertainty, but you'll get most of the aleatoric -- so depending on your use case, it might work.
In practice: use, for example, a Gaussian likelihood, and learn the variance alongside the mean with a Gaussian NLL loss. As long as your samples stay within the training distribution, yada yada, this can work OK-ish ..
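A minimal sketch of that idea in PyTorch, assuming a toy 1D regression problem with heteroscedastic noise (the model architecture and data here are made up for illustration): a small MLP predicts both a mean and a positive variance, trained with `torch.nn.GaussianNLLLoss`.

```python
import torch
import torch.nn as nn

# Hypothetical two-headed MLP: one head predicts the mean, the other the
# (positive) variance of a Gaussian over the label.
class MeanVarNet(nn.Module):
    def __init__(self, in_dim=1, hidden=32):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())
        self.mean_head = nn.Linear(hidden, 1)
        self.var_head = nn.Linear(hidden, 1)

    def forward(self, x):
        h = self.body(x)
        mean = self.mean_head(h)
        # softplus + epsilon keeps the predicted variance strictly positive,
        # which GaussianNLLLoss requires
        var = nn.functional.softplus(self.var_head(h)) + 1e-6
        return mean, var

torch.manual_seed(0)
# Toy data whose noise grows with |x| (aleatoric, input-dependent)
x = torch.linspace(-2, 2, 256).unsqueeze(1)
y = torch.sin(3 * x) + 0.3 * torch.abs(x) * torch.randn_like(x)

model = MeanVarNet()
loss_fn = nn.GaussianNLLLoss()  # signature: loss_fn(input, target, var)
opt = torch.optim.Adam(model.parameters(), lr=1e-2)

for step in range(500):
    mean, var = model(x)
    loss = loss_fn(mean, y, var)
    opt.zero_grad()
    loss.backward()
    opt.step()
```

The predicted `var` then serves as a per-input aleatoric uncertainty estimate; as noted above, it says nothing about epistemic uncertainty, and it degrades once inputs drift out of distribution.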
Otherwise, there are plenty of recalibration techniques to get better results
Unlikely-Video-663 t1_j284flc wrote
Reply to [D] In vision transformers, why do tokens correspond to spatial locations and not channels? by stecas
In CNNs you usually already have long-range dependencies channel-wise, and IMHO one of the advantages of ViTs is allowing long-range spatial information flow as well.
So channel-wise tokenization would not improve upon CNNs... maybe?