you-get-an-upvote t1_iqnunxf wrote
The alpha used in the paper is the inverse of the class frequency. So class1 (frequency 0.25) is scaled by 4 (i.e. 1 / 0.25) and class2 (frequency 0.75) is scaled by 1.33 (1 / 0.75).
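A minimal sketch of that inverse-frequency weighting (the class names and frequencies are just the ones from this example, not anything standardized):

```python
# Inverse-frequency class weights, as described above:
# each class's alpha is 1 / (its fraction of the dataset).
freqs = {"class1": 0.25, "class2": 0.75}
alphas = {cls: 1.0 / f for cls, f in freqs.items()}

print(alphas)  # class1 -> 4.0, class2 -> 1.33...
```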
But also I want to take this moment to talk about focal loss.
The point of focal loss really isn't downweighting common classes. Note that the original definition of focal loss in the paper doesn't use α. The formula you give is the "α-balanced variant of focal loss" which the authors "adopt in [their] experiments as it yields slightly improved accuracy over the non-α-balanced form".
What focal loss does do is decrease the importance of "easy" examples on the loss -- that is, it decreases the importance of examples that the model gets very correct. When datasets are imbalanced, common classes tend to be "easy" in this sense.
For example, consider a dataset that is 99% classA and 1% classB. A trivial model that predicts every datapoint has a 99% chance of being classA will get a very low loss on classA datapoints and a very high loss on classB datapoints.
Note, though, that "common" and "easy" are not the same thing, since the more common class doesn't have to be the easier one. Suppose I train a model on CIFAR10 but add an additional "image is a solid color" class. Even if this extra class has only 10% as many datapoints as the other classes, it's so easy to classify compared to the other classes that focal loss will assign it lower weight.
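To make the "downweights easy examples" point concrete, here's a quick numeric sketch of the (non-α-balanced) focal loss FL(p_t) = -(1 - p_t)^γ · log(p_t) from the paper, compared to plain cross-entropy, using γ = 2 (the values below are illustrative, not from the original comment):

```python
import math

def cross_entropy(p_t):
    """Cross-entropy for one example; p_t is the predicted
    probability of the true class."""
    return -math.log(p_t)

def focal_loss(p_t, gamma=2.0):
    """Focal loss: the (1 - p_t)**gamma term shrinks the loss
    on examples the model already gets very right."""
    return -((1.0 - p_t) ** gamma) * math.log(p_t)

# Easy example: model is 99% confident and correct.
print(cross_entropy(0.99), focal_loss(0.99))  # FL is ~10,000x smaller
# Hard example: model gives the true class only 1%.
print(cross_entropy(0.01), focal_loss(0.01))  # FL is barely reduced
```

So the easy example's contribution is crushed by the (1 - 0.99)² = 0.0001 factor, while the hard example keeps almost all of its cross-entropy loss.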
you-get-an-upvote t1_ir9978p wrote
Reply to comment by Zatania in [R] Google Colab alternative by Zatania
FYI loading many small files from drive is very slow. If this applies to you, I recommend zipping the files, uploading to drive, copying the zipped file onto your colab machine, and unzipping.
from google.colab import drive
import os

drive.mount('/content/drive')
!cp '/content/drive/My Drive/foo.zip' '/tmp/foo.zip'
os.chdir("/tmp")
!unzip -qq 'foo.zip'
Otherwise, if your dataloader is copying files over from Drive one at a time, it's going to be really slow.
Also I'd make sure you're not accidentally loading the entire dataset into RAM (assuming your crash is due to lack of RAM?).