Technical-Owl-6919 t1_iwco8xr wrote on November 14, 2022 at 5:44 PM

Reply to comment by Tiny-Mud6713 in [P] Need help with this CNN transfer learning problem by Tiny-Mud6713

Yes, and train them; everything is unfrozen.

Technical-Owl-6919 t1_iwclyq8 wrote on November 14, 2022 at 5:29 PM

Reply to comment by Tiny-Mud6713 in [P] Need help with this CNN transfer learning problem by Tiny-Mud6713

So, my friend, then you have to train the network from scratch; it is getting trapped in a local minimum. A smaller network might help. Try training a ResNet15 or something similar from scratch. This happened to me once: I was working with simulation images and could not get the AUC score above 0.92, but once I trained the network from scratch I got AUC scores close to 0.98 or 0.99.

Technical-Owl-6919 t1_iwckvp7 wrote on November 14, 2022 at 5:22 PM

Reply to comment by Tiny-Mud6713 in [P] Need help with this CNN transfer learning problem by Tiny-Mud6713

Something seems to be wrong; the validation scores should not be so low. Exactly what type of data are you dealing with?

Technical-Owl-6919 t1_iwch8sv wrote on November 14, 2022 at 4:58 PM

Reply to comment by Tiny-Mud6713 in [P] Need help with this CNN transfer learning problem by Tiny-Mud6713

From my experience, I would suggest using EfficientNets in the first place. Secondly, please don't unfreeze the model at the very beginning. Train the frozen model with your custom head for a few epochs, and when the loss saturates, reduce the LR, unfreeze the entire network, and train again. By the way, did you try LR scheduling?
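
To make the "freeze first, then unfreeze at a lower LR" recipe concrete, here is a rough, framework-agnostic sketch in plain Python. It only decides when the frozen phase has saturated and what settings the second phase would use; it does not train anything. The function names, the patience of 3 epochs, the `min_delta` threshold, and the 10x LR drop are all illustrative assumptions, not values from this thread.

```python
def plateau_epoch(losses, patience=3, min_delta=1e-3):
    """Return the epoch index at which the loss has failed to improve by at
    least min_delta for `patience` consecutive epochs, or None if it never
    plateaus. (patience and min_delta are assumed values.)"""
    best = float("inf")
    stale = 0
    for epoch, loss in enumerate(losses):
        if loss < best - min_delta:
            best = loss      # meaningful improvement: reset the counter
            stale = 0
        else:
            stale += 1       # no meaningful improvement this epoch
            if stale >= patience:
                return epoch
    return None


def fine_tune_settings(head_lr, lr_drop=0.1):
    """Settings for phase 2: backbone unfrozen, LR reduced.
    The 10x drop is a common rule of thumb, assumed here."""
    return {"backbone_trainable": True, "lr": head_lr * lr_drop}


# Phase 1: backbone frozen, only the custom head is trained.
# Hypothetical validation losses recorded during that phase:
frozen_phase_losses = [1.0, 0.7, 0.5, 0.45, 0.4499, 0.4498, 0.4497]

switch_at = plateau_epoch(frozen_phase_losses)
phase2 = fine_tune_settings(head_lr=1e-3)
print("unfreeze after epoch", switch_at, "with settings", phase2)
```

In an actual training loop, the same check would run after each epoch of the frozen phase, and the second phase would start from the weights reached at the switch point.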

Technical-Owl-6919 t1_iwcakcv wrote on November 14, 2022 at 4:13 PM

Reply to [P] Need help with this CNN transfer learning problem by Tiny-Mud6713

One thing I don't know why no one has mentioned yet: why have you kept two linear layers? Two linear layers one after the other in a transfer learning setup is something that will lead to very bad generalization. DenseNet is large enough to extract features and make them simple enough for a single layer to handle. Try removing the dense layer between the output and functional(DenseNet). Also try swapping the Flatten for global max or global average pooling.
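
A quick back-of-the-envelope sketch of why this matters, in plain Python. It assumes the backbone emits a 7 x 7 x 1024 feature map (what DenseNet121 produces for 224 x 224 inputs; the thread does not say which DenseNet variant is used), a hypothetical 256-unit hidden layer, and a hypothetical 8 classes. It compares the head's parameter count for Flatten plus two dense layers against global average pooling plus a single dense layer, and shows what global average pooling computes.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of one fully connected layer."""
    return n_in * n_out + n_out


def global_average_pool(feature_map):
    """Average each channel over all spatial positions.
    feature_map is a nested list indexed [row][col][channel]."""
    h, w, c = len(feature_map), len(feature_map[0]), len(feature_map[0][0])
    return [
        sum(feature_map[i][j][k] for i in range(h) for j in range(w)) / (h * w)
        for k in range(c)
    ]


# Assumed backbone output shape (DenseNet121 at 224x224 input).
H, W, C = 7, 7, 1024
HIDDEN = 256       # hypothetical size of the extra dense layer
NUM_CLASSES = 8    # hypothetical number of classes

# Head A: Flatten -> Dense(HIDDEN) -> Dense(NUM_CLASSES)
flatten_features = H * W * C
two_layer_head = (dense_params(flatten_features, HIDDEN)
                  + dense_params(HIDDEN, NUM_CLASSES))

# Head B: global average pooling -> Dense(NUM_CLASSES)
single_layer_head = dense_params(C, NUM_CLASSES)

print("Flatten + 2 dense layers:", two_layer_head, "parameters")
print("GAP + 1 dense layer:     ", single_layer_head, "parameters")

# Tiny 2 x 2 x 2 example of what global average pooling does.
tiny_map = [[[1, 2], [3, 4]],
            [[5, 6], [7, 8]]]
print(global_average_pool(tiny_map))  # one mean per channel
```

Under these assumptions the flattened two-layer head has roughly 12.8 million trainable parameters against about 8 thousand for the pooled single-layer head, which gives a small dataset far more room to overfit.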