I tried swapping out ResNet for EfficientDet-D3 (the variant with the EfficientNet-B3 backbone) in the Eager Few Shot OD Training TF2 tutorial.
Given all the positive feedback EfficientDet has received, I am very surprised that ResNet outperformed it on this tutorial. In total, EfficientDet was trained for 1700 batches, while I ran ResNet for the tutorial's standard 100 batches.
EfficientDet-D3, end of the last 1000-batch run (1700 batches in total):
batch 950 of 1000, loss=0.21693243
batch 955 of 1000, loss=0.18070191
batch 960 of 1000, loss=0.1715184
batch 965 of 1000, loss=0.23656633
batch 970 of 1000, loss=0.16813375
batch 975 of 1000, loss=0.23602965
batch 980 of 1000, loss=0.14852181
batch 985 of 1000, loss=0.18400437
batch 990 of 1000, loss=0.22741726
batch 995 of 1000, loss=0.20477971
Done fine-tuning!
ResNet for 100 batches:
batch 0 of 100, loss=1.1079819
batch 10 of 100, loss=0.07644452
batch 20 of 100, loss=0.08746071
batch 30 of 100, loss=0.019333005
batch 40 of 100, loss=0.0071129226
batch 50 of 100, loss=0.00465827
batch 60 of 100, loss=0.0041421074
batch 70 of 100, loss=0.0026128457
batch 80 of 100, loss=0.0023376464
batch 90 of 100, loss=0.002139934
Done fine-tuning!
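To put a number on the gap, here is a minimal sketch (not part of the tutorial) that parses log lines in the format above and averages the last few logged losses. The helper names `parse_losses` and `mean_tail_loss` are made up for illustration, and the sample strings are just the last five lines of each log above.

```python
import re

# Matches lines such as "batch 950 of 1000, loss=0.21693243".
LOG_LINE = re.compile(r"batch (\d+) of (\d+), loss=([0-9.eE+-]+)")


def parse_losses(log_text):
    """Return a list of (batch_index, loss) pairs found in the log text."""
    pairs = []
    for line in log_text.splitlines():
        match = LOG_LINE.search(line)
        if match:
            pairs.append((int(match.group(1)), float(match.group(3))))
    return pairs


def mean_tail_loss(log_text, last_n=5):
    """Average the loss over the last `last_n` logged batches."""
    losses = [loss for _, loss in parse_losses(log_text)][-last_n:]
    if not losses:
        raise ValueError("no loss lines found")
    return sum(losses) / len(losses)


# Last five logged lines of each run, copied from the output above.
efficientdet_log = """\
batch 975 of 1000, loss=0.23602965
batch 980 of 1000, loss=0.14852181
batch 985 of 1000, loss=0.18400437
batch 990 of 1000, loss=0.22741726
batch 995 of 1000, loss=0.20477971
"""

resnet_log = """\
batch 50 of 100, loss=0.00465827
batch 60 of 100, loss=0.0041421074
batch 70 of 100, loss=0.0026128457
batch 80 of 100, loss=0.0023376464
batch 90 of 100, loss=0.002139934
"""

print("EfficientDet tail mean:", mean_tail_loss(efficientdet_log))
print("ResNet tail mean:", mean_tail_loss(resnet_log))
```

Averaging over several logged batches rather than reading a single line smooths out the batch-to-batch noise that is visible in the EfficientDet log.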
Why does EfficientDet need so much more training time than ResNet? Is it because EfficientDet-D3 (the variant I tested, with the EfficientNet-B3 backbone) has only about 12 million parameters, versus about 25 million for ResNet50? Or are there other reasons?
The end result (the .gif at the end of the tutorial) also shows a huge difference in accuracy, with ResNet performing much better.
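The parameter figures above are quoted from memory, so a quick way to check them is to count entries over the variable shapes directly. This is only a sketch: `count_parameters` is a hypothetical helper, and the example shapes are invented for illustration, not taken from either model.

```python
import math


def count_parameters(shapes):
    """Sum the number of scalar entries over an iterable of tensor shapes."""
    return sum(math.prod(shape) for shape in shapes)


# Hypothetical shapes for illustration only:
# one 3x3 conv kernel (64 -> 128 channels) plus its bias vector.
example_shapes = [(3, 3, 64, 128), (128,)]
print(count_parameters(example_shapes))
```

With a real model, the shapes would presumably come from its variables, for example `[tuple(v.shape) for v in model.trainable_variables]`, assuming `model` exposes `trainable_variables` the way Keras models do. Counting only the variables that are actually being fine-tuned would give a fairer comparison than the total.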
Thanks for any input!