I have trained a model on about 2000 images (10000 steps). At this point I have created 1500 more images that I would like to train the model on.
Should I combine these new images with the previous 2000 and train the model on all 3500 images, or should I train the model only on the new images? What is the best practice here for improving accuracy?
I believe it depends on how different (if at all) the new images are from the old ones. ML involves a lot of trial and error, so try a few options and see what works best. If you train only on the new images, you might want to reduce the learning rate by some factor (say original_lr/10) so your model doesn't forget what it learned from the first set of images. You should be able to test different ideas by loading the current model's weights into a test model.
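To make the "try a mix of options" idea concrete, here is a minimal, framework-agnostic sketch of how the candidate experiments could be laid out. The file names, the `original_lr` value, and the `build_training_set` helper are all hypothetical placeholders, not part of any library; the actual training call depends on whatever framework you used for the first run.

```python
import random


def build_training_set(old_items, new_items, replay_fraction=1.0, seed=0):
    """Return all new items plus a random fraction of the old items, shuffled."""
    if not 0.0 <= replay_fraction <= 1.0:
        raise ValueError("replay_fraction must be between 0 and 1")
    rng = random.Random(seed)  # fixed seed so experiments are repeatable
    n_replay = round(len(old_items) * replay_fraction)
    replayed = rng.sample(list(old_items), n_replay)
    combined = list(new_items) + replayed
    rng.shuffle(combined)
    return combined


# Hypothetical file names standing in for the 2000 old and 1500 new images.
old_images = [f"old_{i:04d}.jpg" for i in range(2000)]
new_images = [f"new_{i:04d}.jpg" for i in range(1500)]

original_lr = 1e-3  # assumed value; use whatever the first training run used

# Each experiment pairs a training set with a learning rate.
experiments = {
    "new_only_low_lr": (build_training_set(old_images, new_images, 0.0), original_lr / 10),
    "all_images": (build_training_set(old_images, new_images, 1.0), original_lr),
    "new_plus_half_old": (build_training_set(old_images, new_images, 0.5), original_lr / 10),
}

for name, (items, lr) in experiments.items():
    print(name, len(items), lr)
```

For each entry you would load the current model's weights into a fresh copy, train it on that set with that learning rate, and compare all copies on the same held-out validation set that contains both old and new images.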
When you are fine-tuning a network, you always need to take care with how much you are changing the original weights (e.g. the learning rate, how many layers you have unfrozen, etc.), because without any extra protection or technique, catastrophic forgetting could always be just around the corner.
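One way to control how far the original weights can move is to keep the early layers frozen and give the unfrozen layers progressively smaller learning rates the deeper they sit (sometimes called discriminative learning rates). The sketch below only computes such a per-layer plan; the layer names, the `layer_learning_rates` helper, and the `decay` factor are illustrative assumptions, and how you apply the rates depends on your framework.

```python
def layer_learning_rates(layer_names, n_unfrozen, base_lr, decay=0.5):
    """Map each layer name to a learning rate.

    The last `n_unfrozen` layers are trainable: the final layer gets
    `base_lr`, and each earlier trainable layer gets `decay` times the
    rate of the layer after it. All other layers get 0.0 (frozen).
    """
    if not 0 <= n_unfrozen <= len(layer_names):
        raise ValueError("n_unfrozen must be between 0 and the number of layers")
    first_trainable = len(layer_names) - n_unfrozen
    rates = {}
    for index, name in enumerate(layer_names):
        if index < first_trainable:
            rates[name] = 0.0  # frozen: weights stay exactly as they were
        else:
            steps_from_output = len(layer_names) - 1 - index
            rates[name] = base_lr * decay ** steps_from_output
    return rates


# Hypothetical four-layer network, ordered from input to output.
layers = ["conv1", "conv2", "conv3", "fc"]
rates = layer_learning_rates(layers, n_unfrozen=2, base_lr=1e-4)

for name in layers:
    print(name, rates[name])
```

Starting with only the last layer or two unfrozen and unfreezing more only if validation accuracy on the old images holds up is a cautious way to probe how much change the model tolerates.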
Another solution is to explore some active learning techniques.
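A simple active learning technique is uncertainty sampling: run the current model over candidate images and add the ones it is least sure about to the training set first. Below is a minimal sketch that ranks images by the entropy of their predicted class probabilities. The image names and probabilities are made up for illustration, and `select_most_uncertain` is a hypothetical helper, not a library function.

```python
import math


def prediction_entropy(probs):
    """Shannon entropy of a probability vector; higher means less certain."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_most_uncertain(predictions, k):
    """Return the k image ids whose predictions have the highest entropy.

    `predictions` maps an image id to its list of class probabilities.
    """
    ranked = sorted(
        predictions,
        key=lambda image_id: prediction_entropy(predictions[image_id]),
        reverse=True,
    )
    return ranked[:k]


# Made-up model outputs for four candidate images (three classes each).
predictions = {
    "img_a.jpg": [0.98, 0.01, 0.01],  # confident
    "img_b.jpg": [0.40, 0.35, 0.25],  # unsure
    "img_c.jpg": [0.70, 0.20, 0.10],  # fairly confident
    "img_d.jpg": [0.34, 0.33, 0.33],  # almost uniform, very unsure
}

to_label = select_most_uncertain(predictions, k=2)
print(to_label)
```

With the made-up numbers above, the two near-uniform predictions (`img_d.jpg` and `img_b.jpg`) are selected, since those are the images the model would likely learn the most from.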