TF Object Detection retrain a model based on new datasets

ChrisLo · December 8, 2022, 2:16pm

Hi guys. Quick question. Maybe one of you can help me with it.

Let’s say I have trained a model that detects a specific object in images, such as cats, and after a few weeks I have collected new data that I would like to use to improve the model further. Can I continue training the saved model with the new data, or do I have to train a new model on the combined old and new datasets?
Nothing changes about what I want the object detection model to detect. I simply want to give it more data to improve, but I can’t find an answer to whether I should train a completely new model or can continue training the current one.

Thx for taking your time to help me out.

Tanya · December 18, 2023, 6:46pm

@ChrisLo Welcome to the TensorFlow Forum!

You’re correct about having two options for improving your cat detection model with new data: continuing training the existing model or training a new one from scratch with both the old and new data. Both approaches have their advantages and disadvantages:

1. Continuing training the existing model:

Advantages:
- Faster training time as the model already has some knowledge about cats.
- Potentially better performance on initial examples it was trained on due to retaining those learned features.
- Avoids training a model from scratch, saving computational resources.
Disadvantages:
- Might be prone to overfitting to the new data, especially if the new data has different characteristics or biases.
- May not learn new features effectively if the initial model architecture is limited.
- Requires careful choice of learning rate and optimizer settings so that the update does not destabilize what the model has already learned; a lower learning rate than in the initial run is a common starting point.
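
One common way to reduce the risk of overfitting to the new data when continuing training is to mix a sample of the old examples back in. The sketch below is framework-agnostic and only illustrates that idea. The names `build_finetune_set` and `old_to_new_ratio`, as well as the file names, are hypothetical and are not part of the TF Object Detection API.

```python
import random


def build_finetune_set(old_examples, new_examples, old_to_new_ratio=1.0, seed=0):
    """Return all new examples plus a random sample of old ones, shuffled.

    old_to_new_ratio is an assumed knob: 1.0 means "as many old examples
    as new ones", 0.5 means "half as many", capped at what is available.
    """
    if old_to_new_ratio < 0:
        raise ValueError("old_to_new_ratio must be non-negative")
    rng = random.Random(seed)  # fixed seed keeps the mix reproducible
    n_old = min(len(old_examples), int(old_to_new_ratio * len(new_examples)))
    replayed = rng.sample(list(old_examples), n_old)
    mixed = list(new_examples) + replayed
    rng.shuffle(mixed)
    return mixed


# Hypothetical file lists standing in for the two datasets.
old = [f"old_{i}.jpg" for i in range(10)]
new = [f"new_{i}.jpg" for i in range(4)]
mixed = build_finetune_set(old, new, old_to_new_ratio=0.5, seed=1)
```

The resulting list would then be fed to whatever input pipeline the existing model was trained with. The right ratio depends on the data and is worth tuning.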

2. Training a new model from scratch with both datasets:

Advantages:
- Leverages all available data, potentially leading to better overall performance and adaptability.
- Less risk of overfitting to specific characteristics of either dataset.
- Allows for utilizing a different model architecture if needed to capture new features in the data.
Disadvantages:
- Requires more training time than continuing to train the existing model.
- May lose some performance on the initial examples used in the first training round.
- Requires more computational resources, especially for complex model architectures.

Choosing the best approach:

The best approach depends on several factors, including:

- Size and diversity of the new data: If the new data is small or doesn’t significantly differ from the old data, continuing training might be sufficient.
- Desired model performance: If you need the best possible performance across all types of cat images, training a new model from scratch might be better.
- Computational resources: Training a new model from scratch requires more resources, so continuing training might be preferred if resources are limited.
- Time constraints: Continuing training might be faster, especially for large datasets.

Recommendation:

It’s generally recommended to try both approaches and compare the resulting models’ performance on the same held-out validation set. This gives you concrete, data-driven evidence for which approach works best in your specific case.
There’s no one-size-fits-all answer.
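
As a rough illustration of such a comparison, the sketch below scores two sets of predicted boxes against the same validation annotations using IoU matching and a simple F1 score. It is a minimal sketch under stated assumptions: boxes are `(xmin, ymin, xmax, ymax)` tuples, predictions are already filtered by confidence, and the function names and sample data are hypothetical. It is not the COCO mAP that detection toolkits usually report, which would be the better metric for a real decision.

```python
def iou(a, b):
    """Intersection over union of two (xmin, ymin, xmax, ymax) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def score_on_validation(predictions, ground_truth, iou_threshold=0.5):
    """Greedy one-to-one matching of predicted boxes to ground-truth boxes.

    Both arguments map image id -> list of boxes. Images that are missing
    from ground_truth are ignored in this simplified version.
    """
    tp = fp = fn = 0
    for image_id, gt_boxes in ground_truth.items():
        unmatched = list(gt_boxes)
        for pred in predictions.get(image_id, []):
            best = max(unmatched, key=lambda g: iou(pred, g), default=None)
            if best is not None and iou(pred, best) >= iou_threshold:
                unmatched.remove(best)  # each ground-truth box matches once
                tp += 1
            else:
                fp += 1
        fn += len(unmatched)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical validation annotations and outputs of the two candidate models.
val_gt = {"img1": [(0, 0, 10, 10)], "img2": [(5, 5, 15, 15)]}
continued = {"img1": [(0, 0, 10, 10)], "img2": [(50, 50, 60, 60)]}
from_scratch = {"img1": [(1, 1, 10, 10)], "img2": [(5, 5, 15, 15)]}

scores = {
    "continued": score_on_validation(continued, val_gt),
    "from_scratch": score_on_validation(from_scratch, val_gt),
}
best_model = max(scores, key=lambda name: scores[name]["f1"])
```

The important part is that both models are scored on the same validation images, none of which were used for training in either run.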

I hope this helps!
