Hello everybody!
I am trying to train the object detection model “efficientdet_lite0” on my custom data. When I run the command
model = object_detector.create(train_data, model_spec=spec, epochs=50, batch_size=8, train_whole_model=True, validation_data=validation_data)
(taken almost unchanged from the tutorial “Google Colab”),
Google Colaboratory silently stops execution and disconnects at around epoch 17 of 50. When I run the same script on a small amount of data, training finishes successfully. I train on a GPU with around 8000 images.
- Could this be a lack of resources? I can’t see resource usage during training.
- Is there any way to split training into several shorter runs of fewer epochs each, and run them one after another?
- Is there any way to export and re-import an intermediate trained model, so that training can continue from one run to the next?
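To make the last two questions concrete, this is roughly the loop I have in mind. It is only a sketch in plain Python: `train_chunk` and `save_state` are hypothetical placeholders I made up, not Model Maker functions, and I don't know which real calls (if any) could fill them in.

```python
from typing import Callable, List, Tuple


def split_epochs(total_epochs: int, chunk_size: int) -> List[Tuple[int, int]]:
    """Return (start_epoch, num_epochs) pairs that together cover total_epochs."""
    if total_epochs < 0 or chunk_size <= 0:
        raise ValueError("total_epochs must be >= 0 and chunk_size must be > 0")
    return [
        (start, min(chunk_size, total_epochs - start))
        for start in range(0, total_epochs, chunk_size)
    ]


def train_in_chunks(
    total_epochs: int,
    chunk_size: int,
    train_chunk: Callable[[int, int], None],  # hypothetical: trains num_epochs starting at start_epoch
    save_state: Callable[[int], None],        # hypothetical: saves the model after the given epoch count
) -> int:
    """Run training chunk by chunk, saving after each chunk; return epochs completed."""
    completed = 0
    for start, count in split_epochs(total_epochs, chunk_size):
        train_chunk(start, count)
        completed = start + count
        save_state(completed)  # if the session dies, at most one chunk is lost
    return completed


if __name__ == "__main__":
    # Stand-ins that only print; the real training and export calls would go here.
    train_in_chunks(
        total_epochs=50,
        chunk_size=10,
        train_chunk=lambda start, count: print(f"train epochs {start}..{start + count - 1}"),
        save_state=lambda done: print(f"save after {done} epochs"),
    )
```

With 50 epochs and chunks of 10, a disconnect around epoch 17 would then cost only the epochs since the last save, provided the saved state can be loaded back in a new session.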
Any help would be appreciated!