Automated Restarting of Training of Deep Learning Models in TensorFlow

I am trying to automate the restart of a finished deep-learning training session in TensorFlow, so that training restarts itself repeatedly. Currently, to restart, I manually restart my kernel and re-run the training code.

Questions:

  1. I understand that “when training deep learning models, the model’s parameters, activations, and gradients are stored in the GPU memory.” How would I clear the GPU memory without the need to manually restart my kernel?

  2. When I automate the restart of model training, do I need to restart from the very beginning (importing libraries + data preprocessing), or can I restart from the point where I build and fit the model?

  3. How would I implement this?

Thanks in advance!

Comment: This is how I call, compile, fit, and save the model.

    import tensorflow as tf

    # Get model
    # build_model, input_shape and n_classes are defined earlier in the script
    def get_model():
        return build_model(input_shape, n_classes)

    uNet_model = get_model()

    # Compile model
    uNet_model.compile(optimizer=tf.optimizers.Adam(learning_rate=0.0001),
                       loss='categorical_crossentropy',
                       metrics=['accuracy'])

    # Print model summary
    uNet_model.summary()

    # Fit model
    # This is for a one-hot encoded (non-sparse) mask
    history = uNet_model.fit(train_rgb_input, train_mask_categorical,
                             batch_size=1,
                             epochs=1000,
                             validation_data=(val_rgb_input, val_mask_categorical),
                             # class_weight=class_weights,
                             verbose=1, shuffle=True)

    # Save model
    uNet_model.save("xxxx.hdf5")
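
If training should instead be resumed in a fresh session from the saved file rather than from a freshly built model, the saved HDF5 model can be loaded back. This is a minimal sketch, assuming the training arrays from the preprocessing step are already in memory:

    # Reload the previously saved model (architecture, weights and optimizer state);
    # if build_model uses custom layers, pass them via the custom_objects argument
    uNet_model = tf.keras.models.load_model("xxxx.hdf5")

    # Continue fitting from the restored weights
    history = uNet_model.fit(train_rgb_input, train_mask_categorical,
                             batch_size=1,
                             epochs=1000,
                             validation_data=(val_rgb_input, val_mask_categorical),
                             verbose=1, shuffle=True)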

If you want to “retrain” the model without recompiling it, call only these lines.
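
The lines referred to aren't included here; a plausible reading, given the snippet in the question, is that only the fit and save calls need to be re-run, since the compiled model and its optimizer state stay in memory:

    # Re-run only these two calls; the already-compiled uNet_model is reused,
    # so training continues from its current weights without recompiling
    history = uNet_model.fit(train_rgb_input, train_mask_categorical,
                             batch_size=1,
                             epochs=1000,
                             validation_data=(val_rgb_input, val_mask_categorical),
                             verbose=1, shuffle=True)

    uNet_model.save("xxxx.hdf5")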

But how would I “refresh” my GPU without restarting my Python kernel?

Thanks

It doesn't make logical sense, because if you clear the GPU memory, all your trained data will be lost.
To avoid GPU memory overloading, you can set this parameter.
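
The parameter isn't named in the comment; one commonly used setting (an assumption about what was meant) is to enable GPU memory growth, so TensorFlow allocates memory on demand instead of reserving all of it up front:

    import tensorflow as tf

    # Must run before any tensors or models are created on the GPU
    for gpu in tf.config.list_physical_devices('GPU'):
        tf.config.experimental.set_memory_growth(gpu, True)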

or try this
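
The snippet "this" refers to isn't included; a minimal sketch of one alternative is to clear the Keras session (and garbage-collect) between runs, so the memory held by the previous model can be reused within the same process:

    import gc
    import tensorflow as tf

    # Drop the old model and clear Keras' global state so that the next run
    # in the same process can reuse the memory it was holding
    del uNet_model
    tf.keras.backend.clear_session()
    gc.collect()

    # Rebuild, recompile and fit again from here
    uNet_model = get_model()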

Personally, I'm using only the CPU, with Intel MKL optimizations for some packages.