Train on large dataset

Hi,
I have loaded a large set of images into X_train and X_valid, shaped as (6000, 100, 224, 224, 3) and (2000, 100, 224, 224, 3).
I can't train the model on all of it at once, so what I do is break X_train into smaller arrays like this:
X_train1 (3000, 100,224,224,3)
X_train2 (3000, 100,224,224,3)

Then I train the model on X_train1, save the model weights, load the weights into a new model, and retrain that model on X_train2.
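Roughly, my code looks like this (just a sketch; build_model, the label arrays, the weights filename, and the epoch count are placeholders, not my exact code):

```python
import tensorflow as tf

# build_model() stands in for whatever returns a compiled tf.keras.Model.
model = build_model()

# Train on the first half, then save the weights.
model.fit(X_train1, y_train1, epochs=10, validation_data=(X_valid, y_valid))
model.save_weights("weights.h5")

# Load the saved weights into a fresh model and continue on the second half.
model2 = build_model()
model2.load_weights("weights.h5")
model2.fit(X_train2, y_train2, epochs=10, validation_data=(X_valid, y_valid))
```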

Is this a good or a bad way to handle this situation?
Thank you


@youb,

I don't think this is good practice. It is recommended that the model see the entire dataset during each epoch, with the data shuffled, so that the model is trained on the whole distribution of the training set.

However, you can try tf.keras.utils.image_dataset_from_directory, which generates a tf.data.Dataset from image files in a directory, so batches are streamed from disk instead of held in memory all at once.
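For example, something along these lines (a minimal sketch; the directory paths, image size, batch size, and epoch count are placeholders you would adapt to your data):

```python
import tensorflow as tf

# Directory layout assumed: path/to/train_dir/<class_name>/*.jpg
train_ds = tf.keras.utils.image_dataset_from_directory(
    "path/to/train_dir",   # placeholder path
    image_size=(224, 224),
    batch_size=32,
    shuffle=True,          # reshuffles the file order each epoch
)
val_ds = tf.keras.utils.image_dataset_from_directory(
    "path/to/valid_dir",   # placeholder path
    image_size=(224, 224),
    batch_size=32,
)

# Prefetch so the next batch is prepared while the current one trains.
train_ds = train_ds.prefetch(tf.data.AUTOTUNE)
val_ds = val_ds.prefetch(tf.data.AUTOTUNE)

model.fit(train_ds, validation_data=val_ds, epochs=10)
```

This way the model sees the whole (shuffled) training set every epoch, while only one batch at a time is loaded into memory.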

Thank you!