Unstable CNN model training

In the beginning, I trained a CNN model to predict cats vs. dogs, and it was unstable during training. I have 2000 photos for training and another 1000 for validation, so I thought the problem was my model, and I loaded a pre-trained ResNet50 and added fully connected layers ending in a softmax with 2 outputs.
It is still unstable.
I changed the batch_size, epochs, and steps_per_epoch many times and it is still unstable.
What should I do, and what is my mistake?

# Dataset paths
from os.path import join as p
from os import getcwd as g
train = p(g(), 'train')
validation = p(g(), 'validation')
train_cat = p(train, 'cat')
train_dog = p(train, 'dog')
validation_cat = p(validation, 'cat')
validation_dog = p(validation, 'dog')

from tensorflow.keras.applications.resnet50 import ResNet50
from tensorflow.keras.models import load_model, Sequential, Model
from tensorflow.keras.layers import Flatten, Dense, Dropout, Conv2D, MaxPooling2D, Input, GlobalMaxPooling2D
from tensorflow.keras.preprocessing.image import ImageDataGenerator as IDG
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint

i = Input(shape=(224, 224, 3))
x = ResNet50(weights='imagenet', include_top=False)(i)
x = GlobalMaxPooling2D()(x)
x = Flatten()(x)
x = Dense(1024, activation='relu')(x)
x = Dense(512, activation='relu')(x)
x = Dense(2, activation='softmax')(x)
model = Model(inputs=i, outputs=x)

model.get_layer('resnet50').trainable = False
model.summary()

model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

stop = EarlyStopping(monitor='val_loss', patience=100)
saving = ModelCheckpoint('resnet50.h5', save_weights_only=False, save_best_only=True)

# Augment the dataset before training
train_datagen = IDG( rescale = 1.0/255.,
                     rotation_range=40,
                     width_shift_range=0.2,
                     height_shift_range=0.2,
                     shear_range=0.2,
                     zoom_range=0.2,
                     horizontal_flip=True,
                     fill_mode='nearest')
test_datagen  = IDG( rescale = 1.0/255.)
train_generator = train_datagen.flow_from_directory(train,
                                                    batch_size=1,
                                                    class_mode='categorical',
                                                    target_size=(224, 224))
validation_generator = test_datagen.flow_from_directory(validation,
                                                        batch_size=1,
                                                        class_mode='categorical',
                                                        target_size=(224, 224))
# Model training
history = model.fit(train_generator,
                    validation_data=validation_generator,
                    steps_per_epoch=1,
                    epochs=100,
                    validation_steps=50,
                    verbose=1,
                    callbacks=[stop, saving])

#show results
import numpy as np
import matplotlib.pyplot as plt
acc      = history.history['accuracy']
val_acc  = history.history['val_accuracy']
loss     = history.history['loss']
val_loss = history.history['val_loss']

epochs = range(len(acc))  # Get number of epochs
plt.plot(epochs, acc)
plt.plot(epochs, val_acc)
plt.title('Training and validation accuracy')
plt.figure()
plt.plot(epochs, loss)
plt.plot(epochs, val_loss)
plt.title('Training and validation loss')
plt.show()

Why are you using a batch size of 1 and steps_per_epoch=1? It means that every epoch you train your model on exactly one training image and evaluate it on 50 validation images (since validation_steps=50).
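
As a rule of thumb, steps_per_epoch should be about the number of training samples divided by the batch size, so the model sees every image once per epoch. A rough sketch with your 2000/1000 split (the batch size of 20 is just an example value):

batch_size = 20  # example value, not from the original code
train_generator = train_datagen.flow_from_directory(train,
                                                    batch_size=batch_size,
                                                    class_mode='categorical',
                                                    target_size=(224, 224))
validation_generator = test_datagen.flow_from_directory(validation,
                                                        batch_size=batch_size,
                                                        class_mode='categorical',
                                                        target_size=(224, 224))

history = model.fit(train_generator,
                    validation_data=validation_generator,
                    steps_per_epoch=2000 // batch_size,   # one full pass over the 2000 training images
                    validation_steps=1000 // batch_size,  # one full pass over the 1000 validation images
                    epochs=10,
                    verbose=1)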


Before, I used batch_size = 20 and steps_per_epoch = 100, but it was just as unstable, so I changed both to 1 to try another way.
And I still have the same problem.
Any other ideas?

What do you mean by “unstable”?


https://files.fm/thumb_show.php?i=hmwcq8hk7
https://files.fm/thumb_show.php?i=2t9bas9w4

Have you tried saving in the SavedModel format instead of h5?
Also, just to check that your datagen pipeline is OK, try to see whether you can overfit on train+validation, and use a more consistent batch_size.
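
For reference, a minimal sketch of both suggestions (the path 'resnet50_saved' and the epoch count are placeholders, not from the original code):

# In TF 2.x, a checkpoint path without a .h5 extension is saved in the
# SavedModel format ('resnet50_saved' is a placeholder name).
saving = ModelCheckpoint('resnet50_saved',
                         save_weights_only=False,
                         save_best_only=True)

# Overfit sanity check: train on one fixed batch over and over.
# If accuracy does not approach 1.0, suspect the pipeline or the labels,
# not the optimizer settings.
x_small, y_small = next(train_generator)
model.fit(x_small, y_small, epochs=50, verbose=0)
print(model.evaluate(x_small, y_small))  # expect loss near 0, accuracy near 1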


Using the SavedModel format instead of h5, a batch_size of 20 photos, and steps_per_epoch = 100 so that all 2000 photos are seen each epoch, I trained for 10 epochs. It took about 1 hour to train, and I got this:
https://fv2-5.failiem.lv/thumb_show.php?i=ue58sz2f3&download_checksum=14cc5cf7ddebcb68554858d421da881ec7c64e71&download_timestamp=1630702313
https://fv2-5.failiem.lv/thumb_show.php?i=ew2hatesr&download_checksum=5aea3c29c4061a1b4e4dc52ea493135ed8860bd4&download_timestamp=1630702221
:confused:

Visually inspect your augmented dataset together with the corresponding labels, and check whether you can overfit with a fine-tuning step:
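
A minimal sketch of both checks, assuming the generators and model from the post above (the learning rate is just a common starting value for fine-tuning, not a prescription):

import matplotlib.pyplot as plt

# Visually inspect one augmented batch with its labels
images, labels = next(train_generator)
class_names = {v: k for k, v in train_generator.class_indices.items()}
plt.figure(figsize=(10, 10))
for n in range(min(9, len(images))):
    plt.subplot(3, 3, n + 1)
    plt.imshow(images[n])  # already rescaled to [0, 1]
    plt.title(class_names[int(labels[n].argmax())])
    plt.axis('off')
plt.show()

# Fine-tuning step: unfreeze the ResNet50 backbone and recompile with a
# low learning rate so the pre-trained weights are not destroyed.
from tensorflow.keras.optimizers import Adam
model.get_layer('resnet50').trainable = True
model.compile(optimizer=Adam(learning_rate=1e-5),
              loss='categorical_crossentropy',
              metrics=['accuracy'])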
