Why does my validation loss increase, but validation accuracy perfectly matches training accuracy?

I am building a simple 1D convolutional neural network in Keras. Here is the model:

from tensorflow import keras
from tensorflow.keras import layers, models

def build_model():

    model = models.Sequential()
    # Two separable-conv blocks over the (64 timesteps, 20 features) input
    model.add(layers.SeparableConv1D(64, kernel_size=2, activation="relu", input_shape=(64,20)))
    model.add(layers.SeparableConv1D(64, kernel_size=2, activation="relu"))
    model.add(layers.MaxPooling1D(4))
    model.add(layers.Flatten())
    model.add(layers.Dense(128, activation="relu"))
    model.add(layers.Dense(128, activation="relu"))
    model.add(layers.Dropout(0.1))
    # Single sigmoid unit for binary classification
    model.add(layers.Dense(1, activation="sigmoid"))

    model.compile(
        optimizer='rmsprop',
        loss='binary_crossentropy',
        metrics=[
            keras.metrics.BinaryAccuracy(),
        ],
    )
    
    #model.summary()
    
    return model

When I train my model on roughly 1500 samples, the training and validation accuracy curves always overlap almost perfectly, as the graph below shows. This makes me think something fishy is going on in my code or in Keras/TensorFlow, because the validation loss increases dramatically and you would expect the accuracy to suffer at least somewhat from that. It looks like the model is massively overfitting yet reporting the training accuracy for both sets, or something along those lines. When I then evaluate on a held-out test set, the accuracy is nowhere near the 85 to 90 percent reported in the graph, but rather ~70%.

Any help is greatly appreciated, I have been stuck on this for the longest time. Below is the training code.

import numpy as np

#Define the number of folds... this will give us an 80/20 split
#(x_train and y_train are the preloaded training arrays)
k = 5
epochs = 100
num_val_samples = len(x_train) // k
scores_binacc = []
scores_precision = []
scores_recall = []
histories = []
#Train the model from scratch on each of the k folds
for i in range(k):
    print('Processing fold #', i)
    val_data = x_train[i * num_val_samples : (i + 1) * num_val_samples]
    val_targets = y_train[i * num_val_samples : (i + 1) * num_val_samples]
    
    print('Validation partition =  ', i * num_val_samples, (i + 1) * num_val_samples)
    print('Training partition 1 = ', 0, i * num_val_samples)
    print('Training partition 2 = ', (i+1) * num_val_samples, len(x_train))
    
    partial_train_data = np.concatenate(
        [
            x_train[:i * num_val_samples],
            x_train[(i+1) * num_val_samples:]
        ], 
        axis=0
    )
    
    partial_train_targets = np.concatenate(
        [
            y_train[:i * num_val_samples],
            y_train[(i+1) * num_val_samples:]
        ],
        axis=0
    )
    
    model = build_model()
    h = model.fit(
        partial_train_data, 
        partial_train_targets, 
        validation_data=(val_data, val_targets),
        epochs=epochs, 
        verbose=1
    )
    
    # evaluate() returns the loss followed by each compiled metric, in order
    val_loss, val_binacc = model.evaluate(val_data, val_targets, verbose=0)
    scores_binacc.append(val_binacc)
    #scores_precision.append(val_precision)  # would need Precision() in the compile metrics
    #scores_recall.append(val_recall)        # would need Recall() in the compile metrics
    histories.append(h)
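
For reference, the graph I mentioned is produced from the collected histories roughly like this (a quick plotting sketch; assumes matplotlib is installed, and the metric keys come from the compiled BinaryAccuracy() metric):

import matplotlib.pyplot as plt

# Average the per-fold curves so train and validation can be compared
train_acc = np.mean([h.history['binary_accuracy'] for h in histories], axis=0)
val_acc = np.mean([h.history['val_binary_accuracy'] for h in histories], axis=0)
train_loss = np.mean([h.history['loss'] for h in histories], axis=0)
val_loss = np.mean([h.history['val_loss'] for h in histories], axis=0)

epochs_range = range(1, epochs + 1)
plt.plot(epochs_range, train_acc, label='train accuracy')
plt.plot(epochs_range, val_acc, label='val accuracy')
plt.plot(epochs_range, train_loss, label='train loss')
plt.plot(epochs_range, val_loss, label='val loss')
plt.xlabel('epoch')
plt.legend()
plt.show()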

Maybe you’re overfitting, but the underlying relationships are simple enough that your validation set still gets decent accuracy even while its loss climbs.
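
A quick numeric illustration of why loss and accuracy can diverge like that (hypothetical numbers, not your data): accuracy only checks which side of 0.5 each prediction lands on, while binary cross-entropy also punishes confidence, so a model that grows more confident on the examples it already gets wrong sees its loss climb with no change in accuracy.

import numpy as np

def binary_crossentropy(y_true, y_pred):
    # Mean binary cross-entropy over a batch
    return -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))

y_true = np.array([1., 1., 1., 0.])      # the last example is always misclassified

early = np.array([0.7, 0.7, 0.7, 0.6])   # early training: mildly confident
late  = np.array([0.9, 0.9, 0.9, 0.99])  # later: same mistakes, far more confident

# Accuracy at a 0.5 threshold is 75% in both cases, but the loss more than doubles
print(binary_crossentropy(y_true, early))  # ~0.50
print(binary_crossentropy(y_true, late))   # ~1.23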

I feel like the drop in test accuracy could be caused by shuffling. Are you shuffling your data during training but not your test data? Does the order of samples matter for your problem?
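
If your arrays are ordered (for example by class or by collection time), the contiguous fold slices in your loop won't be representative, and neither will the held-out test set. A minimal sketch of shuffling once before the k-fold split (assumes the x_train and y_train arrays from your snippet):

import numpy as np

# Shuffle once with a fixed seed so every fold sees the same mix of samples
rng = np.random.default_rng(42)
perm = rng.permutation(len(x_train))
x_train = x_train[perm]
y_train = y_train[perm]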

Your dataset is very small, which makes your model prone to overfitting. You should try the following options:

  1. Augment your data.
  2. Reduce the learning rate or use a learning-rate schedule (see the sketch after this list).
  3. Study this paper: "A disciplined approach to neural network hyper-parameters: Part 1 – learning rate, batch size, momentum, and weight decay" (arXiv:1803.09820).
  4. Take a look at the model's predictions and compare them with the ground truth.
  5. Change the metric to F1 score.
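
For option 2, here is a minimal sketch using Keras' built-in ReduceLROnPlateau callback inside your existing fit call; the factor and patience values are just illustrative starting points:

from tensorflow import keras

# Halve the learning rate whenever validation loss stalls for 5 epochs
reduce_lr = keras.callbacks.ReduceLROnPlateau(
    monitor='val_loss',
    factor=0.5,
    patience=5,
    min_lr=1e-5,
)

h = model.fit(
    partial_train_data,
    partial_train_targets,
    validation_data=(val_data, val_targets),
    epochs=epochs,
    callbacks=[reduce_lr],
    verbose=1,
)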

Hope the above answers help!
Supachan

This is basically a bump because I’m having the exact same “issue”. In my case, I’m training a Siamese network for face recognition. There’s a lot going on: I’m implementing a custom loss function (contrastive loss) and a custom distance layer for computing the distance between face embeddings.

Anyway, during training my training and validation accuracy are almost exactly the same (training accuracy has a minimal advantage of maybe 1%, but otherwise the curves nearly coincide). However, training loss steadily decreases (as it should) while validation loss increases, just as you showed. Accuracy in training and validation reaches an impressive 92%, but my test accuracy is only about 70%, nowhere near the promised figures.

I’m very puzzled. So, did you ever find the cause of the problem? I’ve gone through my code extensively to find the culprit, because this isn’t normal and I’m almost positive I messed something up. I use a generator (that I wrote myself) because my dataset has 30,000 images and I can’t load them all into RAM. I suspect this might be associated with the problem. Just wanted to know if you found the source of the issue.
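
For reference, my loss follows the usual contrastive formulation (Hadsell et al.); a simplified sketch of what I mean, not my exact code, with the conventional margin of 1.0:

import tensorflow as tf

def contrastive_loss(y_true, distance, margin=1.0):
    # Standard contrastive loss: pull matching pairs (y_true = 1) together,
    # push non-matching pairs (y_true = 0) at least `margin` apart
    y_true = tf.cast(y_true, distance.dtype)
    positive = y_true * tf.square(distance)
    negative = (1.0 - y_true) * tf.square(tf.maximum(margin - distance, 0.0))
    return tf.reduce_mean(positive + negative)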