My model gets 98% accuracy during training, but only 73% when testing. What is the issue with my model? It also predicts the *severe* class of diabetic retinopathy only 50% correctly.

Steps I follow:

Step 1:
Load the dataset: (224, 224) image size, batch size 32, multiclass classification over 5 classes: mild, moderate, noDR, proliferative, severe.

Step 2:
Split the dataset: 80% training, 20% validation.
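Steps 1 and 2 can be done in one call with `tf.keras.utils.image_dataset_from_directory` and its `validation_split`/`subset` arguments. A minimal sketch, assuming a directory layout of one subfolder per class (the folder names and the dummy images below are fabricated so the snippet runs on its own):

```python
import os
import tempfile

import tensorflow as tf

# Fabricate a tiny dummy dataset (10 random JPEGs per class) so the
# snippet is self-contained; in practice point `root` at the real data.
root = tempfile.mkdtemp()
for cls in ["Mild", "Moderate", "No_DR", "Proliferative", "Severe"]:
    os.makedirs(os.path.join(root, cls))
    for i in range(10):
        img = tf.random.uniform((224, 224, 3), maxval=255, dtype=tf.int32)
        jpeg = tf.io.encode_jpeg(tf.cast(img, tf.uint8))
        tf.io.write_file(os.path.join(root, cls, f"{i}.jpg"), jpeg)

# Load with an 80/20 train/validation split, matching steps 1-2.
# The same seed must be used for both subsets so they don't overlap.
train_ds = tf.keras.utils.image_dataset_from_directory(
    root, validation_split=0.2, subset="training", seed=42,
    image_size=(224, 224), batch_size=32, label_mode="int")
val_ds = tf.keras.utils.image_dataset_from_directory(
    root, validation_split=0.2, subset="validation", seed=42,
    image_size=(224, 224), batch_size=32, label_mode="int")

print(train_ds.class_names)  # the 5 class folders, in sorted order
```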

For the model I use a 6-layer LeNet-style CNN.

First layer:
Conv2D(filters=6, kernel_size=5, strides=1, padding='same', activation='relu'),
BatchNormalization(),
MaxPool2D(pool_size=2, strides=2),

Last layers:

Flatten(),
Dense(100, activation='relu'),
BatchNormalization(),
Dense(num_classes, activation='softmax')
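The layers above can be assembled into a runnable sketch. Only the first and last layers were posted, so the middle conv block (16 filters, following classic LeNet-5) is an assumption:

```python
import tensorflow as tf
from tensorflow.keras import layers, models

num_classes = 5  # mild, moderate, noDR, proliferative, severe

# LeNet-style CNN. The 16-filter middle block is an assumption borrowed
# from LeNet-5 -- only the first and last layers appear in the post.
model = models.Sequential([
    layers.Input(shape=(224, 224, 3)),
    layers.Conv2D(filters=6, kernel_size=5, strides=1,
                  padding='same', activation='relu'),
    layers.BatchNormalization(),
    layers.MaxPool2D(pool_size=2, strides=2),
    layers.Conv2D(filters=16, kernel_size=5, strides=1,
                  padding='same', activation='relu'),
    layers.BatchNormalization(),
    layers.MaxPool2D(pool_size=2, strides=2),
    layers.Flatten(),
    layers.Dense(100, activation='relu'),
    layers.BatchNormalization(),
    layers.Dense(num_classes, activation='softmax'),
])

# Sanity check: one dummy image in, one probability vector per class out.
probs = model(tf.zeros((1, 224, 224, 3)))
print(probs.shape)  # (1, 5)
```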

**Step 3: apply training for 50 epochs**
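The training step would look roughly like this, assuming integer labels (hence `sparse_categorical_crossentropy`; the optimizer choice is also an assumption). A tiny stand-in model and random data are used here so the snippet runs on its own, and it trains for 1 epoch rather than the 50 used in the post:

```python
import tensorflow as tf
from tensorflow.keras import layers, models

num_classes = 5

# Small stand-in model; in the post this is the LeNet described above.
model = models.Sequential([
    layers.Input(shape=(224, 224, 3)),
    layers.Conv2D(6, 5, padding='same', activation='relu'),
    layers.MaxPool2D(2),
    layers.Flatten(),
    layers.Dense(num_classes, activation='softmax'),
])

# Sparse loss matches integer class labels (0..4).
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# Synthetic stand-in data; the post trains on the real images for 50 epochs.
x = tf.random.uniform((8, 224, 224, 3))
y = tf.random.uniform((8,), maxval=num_classes, dtype=tf.int32)
history = model.fit(x, y, epochs=1, batch_size=4, verbose=0)
print(history.history['accuracy'])
```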

Step 4:

Evaluation on the training set gives 98% accuracy.

Step 5:

Load the test dataset with the same setup as training.

Step 6:

Run the LeNet model on the test set; it gets 73% accuracy.

You can view my model's code here:

https://drive.google.com/file/d/1wanVx81D5X9ZSZfJjkQzo6N_Ln4jg-EP/view?usp=sharing


Hi @Palak_Talreja ,

It is a clear indication that the model has overfitted to the training data. Try checking for class imbalance (from the ROC curve, you can see the difference between the classes), increasing the training data, applying a few more augmentation techniques, using K-fold cross-validation to avoid overfitting, using regularization such as dropout, and finding the parameters that best suit your dataset through hyperparameter tuning.
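Two of those suggestions, augmentation and dropout, can be added directly as Keras layers. A minimal sketch (the specific augmentation choices and the 0.5 dropout rate are assumptions, not values from the original code):

```python
import tensorflow as tf
from tensorflow.keras import layers, models

num_classes = 5

# Augmentation layers are only active when the model is called with
# training=True; at inference time they pass images through unchanged.
augment = models.Sequential([
    layers.RandomFlip("horizontal"),
    layers.RandomRotation(0.1),
    layers.RandomZoom(0.1),
])

# Regularized variant: Dropout after the dense layer, as suggested above.
model = models.Sequential([
    layers.Input(shape=(224, 224, 3)),
    augment,
    layers.Conv2D(6, 5, padding='same', activation='relu'),
    layers.MaxPool2D(2),
    layers.Flatten(),
    layers.Dense(100, activation='relu'),
    layers.Dropout(0.5),  # randomly zeroes 50% of units during training
    layers.Dense(num_classes, activation='softmax'),
])

out = model(tf.zeros((2, 224, 224, 3)), training=False)
print(out.shape)  # (2, 5)
```

For class imbalance, `model.fit` also accepts a `class_weight` dict to upweight rare classes such as *severe*.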

Since after 25 epochs the training loss/accuracy and validation loss/accuracy were constant, it's clear that the model is overfitting with limited training data.
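When the curves flatten out like that, an `EarlyStopping` callback can halt training automatically instead of running all 50 epochs. A sketch with a tiny stand-in model and random data so it runs on its own (the `patience=5` value is an assumption):

```python
import tensorflow as tf
from tensorflow.keras import layers, models, callbacks

# Stop once validation loss stops improving for 5 epochs in a row,
# and roll back to the best weights seen so far.
early_stop = callbacks.EarlyStopping(monitor='val_loss', patience=5,
                                     restore_best_weights=True)

# Tiny stand-in model and data; the real setup trains the LeNet above.
model = models.Sequential([
    layers.Input(shape=(8,)),
    layers.Dense(5, activation='softmax'),
])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

x = tf.random.uniform((32, 8))
y = tf.random.uniform((32,), maxval=5, dtype=tf.int32)
history = model.fit(x, y, validation_split=0.25, epochs=50,
                    callbacks=[early_stop], verbose=0)
print(len(history.history['loss']))  # fewer than 50 if stopped early
```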

Thanks.


@Palak_Talreja it looks like you have been working on this for some time already.
The model you shared above has more layers / is more complex than the one you showed some weeks ago, if I remember well.
Aside from the technical tricks suggested by @Laxma_Reddy_Patlolla that you can try out (trial and error), maybe you can also take a more “theoretical” approach: study again the research papers describing the models you use and see what is different in your actual setup, e.g. purpose, number of classes, implementation tools (TF?), dataset size, etc. (I don’t know that literature at all, I have to say…).