Hi, for a university project I am trying to classify cervical cancer images. The dataset can be found on Kaggle (Multi Cancer Dataset) and consists of 25,000 images in total, 5,000 per class.
The best model over 50 epochs provides the following metrics:
Metric | Value |
---|---|
Accuracy | 0.9916 |
Loss | 0.0817 |
Val Accuracy | 0.9962 |
Val Loss | 0.0674 |
Learning Rate | 1.2500e-04 |
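A side note on the learning rate in the table: the model is compiled with `optimizer='adam'`, and Keras's Adam defaults to 1e-3, so the logged 1.25e-4 is 8 times smaller. No scheduler or callback is shown in this post, so the following is only an assumption. One schedule that would be consistent with the number is a halving callback (for example `ReduceLROnPlateau` with `factor=0.5`) that fired three times. The arithmetic, as a rough sketch:

```python
import math

default_adam_lr = 1e-3   # Keras default for optimizer='adam'
logged_lr = 1.25e-4      # value from the metrics table
assumed_factor = 0.5     # ASSUMPTION: a halving schedule, not confirmed by the post

# Total reduction between the default and the logged learning rate.
total_reduction = default_adam_lr / logged_lr

# Number of halvings needed to get from the default to the logged value.
n_reductions = round(math.log(logged_lr / default_adam_lr) / math.log(assumed_factor))

print(f"total reduction: {total_reduction:.1f}x, halvings: {n_reductions}")
```

Other schedules could produce the same value, so this only shows that the number is plausible, not how it actually came about.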
With classification report:
Class | Precision | Recall | F1-Score | Support |
---|---|---|---|---|
Dyskeratotic | 1.00 | 1.00 | 1.00 | 1000 |
Koilocytotic | 1.00 | 0.99 | 0.99 | 1000 |
Metaplastic | 1.00 | 1.00 | 1.00 | 1000 |
Parabasal | 1.00 | 1.00 | 1.00 | 1000 |
Superficial-Intermediate | 1.00 | 1.00 | 1.00 | 1000 |
Accuracy | | | 1.00 | 5000 |
Macro Avg | 1.00 | 1.00 | 1.00 | 5000 |
Weighted Avg | 1.00 | 1.00 | 1.00 | 5000 |
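The report rounds to two decimals, which hides the few errors that remain. Here is a small sketch of the arithmetic, assuming the report was computed on the same 5,000 validation images as the Val Accuracy above (the helper name is only for illustration):

```python
def errors_from_accuracy(accuracy: float, n_samples: int) -> int:
    """Number of misclassified samples implied by an accuracy value."""
    return round((1.0 - accuracy) * n_samples)

val_accuracy = 0.9962   # from the metrics table
n_val_images = 5000     # total support in the classification report

n_wrong = errors_from_accuracy(val_accuracy, n_val_images)
rounded = f"{val_accuracy:.2f}"  # how a two-decimal report displays it

print(f"{n_wrong} misclassified images, shown as accuracy {rounded}")
```

So the 1.00 values still leave room for roughly 19 misclassified images, which fits the 0.99 recall for Koilocytotic.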
The model does not seem to overfit, judging by the graphs below:
With the model architecture being really simple…
import tensorflow as tf
from tensorflow.keras import Sequential, layers, regularizers

# img_height, img_width and num_classes must be defined beforehand
model = Sequential([
    layers.Input(shape=(img_height, img_width, 3)),
    layers.Rescaling(1./255),  # scale pixel values to [0, 1]
    layers.Conv2D(16, 3, padding='same', activation='relu'),
    layers.MaxPooling2D(),
    layers.Dropout(0.2),
    layers.Conv2D(32, 3, padding='same', activation='relu'),
    layers.MaxPooling2D(),
    layers.Dropout(0.3),
    layers.Conv2D(64, 3, padding='same', activation='relu'),
    layers.MaxPooling2D(),
    layers.Dropout(0.4),
    layers.Flatten(),
    layers.Dense(128, activation='relu', kernel_regularizer=regularizers.l2(0.001)),
    layers.Dropout(0.5),
    layers.Dense(num_classes)  # raw logits, hence from_logits=True in the loss
])
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])
… I do not understand how we could achieve these metrics and suppose something went wrong, but I do not know what. Although current state-of-the-art classifiers for cervical cancer images reach good accuracy, they are mostly built on a pre-trained model like ResNet.
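One thing that could be ruled out first is leakage between the training and validation splits, i.e. the same image file ending up in both. Below is a minimal sketch of such a check, assuming the file paths of each split are available as two lists. The function name and its arguments are hypothetical and not part of my pipeline.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path):
    """SHA-256 of the raw file bytes; identical files give identical digests."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def find_cross_split_duplicates(train_paths, val_paths):
    """Return (val_path, [matching train paths]) for byte-identical files."""
    train_by_digest = defaultdict(list)
    for p in train_paths:
        train_by_digest[file_digest(p)].append(Path(p))

    overlaps = []
    for p in val_paths:
        digest = file_digest(p)
        if digest in train_by_digest:
            overlaps.append((Path(p), train_by_digest[digest]))
    return overlaps
```

Calling `find_cross_split_duplicates(train_paths, val_paths)` and getting an empty list would rule out exact duplicates only. If the dataset contains augmented copies (rotated or flipped versions of the same source image), a byte-level hash will not catch them, and a perceptual hash or a split by source image would be needed instead.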
Thankful for any help regarding the matter.