I’m doing a bounding-box regression problem with a CNN. However, the model predicts values that are way off and don’t even fall within the range of the dataset.
The head of the dataframe:
                                 image      left     width       top    height
0  57503_000116_Sideline_frame490.jpg  0.772549  0.019608  0.674510  0.025490
1  57503_000116_Sideline_frame490.jpg  0.917647  0.021569  0.566667  0.023529
2  57503_000116_Sideline_frame490.jpg  0.739216  0.017647  0.527451  0.023529
3  57503_000116_Sideline_frame490.jpg  0.774510  0.017647  0.523529  0.019608
4  57503_000116_Sideline_frame490.jpg  0.786275  0.023529  0.596078  0.019608
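For context, these values are already normalized: the raw pixel boxes are divided by the frame width and height during preprocessing. A minimal sketch of that step (the function name and img_w/img_h are illustrative, not my exact script):

import pandas as pd

def normalize_boxes(df: pd.DataFrame, img_w: int, img_h: int) -> pd.DataFrame:
    # Convert pixel-space boxes into the [0, 1] range used as regression targets.
    out = df.copy()
    out["left"] = df["left"] / img_w      # x-coordinates scale with the width
    out["width"] = df["width"] / img_w
    out["top"] = df["top"] / img_h        # y-coordinates scale with the height
    out["height"] = df["height"] / img_h
    return out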
Then, to generate the datasets, I use the flow_from_dataframe method:
data_gen = keras.preprocessing.image.ImageDataGenerator(
    dtype="float32",
    rescale=1 / 255,
    horizontal_flip=True,
    validation_split=0.2,
)
train_gen = data_gen.flow_from_dataframe(
    dataframe=chunk,
    directory=f"{IMAGES_DIR}",
    x_col="image",
    y_col=["left", "width", "top", "height"],
    subset="training",
    has_ext=True,
    batch_size=BATCH_SIZE,
    target_size=(int(NEW_WIDTH), int(NEW_HEIGHT)),
    class_mode="other",
    # save_to_dir=f"{IMAGES_DIR}/augmented",
    shuffle=True,
    seed=42,
)
val_gen = data_gen.flow_from_dataframe(
    dataframe=chunk,
    directory=f"{IMAGES_DIR}",
    x_col="image",
    y_col=["left", "width", "top", "height"],
    subset="validation",
    has_ext=True,
    batch_size=BATCH_SIZE,
    target_size=(int(NEW_WIDTH), int(NEW_HEIGHT)),
    class_mode="other",
    shuffle=True,
    seed=42,
)
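To make sure the generators behave, I spot-check a batch like this (a quick sanity check, not part of the pipeline):

x_batch, y_batch = next(train_gen)
# Images come back rescaled to [0, 1]; targets are the four box columns.
print(x_batch.shape, x_batch.min(), x_batch.max())  # (BATCH_SIZE, H, W, 3)
print(y_batch.shape, y_batch.min(), y_batch.max())  # (BATCH_SIZE, 4), all in [0, 1]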
This is my model:
model = keras.models.Sequential([
    keras.layers.Input(shape=(NEW_HEIGHT, NEW_WIDTH, 3)),
    keras.layers.Conv2D(filters=64, kernel_size=(3, 3), padding="same",
                        kernel_regularizer="l2", activation="relu"),
    keras.layers.MaxPool2D(),
    keras.layers.Conv2D(filters=32, kernel_size=(3, 3), padding="same",
                        kernel_regularizer="l2", activation="relu"),
    keras.layers.MaxPool2D(),
    keras.layers.Conv2D(filters=16, kernel_size=(3, 3), padding="same",
                        kernel_regularizer="l2", activation="relu"),
    keras.layers.MaxPool2D(),
    keras.layers.Dropout(0.2),
    keras.layers.Flatten(),
    keras.layers.Dense(64, activation="relu"),
    keras.layers.Dropout(0.2),
    keras.layers.Dense(32, activation="relu"),
    keras.layers.Dropout(0.2),
    keras.layers.Dense(16, activation="relu"),
    keras.layers.Dropout(0.2),
    keras.layers.Dense(4),
])
And this is how I train:
opt = keras.optimizers.Adam(learning_rate=0.0001)
model.compile(
    optimizer=opt,
    loss="mse",
)
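The fit call itself is nothing special; simplified, it is essentially:

model.fit(
    train_gen,
    validation_data=val_gen,
    epochs=5,
)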
Training for 5 epochs yields the following results:
Epoch 1/5
50/50 [==============================] - 17s 272ms/step - loss: 1.0056 - val_loss: 0.8319
Epoch 2/5
50/50 [==============================] - 13s 260ms/step - loss: 0.8965 - val_loss: 0.7225
Epoch 3/5
50/50 [==============================] - 13s 260ms/step - loss: 0.8164 - val_loss: 0.6626
Epoch 4/5
50/50 [==============================] - 13s 260ms/step - loss: 0.7342 - val_loss: 0.6168
Epoch 5/5
50/50 [==============================] - 13s 260ms/step - loss: 0.6926 - val_loss: 0.5710
I thought that, since the val_loss is slowly going down, the predictions must be getting closer.
However, when I run predictions, the model outputs values that are far too large or make no sense at all (such as negative values):
[[203.84532 26.675478 116.79072 -11.553452]]
I don’t quite understand what I’m doing wrong here. I’m scaling the bounding boxes appropriately, and the validation loss is quite low, so I don’t understand how the predictions can be so far off.
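For reference, since the targets are normalized, a prediction only becomes pixel coordinates after the inverse transform (img_w/img_h being the original frame dimensions; illustrative sketch):

def denormalize_box(pred, img_w, img_h):
    # Inverse of the normalization above: back from [0, 1] to pixels.
    left, width, top, height = pred
    return left * img_w, width * img_w, top * img_h, height * img_h

So on the scale the model is actually trained on, every sensible output should lie inside [0, 1], which makes values like 203.8 or -11.5 all the more confusing.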