I’m working on a bounding box regression problem using a CNN. However, the model predicts values that are way off and aren’t even in the range of the dataset.
The head of the dataframe:
                                image      left     width       top    height
0  57503_000116_Sideline_frame490.jpg  0.772549  0.019608  0.674510  0.025490
1  57503_000116_Sideline_frame490.jpg  0.917647  0.021569  0.566667  0.023529
2  57503_000116_Sideline_frame490.jpg  0.739216  0.017647  0.527451  0.023529
3  57503_000116_Sideline_frame490.jpg  0.774510  0.017647  0.523529  0.019608
4  57503_000116_Sideline_frame490.jpg  0.786275  0.023529  0.596078  0.019608
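For context, left, width, top and height are the raw pixel boxes divided by the frame dimensions, so every target is in [0, 1]. The scaling looks roughly like this (the 1280x720 frame size is only illustrative, not necessarily my exact values):

    IMG_WIDTH, IMG_HEIGHT = 1280, 720  # frame size in pixels (illustrative)
    # chunk is one piece of the full annotations dataframe (used with the generators below)
    chunk["left"] = chunk["left"] / IMG_WIDTH
    chunk["width"] = chunk["width"] / IMG_WIDTH
    chunk["top"] = chunk["top"] / IMG_HEIGHT
    chunk["height"] = chunk["height"] / IMG_HEIGHT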
Then, to generate the datasets, I use the flow_from_dataframe method:
    data_gen = keras.preprocessing.image.ImageDataGenerator(
        dtype="float32",
        rescale=1/255,
        horizontal_flip=True,
        validation_split=0.2,
    )
    train_gen = data_gen.flow_from_dataframe(
        dataframe=chunk,
        directory=f"{IMAGES_DIR}",
        x_col="image",
        y_col=["left", "width", "top", "height"],
        subset="training",
        has_ext=True,
        batch_size=BATCH_SIZE,
        target_size=(int(NEW_WIDTH), int(NEW_HEIGHT)),
        class_mode="other",
        # save_to_dir=f"{IMAGES_DIR}/augmented",
        shuffle=True,
        seed=42,
    )
    val_gen = data_gen.flow_from_dataframe(
        dataframe=chunk,
        directory=f"{IMAGES_DIR}",
        x_col="image",
        y_col=["left", "width", "top", "height"],
        subset="validation",
        has_ext=True,
        batch_size=BATCH_SIZE,
        target_size=(int(NEW_WIDTH), int(NEW_HEIGHT)),
        class_mode="other",
        shuffle=True,
        seed=42,
    )
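To make sure the generators yield what I expect, I can grab a single batch and check the ranges (just a quick sanity check, not part of the training script):

    x_batch, y_batch = next(train_gen)
    print(x_batch.shape, x_batch.min(), x_batch.max())  # images should be in [0, 1] after the rescale
    print(y_batch.shape, y_batch.min(), y_batch.max())  # box targets should also be in [0, 1]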
This is my model:
    model = keras.models.Sequential([
        keras.layers.Input(shape=(NEW_HEIGHT, NEW_WIDTH, 3)),
        keras.layers.Conv2D(filters=64, kernel_size=(3, 3), padding="same", kernel_regularizer="l2", activation="relu"),
        keras.layers.MaxPool2D(),
        keras.layers.Conv2D(filters=32, kernel_size=(3, 3), padding="same", kernel_regularizer="l2", activation="relu"),
        keras.layers.MaxPool2D(),
        keras.layers.Conv2D(filters=16, kernel_size=(3, 3), padding="same", kernel_regularizer="l2", activation="relu"),
        keras.layers.MaxPool2D(),
        keras.layers.Dropout(0.2),
        keras.layers.Flatten(),
        keras.layers.Dense(64, activation="relu"),
        keras.layers.Dropout(0.2),
        keras.layers.Dense(32, activation="relu"),
        keras.layers.Dropout(0.2),
        keras.layers.Dense(16, activation="relu"),
        keras.layers.Dropout(0.2),
        keras.layers.Dense(4),
    ])
And this is how I train:
    opt = keras.optimizers.Adam(learning_rate=0.0001)
    self.model.compile(
        optimizer=opt,
        loss="mse",
    )
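The fit call itself just takes the two generators (simplified here, but the epoch count matches the log below):

    self.model.fit(
        train_gen,
        validation_data=val_gen,
        epochs=5,
    )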
Training for 5 epochs yields the following results:
Epoch 1/5
50/50 [==============================] - 17s 272ms/step - loss: 1.0056 - val_loss: 0.8319
Epoch 2/5
50/50 [==============================] - 13s 260ms/step - loss: 0.8965 - val_loss: 0.7225
Epoch 3/5
50/50 [==============================] - 13s 260ms/step - loss: 0.8164 - val_loss: 0.6626
Epoch 4/5
50/50 [==============================] - 13s 260ms/step - loss: 0.7342 - val_loss: 0.6168
Epoch 5/5
50/50 [==============================] - 13s 260ms/step - loss: 0.6926 - val_loss: 0.5710
I thought that, since the val_loss is slowly going down, the predictions must be getting closer. However, when I actually run a prediction, the model outputs values that are way too big or don't make any sense at all (like negative values):
[[203.84532   26.675478 116.79072  -11.553452]]    
I don’t quite understand what I’m doing wrong here. I’m scaling the bounding boxes appropriately, and the validation loss is quite low, so I don’t see how the predictions can be this far off.