I am currently working on the following problem: I have a regression model that outputs two values, f1 and f2. The range of both values is [0, 1]. In my training dataset, I have 50 examples with f1 = 0 and 250 examples with f1 = 1. I also have 100 examples with f2 = 0 and 200 examples with f2 = 1. I want to use mean squared error as the loss function.
I am working with keras to compile and fit a model. My question is: how can I change my loss function to address the imbalance that we have within the features (50 vs 250, 100 vs 200)?
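For context, here is a simplified sketch of how I currently compile and fit, with plain (unweighted) MSE on both outputs; the output names, optimizer, and variable names are just placeholders:

# Current setup (simplified): the same unweighted MSE on each output head.
model.compile(
    optimizer="adam",
    loss={"f1": "mse", "f2": "mse"},
)
model.fit(
    x_train,
    {"f1": y_f1, "f2": y_f2},  # y_f1, y_f2 hold the 0/1 targets described above
    epochs=10,
    batch_size=32,
)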
I would appreciate any help.
PS: Even though each output could be handled as classification, I want to try regression, since the difference between f1 = 0 and f1 = 1 is not well defined.
Hi @Manu_Johny, and welcome to the TensorFlow forum.
If I understand your setup correctly, you are facing a standard imbalanced-data issue.
You can check these 2 webpages:
Thank you for your reply. I do have imbalanced data, but I would rather use class weights, since generating a balanced training set through over/undersampling is not easy in my case.
If a model has only one output layer that predicts values in the range [0, 3], we can define class_weight as {0: weight0, 1: weight1, …} and so on. But how can I define class_weight if the model has more than one output layer? In other words, what happens if the model looks like this:
from tensorflow.keras.applications import DenseNet121
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Model

# Shared DenseNet backbone with one Dense head per output value
base_model = DenseNet121(include_top=False, input_shape=(224, 224, 3), weights="imagenet", pooling="avg")
x = base_model.output
F1 = Dense(1, activation=custom_activation, name='F1')(x)
F2 = Dense(1, activation=custom_activation, name='F2')(x)
...
model = Model(inputs=base_model.input, outputs=[F1, F2, F3, F4, F5])
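One direction I am considering, in case class_weight cannot handle multiple outputs, is to bake the weights directly into a per-output weighted MSE and pass one loss per output name to compile. This is only a sketch of the idea: weighted_mse is a helper name I made up, and the numbers are inverse-frequency weights computed from the counts above (other output heads omitted):

import tensorflow as tf

def weighted_mse(weight_for_0, weight_for_1):
    # Returns an MSE loss that weights each sample by its binary target (0 or 1).
    def loss(y_true, y_pred):
        weights = y_true * weight_for_1 + (1.0 - y_true) * weight_for_0
        return tf.reduce_mean(weights * tf.square(y_true - y_pred))
    return loss

# Inverse-frequency weights: total / (2 * count_per_class)
# F1: 50 zeros, 250 ones  -> w0 = 300 / (2 * 50) = 3.0,  w1 = 300 / (2 * 250) = 0.6
# F2: 100 zeros, 200 ones -> w0 = 300 / (2 * 100) = 1.5, w1 = 300 / (2 * 200) = 0.75
model.compile(
    optimizer="adam",
    loss={"F1": weighted_mse(3.0, 0.6), "F2": weighted_mse(1.5, 0.75)},
)

The idea is that each sample's squared error is scaled by the inverse frequency of its target value, so the rare f1 = 0 examples contribute proportionally more to the loss. Would this be roughly equivalent to using class_weight, or is there a cleaner way?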