How to build a multi-output model using model subclassing?

I need to create a model that takes a movie-poster image and returns an age rating (multi-class classification) and a list of genres (multi-label classification).

Model code:

# Imports used across the snippets below
import tensorflow as tf
from tensorflow import Tensor
from tensorflow.keras import Model
from tensorflow.keras.layers import Rescaling, Conv2D, MaxPooling2D, Flatten, Dense
from tensorflow.keras.losses import CategoricalCrossentropy


class NNetwork(Model):
    def __init__(self, rating: int, genres: int):
        super(NNetwork, self).__init__()
        self.rescaling = Rescaling(1.0 / 255.0)  # scale pixel values to [0, 1]

        self.convolution = [
            Conv2D(16, 3, padding='same', activation='relu'),
            Conv2D(32, 3, padding='same', activation='relu'),
            Conv2D(64, 3, padding='same', activation='relu')
        ]
        self.pooling = [
            MaxPooling2D(),
            MaxPooling2D(),
            MaxPooling2D()
        ]
        self.flatten = Flatten()

        self.dense = [
            Dense(128, activation='relu'),        # shared hidden layer
            Dense(rating, activation='softmax'),  # age-rating head
            Dense(genres, activation='softmax')   # genres head
        ]

    def call(self, images: Tensor, training=None, **kwargs) -> tuple[Tensor, Tensor]:
        x = self.rescaling(images)

        # three convolution + pooling stages
        for i in range(3):
            x = self.convolution[i](x)
            x = self.pooling[i](x)

        x = self.flatten(x)

        x = self.dense[0](x)

        rating = self.dense[1](x)
        genres = self.dense[2](x)

        return rating, genres

Loss function code:

def loss(true: Tensor,
         predict: Tensor) -> Tensor:
    
    categorical = CategoricalCrossentropy(
        reduction=None
    )

    loss = tf.reduce_mean(categorical(
        y_true=true,
        y_pred=predict
    ))

    return loss

Training code:

@tf.function
def train_step(images_batch, rating_batch, genres_batch):
    with tf.GradientTape() as tape:
        rating_predict, genres_predict = model(images_batch)
        
        rating_loss = loss(rating_batch, rating_predict)
        genres_loss = loss(genres_batch, genres_predict)
        
    # passing a list of losses to tape.gradient differentiates their sum
    gradients = tape.gradient([rating_loss, genres_loss], model.trainable_variables)
    optimizer.apply_gradients(zip(gradients, model.trainable_variables))

    rating_score.update_state(rating_batch, rating_predict)
    genres_score.update_state(genres_batch, genres_predict)
    
    return rating_loss, genres_loss

Main loop code:

for n in range(EPOCHS):
    total_loss = 0
    total_rating_loss = 0
    total_genres_loss = 0
    
    for inputs, outputs in train:
        images_batch = inputs['image']
        rating_batch = outputs['rating']
        genres_batch = outputs['genres']
        
        rating_loss, genres_loss = train_step(images_batch, rating_batch, genres_batch)

        total_loss += (rating_loss + genres_loss)
        total_rating_loss += rating_loss
        total_genres_loss += genres_loss
    
    print(f'EPOCHS: {n} - total_loss: {total_loss.numpy()}, total_rating_loss: {total_rating_loss.numpy()}, total_genres_loss: {total_genres_loss.numpy()}')

Training starts, but the loss at every epoch is huge, on the order of 10^15. This suggests the training procedure is set up incorrectly.

I'm guessing at what the problem might be:

  1. The loss function is chosen or configured incorrectly.
  2. The gradients are computed or applied incorrectly.

What else could cause such a large loss value?

Hi @gsimonx37. Is there any specific reason why you set reduction=None?

categorical = CategoricalCrossentropy(
    reduction=None
)

Hence those big numbers, no?

The TensorFlow documentation reads:

reduction: Type of reduction to apply to the loss. In almost all cases this should be "sum_over_batch_size".
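
For comparison, here is a minimal sketch of the same loss helper with the reduction left at its documented default ("sum_over_batch_size"), which already averages over the batch, so the extra tf.reduce_mean becomes unnecessary. This assumes one-hot labels and softmax outputs, as in your model; whether it alone explains the huge values is only my guess:

import tensorflow as tf
from tensorflow import Tensor
from tensorflow.keras.losses import CategoricalCrossentropy

# With the default reduction, the call below returns a single scalar
# averaged over the batch, instead of one value per sample.
categorical = CategoricalCrossentropy()

def loss(true: Tensor, predict: Tensor) -> Tensor:
    return categorical(y_true=true, y_pred=predict)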