Let N_train, N_val, and N_test be the numbers of examples in the training, validation, and test sets. As I understand it, these are typically chosen so that N_train >> N_val ~= N_test.
As I understand it, the loss and metrics are computed as averages over the whole training set. In that case, how can we fairly compare performance across sets of such different sizes?
Why isn’t the model’s performance instead evaluated on a subset of the training set whose size is comparable to that of the validation or test set?
One could argue that this would increase the computational cost, but we could at least draw randomly from the per-example losses already computed during training (see the sketch below).
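A minimal sketch of what I mean, using NumPy with made-up per-example losses (the sizes and distributions are purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-example losses: a large training set and a smaller validation set.
train_losses = rng.exponential(scale=0.5, size=50_000)  # N_train examples
val_losses = rng.exponential(scale=0.6, size=5_000)      # N_val examples

# Average loss over the full training set (the usual practice).
full_train_mean = train_losses.mean()

# Average loss over a random training subset sized like the validation set
# (the alternative I am asking about).
subset = rng.choice(train_losses, size=len(val_losses), replace=False)
subset_train_mean = subset.mean()

print(f"full-train mean loss:   {full_train_mean:.4f}")
print(f"subset-train mean loss: {subset_train_mean:.4f}")
print(f"validation mean loss:   {val_losses.mean():.4f}")
```

The subset mean fluctuates more from draw to draw, but on average it matches the full-set mean, which is exactly why I am unsure what the full-set evaluation buys us.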
Please let me know if there are reasons behind the standard approach that I am missing!