I looked into TFDF and it is really interesting. When I went through the examples, specifically the regression one, I noticed {'loss': 0.0, 'mse': 4.355661392211914}.
I'm wondering: if the loss is 0, shouldn't 'mse' also be zero? It also suggests the model memorized the training data.
I'm confused here. Would you be able to explain why the loss is 0.0 while the mse is 4.355661392211914?
A loss of 0.0 in TFDF with a non-zero MSE usually indicates overfitting. The model fit the training data perfectly (loss of 0) but isn’t generalizing well (high MSE). Try regularizing your model or using a validation set to catch this early.
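If you want to try both of those levers, here is a minimal sketch, not code from the original notebook: the dataset is synthetic and hypothetical, and the hyperparameter names (num_trees, max_depth, l2_regularization) are the ones I believe tfdf.keras.GradientBoostedTreesModel exposes, so double-check them against the TFDF docs. As I understand it, a GBT model also holds out part of the training data as a validation set by default, which is what gives you a validation loss at all.

import numpy as np
import pandas as pd
import tensorflow_decision_forests as tfdf

# Tiny synthetic regression dataset, for illustration only.
rng = np.random.default_rng(0)
df = pd.DataFrame({"x1": rng.normal(size=500), "x2": rng.normal(size=500)})
df["y"] = 3 * df["x1"] - 2 * df["x2"] + rng.normal(scale=0.5, size=500)
train_df, test_df = df.iloc[:400], df.iloc[400:]

train_ds = tfdf.keras.pd_dataframe_to_tf_dataset(train_df, label="y", task=tfdf.keras.Task.REGRESSION)
test_ds = tfdf.keras.pd_dataframe_to_tf_dataset(test_df, label="y", task=tfdf.keras.Task.REGRESSION)

# Constrain the trees so the model cannot simply memorize the training examples.
model = tfdf.keras.GradientBoostedTreesModel(
    task=tfdf.keras.Task.REGRESSION,
    num_trees=200,
    max_depth=4,
    l2_regularization=0.01,
)
model.fit(train_ds)

model.compile(metrics=["mse"])
print(model.evaluate(test_ds, return_dict=True))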
import math

# Evaluate the model on the test dataset.
model_7.compile(metrics=["mse"])
evaluation = model_7.evaluate(test_ds, return_dict=True)
print(evaluation)
print()
print(f"MSE: {evaluation['mse']}")
print(f"RMSE: {math.sqrt(evaluation['mse'])}")
Should I ignore that value?
Then how do I get the real training loss and validation loss values to understand model performance?
# `model` is the trained TFDF model (model_7 in the snippet above).
insp = model.make_inspector()
# Loss from the model's self-evaluation (the validation dataset for GBT models).
print(insp.evaluation().loss)
# The full sequence of validation losses is available in the training logs:
print([log.evaluation.loss for log in insp.training_logs()])
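If it helps to visualize that sequence, here is a small sketch (assuming matplotlib is installed and insp is the inspector created above) that plots the validation loss against the number of trees:

import matplotlib.pyplot as plt

logs = insp.training_logs()
plt.plot([log.num_trees for log in logs], [log.evaluation.loss for log in logs])
plt.xlabel("Number of trees")
plt.ylabel("Validation loss")
plt.show()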
For the training logs, the procedure is a bit different:
raw_logs = insp.specialized_header().training_logs.entries
# Prints a list of all training losses
print([log.training_loss for log in raw_logs])
Note that only GradientBoostedTrees models have validation losses and training losses (since only they use a validation dataset). For RandomForests, these losses are either unavailable or computed on the out-of-bag dataset.
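For completeness, a hedged sketch of the RandomForest case: the out-of-bag (OOB) evaluation is exposed through the same inspector. The compute_oob_performances hyperparameter and the rmse field of the evaluation are my recollection of the TFDF API, not something stated above, so verify them against the documentation.

import tensorflow_decision_forests as tfdf

rf_model = tfdf.keras.RandomForestModel(
    task=tfdf.keras.Task.REGRESSION,
    compute_oob_performances=True,  # believed to be the default; needed for OOB logs
)
rf_model.fit(train_ds)  # reuse a training dataset such as train_ds above

rf_insp = rf_model.make_inspector()
# Self-evaluation computed on the out-of-bag examples.
print(rf_insp.evaluation())

# OOB metrics as a function of the number of trees (rmse for regression).
print([(log.num_trees, log.evaluation.rmse) for log in rf_insp.training_logs()])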