How to save and load a TensorFlow Decision Forests regression model for incremental learning

import pandas as pd
import tensorflow_decision_forests as tfdf

Read the data

file_path = '/content/finaldata3.csv'
df = pd.read_csv(file_path)

df

columns_to_delete = ['Unnamed: 8']

df.drop(columns=columns_to_delete, inplace=True)

X = df[['Material_Id', 'Vendor_Id']]
y = df['Lead_Time']

y = pd.to_numeric(y)

dataset = tfdf.keras.pd_dataframe_to_tf_dataset(df, label="Lead_Time", task=tfdf.keras.Task.REGRESSION)

model = tfdf.keras.RandomForestModel(task=tfdf.keras.Task.REGRESSION)
model.compile(metrics=["mse"])
model.fit(dataset)

import pickle

with open("/content/trained_model2.pkl", "wb") as f:
    pickle.dump(model, f)

with open("/content/trained_model2.pkl", "rb") as f:
    pretrained_model = pickle.load(f)

new_data_file_path = '/content/mat1ven3.csv'
new_data_df = pd.read_csv(new_data_file_path)

X_new = new_data_df[['Material_Id', 'Vendor_Id']]
y_new = new_data_df['Lead_Time']

y_new = pd.to_numeric(y_new)

new_dataset = tfdf.keras.pd_dataframe_to_tf_dataset(new_data_df, label="Lead_Time", task=tfdf.keras.Task.REGRESSION)

pretrained_model.compile(metrics=["mse"])

pretrained_model.fit(new_dataset)

I am getting the following error:
ValueError: The model's task attribute (CLASSIFICATION) does not match the task attribute passed to pd_dataframe_to_tf_dataset (REGRESSION).

Hi @Swasthik_Shivananda,

Welcome to the TensorFlow Forum.

It seems like there might be an issue with the model’s task attribute either during its initial creation or when loading it from the pickle file.
This typically happens if the model you saved was initially created with a classification task, but you’re trying to use it for regression now.

Here’s the corrected code:

!pip install tensorflow_decision_forests
import pandas as pd
import tensorflow_decision_forests as tfdf
import pickle

# Reading the data
file_path = '/content/finaldata3.csv'
df = pd.read_csv(file_path)

# Dropping unnecessary columns
columns_to_delete = ['Unnamed: 8']
df.drop(columns=columns_to_delete, inplace=True)

# Extracting features and labels
X = df[['Material_Id', 'Vendor_Id']]
y = df['Lead_Time']

# Converting the label to numeric
y = pd.to_numeric(y)

# Creating the dataset for TensorFlow Decision Forests
dataset = tfdf.keras.pd_dataframe_to_tf_dataset(df, label="Lead_Time", task=tfdf.keras.Task.REGRESSION)

# Creating and training the model
model = tfdf.keras.RandomForestModel(task=tfdf.keras.Task.REGRESSION)
model.compile(metrics=["mse"])
model.fit(dataset)

# Saving the model
with open("/content/trained_model2.pkl", "wb") as f:
    pickle.dump(model, f)

# Loading the model
with open("/content/trained_model2.pkl", "rb") as f:
    pretrained_model = pickle.load(f)

# Reading new data
new_data_file_path = '/content/mat1ven3.csv'
new_data_df = pd.read_csv(new_data_file_path)

# Extracting features and labels from the new data
X_new = new_data_df[['Material_Id', 'Vendor_Id']]
y_new = new_data_df['Lead_Time']

# Converting the new labels to numeric
y_new = pd.to_numeric(y_new)

# Creating the new dataset
new_dataset = tfdf.keras.pd_dataframe_to_tf_dataset(new_data_df, label="Lead_Time", task=tfdf.keras.Task.REGRESSION)

# Ensuring the loaded model has the correct task
if pretrained_model.task != tfdf.keras.Task.REGRESSION:
    pretrained_model.task = tfdf.keras.Task.REGRESSION

# Compiling and retraining the loaded model with new data
pretrained_model.compile(metrics=["mse"])
pretrained_model.fit(new_dataset)

  • Ensure both the model initialization and pd_dataframe_to_tf_dataset use tfdf.keras.Task.REGRESSION.
  • Verify the model is compiled before fitting it with the new dataset.

Hope this helps.

Thank you!

Hi, TF-DF developer here. Thank you for the question!

A few points to note:

  • Pickling TF-DF models is not supported: the model loaded from the pickle file is empty and cannot be used. This is what causes the error here. Instead, save and load the model without pickle.
  • A model's task has to be set in the constructor. Assigning pretrained_model.task = tfdf.keras.Task.REGRESSION afterwards fails immediately with an error message.
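
As a rough illustration of saving and loading without pickle, here is a minimal sketch. It assumes the model's own Keras save method and the Keras load_model function are used in place of pickle.dump and pickle.load. The helper name save_and_reload and the export path are hypothetical, and the fallback between the tf_keras package and tensorflow.keras is an assumption that depends on the installed TensorFlow and TF-DF versions.

```python
def save_and_reload(model, export_dir):
    """Save a trained TF-DF Keras model to a directory and load it back.

    Hypothetical helper for illustration; it replaces the pickle.dump /
    pickle.load pair from the question.
    """
    # Keras is imported lazily so the helper can be defined without
    # TensorFlow installed. Assumption: recent TF-DF releases rely on the
    # separate tf_keras package, older ones on tensorflow.keras.
    try:
        import tf_keras as keras
    except ImportError:
        from tensorflow import keras

    # Write the model to disk in the SavedModel format.
    model.save(export_dir)

    # Read it back as a Keras model object.
    return keras.models.load_model(export_dir)
```

With the objects from the question, this would be called as reloaded = save_and_reload(model, "/content/trained_model2"), where the directory name is only an example. The sketch covers saving and loading only; the reloaded object is intended for prediction and evaluation, and the sketch does not assume that it can be passed to fit() again.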

For faster replies to questions about TF-DF, please post them on the Issues or Discussions page of the TF-DF GitHub repository (tensorflow/decision-forests).