I’ve experimented with sklearn’s MLPRegressor class and found that it does fairly well on the dataset I’m looking at without much tuning. However, I’d like to build out a more complex model in TensorFlow/Keras using a split LSTM and Dense network.
To that end, I’m first trying to replicate the performance of MLPRegressor in TensorFlow for a very basic architecture, but I’m struggling so far.
As a first step, I tried to create identical models in each framework; the parameters in the TF implementation are intended to mirror the MLPRegressor documentation, including several of its default values.
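For reference, here is a quick way to dump the defaults I believe I’m matching (this is just my reading of the sklearn docs, so any of these could be where the mismatch comes from):

# Print the sklearn defaults I'm trying to mirror on the Keras side
from sklearn.neural_network import MLPRegressor
defaults = MLPRegressor().get_params()
keys = ('hidden_layer_sizes', 'activation', 'solver', 'alpha', 'batch_size', 'learning_rate_init')
print({k: defaults[k] for k in keys})
# -> {'hidden_layer_sizes': (100,), 'activation': 'relu', 'solver': 'adam',
#     'alpha': 0.0001, 'batch_size': 'auto', 'learning_rate_init': 0.001}
# batch_size='auto' means min(200, n_samples) and alpha is the L2 penalty, which is what
# the Dense(100, relu) -> Dense(1) model with L2(0.0001) and default Adam is meant to reproduce.

Here’s the full comparison script: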
import numpy as np
import matplotlib.pyplot as plt
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model
from tensorflow.keras.wrappers.scikit_learn import KerasRegressor
from tensorflow.keras.regularizers import L2
from tensorflow.random import set_seed
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import Pipeline
from sklearn.compose import TransformedTargetRegressor
from sklearn.preprocessing import StandardScaler
from sklearn.datasets import make_regression
from sklearn.metrics import mean_squared_error

all_sk = []
all_tf = []

for _ in range(100):
    X, y = make_regression(n_samples=10000, n_features=20, n_informative=10, n_targets=1)
    # y = StandardScaler().fit_transform(y[:, None])[:, 0]
    use_scaling = True
    seed = np.random.randint(0, 1000)

    def simple_tf_model():
        # Intended to match MLPRegressor's defaults: one hidden layer of 100 relu units,
        # L2 penalty of 0.0001, MSE loss, Adam optimizer
        set_seed(seed)
        dense_input = Input(shape=(X.shape[1],))
        dense = Dense(100, activation="relu", kernel_regularizer=L2(l2=0.0001))(dense_input)
        dense = Dense(1, activation="linear", kernel_regularizer=L2(l2=0.0001))(dense)
        tf_model = Model(inputs=[dense_input], outputs=dense)
        tf_model.compile(loss="mse", optimizer="adam")
        return tf_model

    # Same number of passes (5) and batch size (200) for both models
    sk_model = MLPRegressor(max_iter=5, hidden_layer_sizes=(100,), batch_size=200, random_state=seed, verbose=1)
    tf_model = KerasRegressor(build_fn=simple_tf_model, batch_size=200, epochs=5, validation_split=0.1)

    if use_scaling:
        # Scale the inputs (Pipeline) and the targets (TransformedTargetRegressor) identically for both
        sk_pipeline = Pipeline([('scaler', StandardScaler()), ('model', sk_model)])
        sk_model = TransformedTargetRegressor(regressor=sk_pipeline, transformer=StandardScaler())
        tf_pipeline = Pipeline([('scaler', StandardScaler()), ('model', tf_model)])
        tf_model = TransformedTargetRegressor(regressor=tf_pipeline, transformer=StandardScaler())

    sk_model.fit(X, y)
    tf_model.fit(X, y)

    sk_preds = sk_model.predict(X)
    tf_preds = tf_model.predict(X)

    def get_mse(preds, name):
        print(name, mean_squared_error(preds, y))
        if name == "SK":
            all_sk.append(mean_squared_error(preds, y))
        else:
            all_tf.append(mean_squared_error(preds, y))

    get_mse(sk_preds, "SK")
    get_mse(tf_preds, "TF")

sk_arr = np.array(all_sk)
tf_arr = np.array(all_tf)

print(sk_arr.mean())  # 350.0151514048654
print(tf_arr.mean())  # 382.19699899150226
Running the code above, two things stand out:
- The loss reported during training is roughly half as large for MLPRegressor as for the TF model. This is also what I’ve observed on the real dataset.
- The final MSE of the predictions on the training set is always lower for MLPRegressor. (Note: I’m not sure whether the random seeds have the same effect on both models, but running the above in a loop and then comparing the means should account for that; a quick paired check is sketched below.)
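To confirm the gap isn’t just seed noise, a paired comparison of the per-iteration MSEs should settle it. A minimal sketch, run after the loop above (assuming scipy is available):

# Each iteration fits both models on the same dataset with the same seed value,
# so the per-run MSEs can be compared as pairs.
from scipy import stats
diff = tf_arr - sk_arr
print(diff.mean(), diff.std())
print(stats.ttest_rel(tf_arr, sk_arr))   # paired t-test over the 100 runs
print(stats.wilcoxon(tf_arr, sk_arr))    # non-parametric alternative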
Any suggestions on why this might be are appreciated.