Memory leak during prediction in TensorFlow

Recently I trained two MLP models and saved their weights for later use. My model-loading function contains this code:

import logging

import tensorflow as tf
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Input, Dense, Dropout, LayerNormalization

logger = logging.getLogger(__name__)

def creat_model_extractor(model_path, feature_count):
    try:
        tf.keras.backend.clear_session()
        node_list = [1024, 512, 256, 128, 64, 32]

        model = Sequential()
        model.add(Input(shape=(feature_count,)))

        for node in node_list:
            model.add(Dense(node, activation='relu'))
            model.add(Dropout(0.2))
            model.add(LayerNormalization())

        model.add(Dense(16, activation='relu'))
        model.add(LayerNormalization())
        model.add(Dense(1, activation='sigmoid'))

        model.load_weights(model_path)
        model.trainable = False
        for layer in model.layers:
            layer.trainable = False
    except Exception as error:
        logger.warning(error, exc_info=True)
        return None

    return model
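(For reference, the saved weights come from the same architecture. A stripped-down round-trip of the save → rebuild → load_weights → freeze pattern I use, with a throwaway two-layer model and feature count, looks like this:)

```python
import os
import tempfile

import tensorflow as tf

def build(feature_count):
    # Same rebuild-the-architecture idea as above, shrunk to two layers.
    return tf.keras.Sequential([
        tf.keras.layers.Input(shape=(feature_count,)),
        tf.keras.layers.Dense(8, activation="relu"),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])

weights_path = os.path.join(tempfile.mkdtemp(), "model.weights.h5")
build(6).save_weights(weights_path)  # stands in for training + saving

model = build(6)                     # architecture must match the checkpoint exactly
model.load_weights(weights_path)
model.trainable = False              # freeze: inference only from here on
```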

And this is my prediction code:

@tf.function
def inference(model, inputs):
    return tf.stop_gradient(model(inputs, training=False))

predictions = inference(setting.SMALL_MODEL, small_blocks_normal)
small_blocks['label'] = (predictions > 0.5).numpy().astype(int)
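If it is relevant: I understand that `tf.function` retraces (and caches a new concrete function) for every new input shape it sees, so one variant I can sketch pins the signature instead. The model and sizes below are placeholders, not my real ones:

```python
import numpy as np
import tensorflow as tf

FEATURE_COUNT = 8  # placeholder; my real models have their own sizes

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(FEATURE_COUNT,)),
    tf.keras.layers.Dense(4, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.trainable = False

# A fixed input_signature means exactly one trace regardless of batch size,
# so the concrete-function cache cannot grow with every differently
# shaped batch that comes in.
@tf.function(input_signature=[tf.TensorSpec(shape=(None, FEATURE_COUNT), dtype=tf.float32)])
def inference(inputs):
    return tf.stop_gradient(model(inputs, training=False))

predictions = inference(np.random.rand(5, FEATURE_COUNT).astype(np.float32))
labels = (predictions.numpy() > 0.5).astype(int)
```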

I have a question about a memory leak.

After predicting labels, the memory is never released—even after the function finishes executing.
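To be concrete about "never released": I am watching the process's resident set size. A helper like this (Linux-only, since it reads /proc) is what I mean by the memory not going down:

```python
import os

def rss_mb():
    # Current resident set size of this process in MB.
    # Linux-only: field 1 of /proc/self/statm is resident pages.
    with open("/proc/self/statm") as f:
        resident_pages = int(f.read().split()[1])
    return resident_pages * os.sysconf("SC_PAGE_SIZE") / 1e6

before = rss_mb()
# ... run inference here ...
after = rss_mb()  # this number stays high long after predictions are done
```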

What should I do? I tried the following code, but it didn’t work.

I run on CPU only; I don't have a GPU.

import gc
tf.keras.backend.clear_session() 
del predictions
gc.collect()
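One workaround I am considering (sketched here with a dummy NumPy "model" standing in for the real TensorFlow calls, since this only shows the pattern): run each prediction in a short-lived child process, so the OS reclaims everything the framework allocated when the child exits.

```python
import multiprocessing as mp

import numpy as np

def _predict_in_child(model_path, features, queue):
    # In the real version this would call creat_model_extractor(model_path, ...)
    # and model(features, training=False); the sigmoid below is a stand-in
    # so the sketch runs without TensorFlow.
    preds = 1.0 / (1.0 + np.exp(-features.sum(axis=1)))
    queue.put(preds)

def predict_isolated(model_path, features):
    # Every framework allocation lives and dies inside the child, so the
    # parent's memory is untouched once join() returns.
    queue = mp.Queue()
    child = mp.Process(target=_predict_in_child, args=(model_path, features, queue))
    child.start()
    preds = queue.get()  # read before join() to avoid a full-pipe deadlock
    child.join()
    return preds

if __name__ == "__main__":
    labels = (predict_isolated("small_model.h5", np.random.rand(4, 8)) > 0.5).astype(int)
```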