This question is based on the TFX recommender tutorial. Please note that the code is being orchestrated by LocalDagRunner
rather than run interactively in a notebook.
In the Trainer, we pass in a custom_config with the transformed ratings and movies:
trainer = tfx.components.Trainer(
    module_file=os.path.abspath(_trainer_module_file),
    examples=ratings_transform.outputs['transformed_examples'],
    transform_graph=ratings_transform.outputs['transform_graph'],
    schema=ratings_transform.outputs['post_transform_schema'],
    train_args=tfx.proto.TrainArgs(num_steps=500),
    eval_args=tfx.proto.EvalArgs(num_steps=10),
    custom_config={
        'epochs': 5,
        'movies': movies_transform.outputs['transformed_examples'],
        'movie_schema': movies_transform.outputs['post_transform_schema'],
        'ratings': ratings_transform.outputs['transformed_examples'],
        'ratings_schema': ratings_transform.outputs['post_transform_schema'],
    })
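For context, here is roughly how the trainer module file consumes these custom_config entries (a condensed sketch; the exact variable names and dataset-loading details may differ from the tutorial):

import glob
import os

import tensorflow as tf
from tfx.types import artifact_utils


def run_fn(fn_args):
  # Channels handed over via custom_config in the pipeline definition above.
  movies_uri = fn_args.custom_config['movies']
  # Under LocalDagRunner this list comes back empty, which is where things break.
  movies_artifact = movies_uri.get()[0]
  input_dir = artifact_utils.get_split_uri([movies_artifact], 'train')
  movie_files = glob.glob(os.path.join(input_dir, '*'))
  movies = tf.data.TFRecordDataset(movie_files, compression_type='GZIP')
  # ... the 'ratings' entry is read the same way, and both are then passed
  # into the MovielensModel shown below ...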
The problem is that all of the outputs passed into custom_config seem to be empty, which results in errors. For example, consider this part of the model:
class MovielensModel(tfrs.Model):

  def __init__(self, user_model, movie_model, tf_transform_output, movies_uri):
    super().__init__()
    self.movie_model: tf.keras.Model = movie_model
    self.user_model: tf.keras.Model = user_model
    movies_artifact = movies_uri.get()[0]
Here movies_uri.get() returns an empty list. The same is true for ratings. The ratings passed in through the examples parameter, however, are not empty (the artifact URI is available), so it seems as though custom_config is ‘breaking things’.
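For contrast, the inputs passed through the regular Trainer parameters do resolve to concrete paths by the time run_fn executes. Something along these lines behaves as expected (attribute names follow the standard TFX Trainer FnArgs contract; this is a sketch, not the tutorial's exact code):

import tensorflow_transform as tft


def run_fn(fn_args):
  # From the 'examples' parameter: concrete file patterns for the train split.
  train_files = fn_args.train_files
  # From the 'transform_graph' parameter: a usable TFTransformOutput.
  tf_transform_output = tft.TFTransformOutput(fn_args.transform_graph_path)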
I have tried debugging this but to no avail. I did notice that the arguments in custom_config get serialised and deserialised, but that didn't seem to be the cause of the problem (a sketch of the kind of check I ran is below). Does anyone know why this happens and how to resolve it?
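For completeness, this is roughly the kind of check I mean (illustrative only, not from the tutorial), dumping what the Trainer actually receives in custom_config:

from absl import logging


def run_fn(fn_args):
  # Log each deserialised custom_config entry as it arrives on the trainer side.
  for key, value in fn_args.custom_config.items():
    logging.info('custom_config[%r]: type=%s, value=%r', key, type(value), value)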