This question is based on the TFX recommender tutorial. Note that the code is orchestrated by LocalDagRunner
rather than run interactively in a notebook.
We have a MovielensModel with a compute_loss method as follows:
class MovielensModel(tfrs.Model):

    def __init__(self, user_model, movie_model, tf_transform_output: TFTransformOutput, movie_uris: List[str]):
        super().__init__()
        self.movie_model: tf.keras.Model = movie_model
        self.user_model: tf.keras.Model = user_model
        movie_files = glob.glob(os.path.join(movie_uris[0], '*'))
        movies = tf.data.TFRecordDataset(movie_files, compression_type="GZIP")
        movies_dataset = extract_str_feature(movies, 'movie_title')
        loss_metrics = tfrs.metrics.FactorizedTopK(
            candidates=movies_dataset.batch(128).map(movie_model)
        )
        self.task: tf.keras.layers.Layer = tfrs.tasks.Retrieval(
            metrics=loss_metrics
        )

    def compute_loss(self, features: Dict[Text, tf.Tensor], training=False) -> tf.Tensor:
        # We pick out the user features and pass them into the user model.
        # try:
        user_embeddings = tf.squeeze(self.user_model(features['user_id']), axis=1)
        # And pick out the movie features and pass them into the movie model,
        # getting embeddings back.
        print(features['movie_title'])
        print(type(features['movie_title']))
        positive_movie_embeddings = self.movie_model(features['movie_title'])
        # The task computes the loss and the metrics.
        _task = self.task(user_embeddings, positive_movie_embeddings)
        # except BaseException as err:
        #     logging.error('######## ERROR IN compute_loss:\n{}\n###############'.format(err))
        return _task
The model fails on the line positive_movie_embeddings = self.movie_model(features['movie_title'])
with the following error:
W tensorflow/core/framework/op_kernel.cc:1722] OP_REQUIRES failed at cast_op.cc:121 : UNIMPLEMENTED: Cast string to float is not supported
We also see in the trace:
Node: 'sequential_1/Cast'
Cast string to float is not supported
[[{{node sequential_1/Cast}}]] [Op:__inference_train_function_1214007]
ERROR:absl:######## ERROR IN run_fn during fit:
Graph execution error:
Printing features['movie_title'] shows that it is a SparseTensor:
SparseTensor(indices=Tensor("inputs_21_copy:0", shape=(None, 2), dtype=int64), values=Tensor("inputs_22_copy:0", shape=(None,), dtype=string), dense_shape=Tensor("inputs_23_copy:0", shape=(2,), dtype=int64))
The string dtype of the values is expected, since the data files for this tutorial contain movie titles as strings.
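For illustration, here is a minimal sketch of how a sparse string feature like the one printed above can be converted to a dense tensor before it reaches a lookup layer. The sample titles, the vocabulary, and the helper name to_dense_strings are hypothetical stand-ins, not taken from the tutorial or from our pipeline. Whether this has any bearing on the cast error is an assumption; it likely also depends on how movie_model declares the dtype of its input.

```python
import tensorflow as tf

# Hypothetical stand-in for features['movie_title']:
# a batch of three rows with one title per row.
sparse_titles = tf.sparse.SparseTensor(
    indices=[[0, 0], [1, 0], [2, 0]],
    values=tf.constant(["Toy Story (1995)", "Heat (1995)", "Casino (1995)"]),
    dense_shape=[3, 1],
)


def to_dense_strings(feature):
    """Return a dense tensor; missing sparse entries become ''."""
    if isinstance(feature, tf.sparse.SparseTensor):
        return tf.sparse.to_dense(feature, default_value="")
    return feature


# Dense string tensor of shape (3, 1).
dense_titles = to_dense_strings(sparse_titles)

# Hypothetical vocabulary; in the real pipeline it would come from the
# transform output or the movies dataset.
lookup = tf.keras.layers.StringLookup(
    vocabulary=["Toy Story (1995)", "Heat (1995)", "Casino (1995)"],
    mask_token=None,
)

# Drop the trailing dimension and map each title to an integer id.
title_ids = lookup(tf.squeeze(dense_titles, axis=1))
```

In compute_loss, the same helper could be applied to features['movie_title'] before calling self.movie_model, assuming the model expects a dense string input.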
I have looked at the other SO posts on this error but cannot relate them to this context. What could be causing this issue?