'StringLookup' object has no attribute 'vocab_size'

Mahesha_Madhushanka · September 8, 2024, 3:20pm

Hello, I just tried the TensorFlow Recommenders Quickstart tutorial, but I cannot get past this section:

# Define user and movie models.
user_model = tf.keras.Sequential([
    user_ids_vocabulary,
    tf.keras.layers.Embedding(user_ids_vocabulary.vocab_size(), 64)
])
movie_model = tf.keras.Sequential([
    movie_titles_vocabulary,
    tf.keras.layers.Embedding(movie_titles_vocabulary.vocab_size(), 64)
])

# Define your objectives.
task = tfrs.tasks.Retrieval(metrics=tfrs.metrics.FactorizedTopK(
    movies.batch(128).map(movie_model)
  )
)

The output throws an error:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-10-047eba7293c2> in <cell line: 2>()
      2 user_model = tf.keras.Sequential([
      3     user_ids_vocabulary,
----> 4     tf.keras.layers.Embedding(user_ids_vocabulary.vocab_size(), 64)
      5 ])
      6 movie_model = tf.keras.Sequential([

AttributeError: 'StringLookup' object has no attribute 'vocab_size'

How can I fix that?

Kiran_Sai_Ramineni · September 9, 2024, 1:39am

Hi @Mahesha_Madhushanka, the StringLookup object does not have a vocab_size method. Try using user_ids_vocabulary.vocabulary_size() instead, which will not produce that error. Please refer to this gist for a working code example. Thank you.
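
If the same notebook has to run against both older and newer installs, a small helper can pick whichever accessor the layer actually has. This is only a sketch: lookup_vocabulary_size and FakeLookup are hypothetical names made up for illustration, and the assumption that older releases exposed vocab_size() is based on the tutorial code quoted above.

```python
def lookup_vocabulary_size(lookup_layer):
    """Return the vocabulary size using whichever accessor the layer provides."""
    # Prefer the current name, then fall back to the older one (assumed).
    for method_name in ("vocabulary_size", "vocab_size"):
        method = getattr(lookup_layer, method_name, None)
        if callable(method):
            return method()
    raise AttributeError(
        f"{type(lookup_layer).__name__!r} has neither vocabulary_size() nor vocab_size()"
    )


class FakeLookup:
    """Stand-in for a lookup layer so the sketch runs without TensorFlow."""

    def __init__(self, vocabulary):
        self._vocabulary = list(vocabulary)

    def vocabulary_size(self):
        return len(self._vocabulary)


# With a real layer this would be lookup_vocabulary_size(user_ids_vocabulary).
print(lookup_vocabulary_size(FakeLookup(["[UNK]", "138", "92"])))
```

With a real StringLookup layer, the returned value is what gets passed as the first argument to tf.keras.layers.Embedding.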

Mahesha_Madhushanka · September 9, 2024, 4:10am

Hi @Kiran_Sai_Ramineni, thank you for the response. Yes, that cleared the issue, but now a new error appears in FactorizedTopK.
Here is the code block:

# Define user and movie models.
user_model = tf.keras.Sequential([
    user_ids_vocabulary,
    tf.keras.layers.Embedding(user_ids_vocabulary.vocabulary_size(), 64)
])
movie_model = tf.keras.Sequential([
    movie_titles_vocabulary,
    tf.keras.layers.Embedding(movie_titles_vocabulary.vocabulary_size(), 64)
])

# Define your objectives.
task = tfrs.tasks.Retrieval(metrics=tfrs.metrics.FactorizedTopK(
    movies.batch(128).map(movie_model)
  )
)

And the output error is this:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-7-a45b7208100d> in <cell line: 12>()
     10 
     11 # Define your objectives.
---> 12 task = tfrs.tasks.Retrieval(metrics=tfrs.metrics.FactorizedTopK(
     13     movies.batch(128).map(movie_model)
     14   )

5 frames
/usr/local/lib/python3.10/dist-packages/keras/src/backend/common/variables.py in standardize_shape(shape)
    548             continue
    549         if not is_int_dtype(type(e)):
--> 550             raise ValueError(
    551                 f"Cannot convert '{shape}' to a shape. "
    552                 f"Found invalid entry '{e}' of type '{type(e)}'. "

ValueError: Cannot convert '('c', 'o', 'u', 'n', 't', 'e', 'r')' to a shape. Found invalid entry 'c' of type '<class 'str'>'.
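
A note on where this message seems to come from: tuple("counter") is exactly ('c', 'o', 'u', 'n', 't', 'e', 'r'), so it looks as though a weight name is being passed positionally into a parameter that expects a shape. The sketch below reproduces that pattern in plain Python. Both functions are hypothetical stand-ins written for illustration; they are not the Keras or TensorFlow Recommenders source, and the actual cause inside those libraries is an assumption here.

```python
def standardize_shape_sketch(shape):
    """Simplified stand-in for a shape validator (not the real Keras code)."""
    # A string is iterable, so tuple() splits it into single characters.
    shape = tuple(shape)
    for entry in shape:
        if entry is None:
            continue
        if not isinstance(entry, int):
            raise ValueError(
                f"Cannot convert '{shape}' to a shape. "
                f"Found invalid entry '{entry}' of type '{type(entry)}'."
            )
    return shape


def add_weight_sketch(shape=None, name=None):
    """Hypothetical method whose first positional parameter is the shape."""
    return name, standardize_shape_sketch(() if shape is None else shape)


# Passing the name by keyword works as intended.
print(add_weight_sketch(name="counter"))

# Passing the name positionally lands it in `shape` and triggers the error.
try:
    add_weight_sketch("counter")
except ValueError as err:
    print(err)
```

If that reading is right, the error depends on which Keras version ends up handling the call rather than on the data passed to FactorizedTopK.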

Kiran_Sai_Ramineni · September 9, 2024, 5:33am

Hi @Mahesha_Madhushanka, after installing the required modules, could you please try running this code

import os
os.environ['TF_USE_LEGACY_KERAS'] = '1'

to overcome that error. Thank you.
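
One thing worth checking with this workaround is ordering. The sketch below assumes the variable only takes effect when it is set before TensorFlow is imported for the first time in the session, and that the tf-keras package is installed. enable_legacy_keras is a hypothetical helper name, not part of any library.

```python
import os
import sys


def enable_legacy_keras():
    """Set TF_USE_LEGACY_KERAS and report whether TensorFlow was already imported."""
    already_imported = "tensorflow" in sys.modules
    os.environ["TF_USE_LEGACY_KERAS"] = "1"
    return already_imported


if enable_legacy_keras():
    # Assumption: setting the variable this late may have no effect.
    print("TensorFlow was imported earlier; restart the runtime and run this cell first.")
else:
    print("TF_USE_LEGACY_KERAS=1 is set; import tensorflow after this point.")
```

In a notebook, this would be the very first cell after the pip installs, ahead of any import of tensorflow or tensorflow_recommenders.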

Mahesha_Madhushanka · September 9, 2024, 6:36am

@Kiran_Sai_Ramineni, I tried that. The error still occurs on

task = tfrs.tasks.Retrieval(metrics=tfrs.metrics.FactorizedTopK(
    movies.batch(128).map(movie_model)
  )
)

Error:

ValueError: Cannot convert '('c', 'o', 'u', 'n', 't', 'e', 'r')' to a shape. Found invalid entry 'c' of type '<class 'str'>'.

Kiran_Sai_Ramineni · September 9, 2024, 6:47am

Hi @Mahesha_Madhushanka, I tried to execute that in Colab and did not face any error. Please refer to this gist for a working code example. Also, let us know in which environment you are running the code. Thank you.

Mahesha_Madhushanka · September 10, 2024, 4:27am

Hi @Kiran_Sai_Ramineni, yes, it worked. But I found a difference between that code and the Colab code on the documentation site:

user_ids_vocabulary = tf.keras.layers.StringLookup(mask_token=None)
user_ids_vocabulary.adapt(ratings.map(lambda x: x["user_id"]))

movie_titles_vocabulary = tf.keras.layers.StringLookup(mask_token=None)
movie_titles_vocabulary.adapt(movies)

When I replaced the code above with the one you gave me,

user_ids_vocabulary=tf.keras.layers.StringLookup(vocabulary=unique_user_ids, mask_token=None)
movie_titles_vocabulary=tf.keras.layers.StringLookup(vocabulary=unique_movie_titles, mask_token=None)

the error was fixed. Thank you for helping me out.
