I am working on a deep-learning-based recommendation system using the MovieLens 100K dataset. It has features such as user ID, movie ID, rating, title, and genre. How can I convert the movie title or genre into embeddings? At the input layer I would then concatenate all the embedding vectors.
Have you already checked these two tutorials?
The following worked for me:
unique_title_ds = <dataset of unique titles>
max_tokens = 10_000
embedding_dimension = 32
self.title_vectorizer = tf.keras.layers.TextVectorization(max_tokens=max_tokens)
self.title_text_embedding = tf.keras.Sequential([
    self.title_vectorizer,
    tf.keras.layers.Embedding(max_tokens, embedding_dimension, mask_zero=True),
    tf.keras.layers.GlobalAveragePooling1D(),
])
self.title_vectorizer.adapt(unique_title_ds)
And invoking my item model concatenates the item embedding with the title embedding:
def call(self, items):
    return tf.concat([
        self.item_embedding(items),
        self.title_text_embedding(items),
    ], axis=1)
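For the genre part of the question, the same idea should carry over. Below is a minimal, self-contained sketch rather than a definitive implementation. It assumes each movie has a single genre string, and the toy `titles` and `genres` tensors, the `ItemModel` class name, and the `"title"` / `"genre"` feature keys are all made up for illustration. It calls the layers one by one instead of wrapping them in a `Sequential`, and passes the padding mask to the pooling layer explicitly.

```python
import tensorflow as tf

# Hypothetical toy data standing in for MovieLens titles and genres.
titles = tf.constant(["Toy Story (1995)", "GoldenEye (1995)", "Four Rooms (1995)"])
genres = tf.constant(["Animation", "Action", "Thriller"])

max_tokens = 10_000
embedding_dimension = 32


class ItemModel(tf.keras.Model):
    """Illustrative item tower: title text embedding + genre embedding."""

    def __init__(self):
        super().__init__()
        # Title: split into word tokens and map each token to an integer id.
        self.title_vectorizer = tf.keras.layers.TextVectorization(max_tokens=max_tokens)
        self.title_vectorizer.adapt(titles)
        self.title_token_embedding = tf.keras.layers.Embedding(max_tokens, embedding_dimension)
        self.title_pooling = tf.keras.layers.GlobalAveragePooling1D()

        # Genre: map each genre string to an integer index, then embed it.
        self.genre_lookup = tf.keras.layers.StringLookup()
        self.genre_lookup.adapt(genres)
        self.genre_embedding = tf.keras.layers.Embedding(
            self.genre_lookup.vocabulary_size(), embedding_dimension
        )

    def call(self, features):
        token_ids = self.title_vectorizer(features["title"])
        token_vectors = self.title_token_embedding(token_ids)
        # Token id 0 is padding; exclude it from the average.
        title_vector = self.title_pooling(token_vectors, mask=tf.not_equal(token_ids, 0))

        genre_vector = self.genre_embedding(self.genre_lookup(features["genre"]))

        # One row per movie: [title embedding | genre embedding].
        return tf.concat([title_vector, genre_vector], axis=1)


model = ItemModel()
output = model({"title": titles, "genre": genres})
print(output.shape)
```

Each movie ends up with one vector of length 2 * embedding_dimension. User ID and movie ID embeddings could be concatenated along the same axis in the same way.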
Hi, if we consider only one genre per movie, can we use label encoding instead of one-hot encoding?