Hi, I am working on a project where I am trying to predict a pathology from patient anamnesis. The vocab size is 55000, and there are 3 classes.
I use an embedding layer before an LSTM/GRU.
What is the best embedding dimension for this case?
Hi @Francesca_Pisani .
The documentation of keras.layers.Embedding is here.
In your case, the embedding layer will probably look like this:
tf.keras.layers.Embedding(
    input_dim=55000,            # Size of the vocabulary (maximum integer index + 1)
    output_dim=EMBEDDING_DIMS,  # Dimension of the dense embedding
    input_length=MAX_LEN)       # Length of the longest input sequence
Note that the number of classes (3) does not affect the embedding layer. It determines the size of the output layer at the end of the model, typically a Dense layer with 3 units.
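To show where the 3 classes end up, here is a minimal sketch of a full model, not a tuned architecture. The values EMBEDDING_DIMS = 64, MAX_LEN = 100, and the 32 GRU units are placeholders I picked for illustration, not recommendations; treat the embedding dimension as a hyperparameter to tune on your validation data. The sketch also assumes integer labels (0, 1, 2), which is why it uses sparse categorical cross-entropy. The sequence length is set on the Input rather than through input_length.

```python
import numpy as np
import tensorflow as tf

VOCAB_SIZE = 55000    # from the question
NUM_CLASSES = 3       # from the question
EMBEDDING_DIMS = 64   # assumption: placeholder value, tune it
MAX_LEN = 100         # assumption: placeholder padded sequence length

# Token ids in, one probability per class out.
inputs = tf.keras.Input(shape=(MAX_LEN,), dtype="int32")
embedding = tf.keras.layers.Embedding(input_dim=VOCAB_SIZE,
                                      output_dim=EMBEDDING_DIMS)
x = embedding(inputs)
x = tf.keras.layers.GRU(32)(x)  # assumption: 32 units, could be an LSTM instead
outputs = tf.keras.layers.Dense(NUM_CLASSES, activation="softmax")(x)

model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",  # assumes integer labels
              metrics=["accuracy"])

# Sanity check on random token ids: the output has one column per class.
dummy_batch = np.random.randint(0, VOCAB_SIZE, size=(2, MAX_LEN))
probs = model.predict(dummy_batch, verbose=0)
print(probs.shape)
print(embedding.count_params())  # VOCAB_SIZE * EMBEDDING_DIMS weights
```

The embedding layer holds VOCAB_SIZE * EMBEDDING_DIMS weights, so with a 55000-word vocabulary every extra embedding dimension adds 55000 parameters. That is worth keeping in mind if your dataset is small.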