From the documentation on tf.keras.layers.Embedding:
input_dim: Integer. Size of the vocabulary, i.e. maximum integer index + 1.

mask_zero: Boolean, whether or not the input value 0 is a special “padding” value that should be masked out. This is useful when using recurrent layers which may take variable length input. If this is True, then all subsequent layers in the model need to support masking or an exception will be raised. If mask_zero is set to True, as a consequence, index 0 cannot be used in the vocabulary (input_dim should equal size of vocabulary + 1).
- If my vocabulary size is `n` but the tokens are encoded with index values from 1 to `n` (0 is reserved for padding, as in the sketch below), is `input_dim` equal to `n` or `n + 1`? The "maximum integer index + 1" part of the documentation is confusing me.
- If the inputs are padded with zeroes, what are the consequences of leaving `mask_zero = False`?
- If `mask_zero = True`, do I then, based on the documentation, have to increment the answer to my first question by one? What is the expected behaviour if this is not done?