Masking propagation through layers

I’m confused about how masks are handled, since there seems to be a conflict.

Based on the above two points, my understanding is that a layer is only capable of handling masks if its call method has a mask argument, and that if a layer without this argument sits downstream of a masking layer, an exception will be raised.
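To make that concrete, here is a minimal sketch of what I understand a mask-consuming layer to look like (the ZeroMasked layer is hypothetical, purely for illustration):

import tensorflow as tf

class ZeroMasked(tf.keras.layers.Layer):
    """Zeroes out masked timesteps and passes the mask downstream."""

    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.supports_masking = True  # propagate the mask to the next layer

    def call(self, inputs, mask=None):
        # `mask` is a boolean tensor of shape (batch, timesteps)
        if mask is None:
            return inputs
        return inputs * tf.cast(mask, inputs.dtype)[:, :, tf.newaxis]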

But the following example (modified from this example) doesn’t match that understanding: no exception is thrown, even though TransformerEncoder’s call method doesn’t have a mask argument (it does have attention_mask, but that is distinct).

import tensorflow as tf
import numpy as np
import tensorflow_models as tfm

samples, timesteps, features = 32, 10, 8
inputs = np.random.random([samples, timesteps, features]).astype(np.float32)
inputs[:, 3, :] = 0.  # zero out timesteps 3 and 5 so Masking will mask them
inputs[:, 5, :] = 0.

model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Masking(mask_value=0.,
                                  input_shape=(timesteps, features)))
model.add(tfm.nlp.models.TransformerEncoder(
    num_layers=1,
    num_attention_heads=2,
    intermediate_size=16,
))

output = model(inputs)

Why is no exception raised here? Is the masking layer actually working?

Any help will be appreciated!

Hi,

An exception is raised when the downstream layer has no way to handle the mask.
But here in your code you didn’t get an exception because you used the TransformerEncoder model after the Masking layer, and this model handles masking internally. In the same way, LSTM can also handle masks internally, as the sketch below shows.
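For example, the following runs without an exception because LSTM consumes the mask internally (a minimal sketch reusing the random data pattern from your code):

import numpy as np
import tensorflow as tf

samples, timesteps, features = 32, 10, 8
inputs = np.random.random((samples, timesteps, features)).astype(np.float32)
inputs[:, 3, :] = 0.  # this timestep will be masked

model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Masking(mask_value=0.,
                                  input_shape=(timesteps, features)))
model.add(tf.keras.layers.LSTM(16))  # LSTM skips masked timesteps internally

output = model(inputs)  # no exception; masked steps don’t update the state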

If you use the code below, you’ll get an exception while fitting the model, because the Dense layer doesn’t have the functionality to handle masks internally.

import keras
import numpy as np

model = keras.models.Sequential()
model.add(keras.layers.Masking(mask_value=0.))
model.add(keras.layers.Dense(32))

input_data = np.random.random((16, 10, 5))
input_data[:, 2, :] = 0  # this timestep will be masked

model.compile(optimizer='adam', loss='mse')
# Dense(32) outputs shape (16, 10, 32), so the targets match that shape;
# the exception comes from the mask, not from a shape mismatch.
model.fit(input_data, np.random.random((16, 10, 32)), epochs=5)
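If you want to verify which layers can handle a mask, you can inspect the supports_masking attribute, and you can check what mask the Masking layer actually produces with compute_mask (a quick sketch; the exact attribute values depend on your Keras version):

import numpy as np
import tensorflow as tf

print(tf.keras.layers.LSTM(16).supports_masking)   # True: consumes the mask
print(tf.keras.layers.Dense(32).supports_masking)  # version-dependent

# Inspect the boolean mask the Masking layer produces for your data
masking = tf.keras.layers.Masking(mask_value=0.)
data = np.random.random((2, 4, 3)).astype(np.float32)
data[:, 1, :] = 0.
print(masking.compute_mask(data))  # timestep 1 is False (masked out)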

Let me know if you need more information. Thanks!