Hello!
I have variable-length sequences, so I need to pad and mask them to a fixed number of timesteps.
I was wondering how exactly the Bidirectional layer handles the masked timesteps when merging the outputs of the forward and backward LSTMs (the LSTM has return_sequences=True).
For example, suppose an input sequence is [1.0, 2.0, 3.0], and I pad it to length 5 with -1.0, so it becomes [1.0, 2.0, 3.0, -1.0, -1.0]. I use the Masking layer to mask the last two timesteps, and then feed the masked sequence to the Bidirectional(LSTM) like the following:
output = Bidirectional(LSTM(1, return_sequences=True))(masked_input)
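
For concreteness, here is a minimal, self-contained version of my setup (the shapes and the -1.0 mask value are just the ones from the example above; I am using TensorFlow's Keras):

from tensorflow.keras.layers import Input, Masking, Bidirectional, LSTM
from tensorflow.keras.models import Model

# Variable-length scalar sequences; -1.0 marks the padded timesteps.
inputs = Input(shape=(None, 1))
masked_input = Masking(mask_value=-1.0)(inputs)
# merge_mode defaults to 'concat', so each timestep's output has 2 units.
output = Bidirectional(LSTM(1, return_sequences=True))(masked_input)
model = Model(inputs, output)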
Suppose the output of the forward LSTM is [0.1, 0.2, 0.3, 0.0, 0.0], since the last two timesteps are masked, and the output of the backward LSTM is [0.0, 0.0, 0.4, 0.5, 0.6]. When using concatenate mode (merge_mode='concat'), will the Bidirectional layer merge these two outputs like the following?
[[0.1, 0.0], [0.2, 0.0], [0.3, 0.4], [0.0, 0.5], [0.0, 0.6]]
Or will it merge them like the following?
[[0.1, 0.4], [0.2, 0.5], [0.3, 0.6], [0.0, 0.0], [0.0, 0.0]]
I hope it is the second case.
If it is the first case, the result would be different from directly using the input [1.0, 2.0, 3.0] without padding and masking, and that is not what we want. It would be especially bad when the padding is longer than the original sequence (e.g., a length-3 sequence padded to length 7): every valid output value of one LSTM would be concatenated with a 0.0 from the other, because the two valid ranges would not even overlap.
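
One way to check which case holds (a sketch, reusing the model from the snippet above) is to run the same model on the padded and the unpadded input and compare the valid timesteps:

import numpy as np

padded = np.array([[[1.0], [2.0], [3.0], [-1.0], [-1.0]]])
unpadded = np.array([[[1.0], [2.0], [3.0]]])

out_padded = model.predict(padded)      # shape (1, 5, 2)
out_unpadded = model.predict(unpadded)  # shape (1, 3, 2)

# In the second (desired) case, the first three timesteps of the
# padded output should match the unpadded output.
print(np.allclose(out_padded[0, :3], out_unpadded[0], atol=1e-6))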
I would appreciate your advice on how the Bidirectional layer in Keras performs the merge when there are masked timesteps.
Thank you very much!