tf.keras.layers.LSTM vs. tf.contrib.rnn.LSTMBlockFusedCell

I have a TF 1.14 model which uses `tf.contrib.rnn.LSTMBlockFusedCell` that I am trying to replicate in TF 2.4. The model is a variant of DeepSpeech v0.5.1.

Both models have one LSTM layer and five Dense layers.

The layer weights are loaded from a DeepSpeech v0.5.1 checkpoint into the TF 2.4 model,
taking care to split the fused kernel into kernel and recurrent_kernel, and to reorder
the gate blocks (i, c, f, o) → (i, f, c, o), as suggested by a kind person here.
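In case it helps others, here is a minimal sketch of that conversion, assuming the checkpoint variables have the usual LSTMBlockFusedCell shapes (fused kernel `[input_dim + num_units, 4 * num_units]`, bias `[4 * num_units]`); the checkpoint path, variable names, and the forget_bias handling are assumptions that must match how the TF 1.14 graph was actually built:

```python
import numpy as np
import tensorflow as tf

def block_fused_to_keras(tf1_kernel, tf1_bias, num_units, forget_bias=1.0):
    # The fused kernel stacks the input weights on top of the
    # recurrent weights: shape [input_dim + num_units, 4 * num_units].
    kernel = tf1_kernel[:-num_units, :]            # [input_dim, 4 * units]
    recurrent_kernel = tf1_kernel[-num_units:, :]  # [units, 4 * units]

    def reorder(w):  # gate blocks: (i, c, f, o) -> (i, f, c, o)
        i, c, f, o = np.split(w, 4, axis=-1)
        return np.concatenate([i, f, c, o], axis=-1)

    kernel = reorder(kernel)
    recurrent_kernel = reorder(recurrent_kernel)
    bias = reorder(tf1_bias).copy()

    # LSTMBlockFusedCell adds forget_bias (TF default 1.0) to the forget
    # gate at run time rather than storing it in the checkpoint; fold it
    # into the Keras bias (assumption: use the value the TF1 graph used).
    bias[num_units:2 * num_units] += forget_bias
    return kernel, recurrent_kernel, bias

# Hypothetical checkpoint path and variable names -- adjust to your graph.
reader = tf.train.load_checkpoint("deepspeech-0.5.1-checkpoint")
k, rk, b = block_fused_to_keras(reader.get_tensor("lstm/kernel"),
                                reader.get_tensor("lstm/bias"),
                                num_units=2048)

# LSTMBlockFusedCell is time-major, so build the Keras layer to match.
lstm = tf.keras.layers.LSTM(2048, return_sequences=True, time_major=True)
lstm(tf.zeros([1, 1, k.shape[0]]))  # build the layer so set_weights works
lstm.set_weights([k, rk, b])
```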

The models are given the same input, and all of the other layers (the five Dense layers) produce the same outputs for the same inputs; only the LSTM layers produce different outputs in the two models.
The final outputs are of the same order of magnitude, but the TF 2.4 result is not close to correct, that is: it does not transcribe audio to text, which the TF 1.14 model does almost satisfactorily.

Does anyone here know whether tf.keras.layers.LSTM and tf.contrib.rnn.LSTMBlockFusedCell are in fact designed to produce identical results? Am I wasting my time trying to get the same outputs?
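For concreteness, here is a sketch of the kind of isolated comparison I mean, with placeholder file names; the reference tensors would be dumped to .npy from the TF 1.14 session, since tf.contrib is unavailable under TF 2.4:

```python
import numpy as np
import tensorflow as tf

# Placeholder dumps from the TF 1.14 session, e.g.
#   np.save("lstm_input.npy", sess.run(lstm_input, feed_dict=...))
x = np.load("lstm_input.npy").astype(np.float32)  # time-major [T, B, input_dim]
ref = np.load("tf1_lstm_output.npy")              # [T, B, num_units]

lstm = tf.keras.layers.LSTM(ref.shape[-1], return_sequences=True,
                            time_major=True)
lstm(tf.zeros([1, 1, x.shape[-1]]))  # build the layer so set_weights works
lstm.set_weights([k, rk, b])         # converted weights from the sketch above

out = lstm(x).numpy()
print("max abs diff:", np.abs(out - ref).max())
```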


You might be interested in exploring the upstream diff showing how they are refactoring the model (also removing the old contrib dependency):

https://github.com/mozilla/DeepSpeech/pull/3485
