VectorQuantization: tf.stop_gradient (STE) inside a for loop/recurrent network

Hi, I am trying to get straight-through estimation (STE) running for a vector quantization (VQ) layer. The VQ layer sits inside a recurrent network I built. When I call the recurrent network on some input, the following happens:

            # outputs is a tf.TensorArray that collects the per-timestep reconstructions
            for i in tf.range(timesteps):
                encoded = self.encoder(input_data[i], state)     # encode the current timestep
                encoded_q = self.quantizer(encoded)              # vector quantization (with straight-through estimation)
                decoded, state = self.decoder(encoded_q, state)  # decode and carry the state forward

                outputs = outputs.write(i, decoded)

The vector quantizer layer is essentially a copy of the standard VQ-VAE implementation from the Vector-Quantized Variational Autoencoders example.
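For completeness, this is roughly what the quantizer looks like (it follows the layer from that example; num_embeddings, embedding_dim and beta are the usual hyperparameters, and the straight-through part is the x + tf.stop_gradient(quantized - x) line at the end):

    import tensorflow as tf

    class VectorQuantizer(tf.keras.layers.Layer):
        def __init__(self, num_embeddings, embedding_dim, beta=0.25, **kwargs):
            super().__init__(**kwargs)
            self.embedding_dim = embedding_dim
            self.num_embeddings = num_embeddings
            self.beta = beta  # commitment loss weight
            w_init = tf.random_uniform_initializer()
            self.embeddings = tf.Variable(
                initial_value=w_init(shape=(embedding_dim, num_embeddings), dtype="float32"),
                trainable=True,
            )

        def call(self, x):
            # Flatten everything except the embedding dimension.
            input_shape = tf.shape(x)
            flattened = tf.reshape(x, [-1, self.embedding_dim])

            # Look up the nearest codebook vectors.
            encoding_indices = self.get_code_indices(flattened)
            encodings = tf.one_hot(encoding_indices, self.num_embeddings)
            quantized = tf.matmul(encodings, self.embeddings, transpose_b=True)
            quantized = tf.reshape(quantized, input_shape)

            # Codebook and commitment losses.
            commitment_loss = tf.reduce_mean((tf.stop_gradient(quantized) - x) ** 2)
            codebook_loss = tf.reduce_mean((quantized - tf.stop_gradient(x)) ** 2)
            self.add_loss(self.beta * commitment_loss + codebook_loss)

            # Straight-through estimator: the forward pass uses the quantized values,
            # the backward pass copies the gradients straight to x.
            quantized = x + tf.stop_gradient(quantized - x)
            return quantized

        def get_code_indices(self, flattened_inputs):
            # Squared L2 distance between the inputs and the codebook vectors.
            similarity = tf.matmul(flattened_inputs, self.embeddings)
            distances = (
                tf.reduce_sum(flattened_inputs ** 2, axis=1, keepdims=True)
                + tf.reduce_sum(self.embeddings ** 2, axis=0)
                - 2 * similarity
            )
            return tf.argmin(distances, axis=1)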

Now, this raises an error when I try to train the model: apparently the graph cannot be tracked across the loop iterations.
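In case the training code matters, I am calling the model inside a custom training step that looks roughly like this (simplified sketch; model, optimizer, loss_fn and batch stand in for my actual objects, and model.losses carries the codebook/commitment losses added by the quantizer):

    import tensorflow as tf

    @tf.function
    def train_step(model, optimizer, loss_fn, batch):
        with tf.GradientTape() as tape:
            reconstruction = model(batch)          # runs the timestep loop shown above
            loss = loss_fn(batch, reconstruction)  # reconstruction loss
            loss += tf.add_n(model.losses)         # codebook + commitment losses from the quantizer
        grads = tape.gradient(loss, model.trainable_variables)
        optimizer.apply_gradients(zip(grads, model.trainable_variables))
        return loss

The error appears when running this step.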

Can this be solved somehow? I have searched and experimented a lot, but to no avail.