Full integer conversion of LSTM cell

Lisa · May 27, 2024, 1:38pm

Can anyone help me understand what exactly is done when quantizing an LSTM layer (allowing only integers)? I’m very familiar with the mathematical operations performed in a classic LSTM cell, but I’m having trouble understanding how the sigmoids and tanhs are computed with lookup tables, and when rescaling is performed to keep values from growing too large and overflowing. I also don’t understand what is meant by “The input activations are quantized as uint8 on the interval [-1, 127/128]”. I think I’ve found the source code here: https://github.com/tensorflow/tflite-micro/blob/be11bd79e4a8b28c9ee92c6f02ca0e85414fb768/tensorflow/lite/kernels/internal/reference/lstm_cell.h#L143

Kiran_Sai_Ramineni · June 7, 2024, 6:34am

Hi @Lisa, quantization is the process of reducing the number of bits used to represent weights and activations in a neural network. This is done by scaling the values by a quantization factor (the scale) and rounding them to the nearest integer.
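To make the scale-and-round step concrete, here is a minimal sketch of affine quantization to int8. The helper names `QuantizeToInt8` and `DequantizeFromInt8`, and the use of a zero point, are illustrative assumptions and not code from TensorFlow Lite.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Hypothetical helper: map a real value to an int8 code with
// q = round(x / scale) + zero_point, saturated to [-128, 127].
int8_t QuantizeToInt8(float x, float scale, int zero_point) {
  const int q = static_cast<int>(std::lround(x / scale)) + zero_point;
  return static_cast<int8_t>(std::min(std::max(q, -128), 127));
}

// Inverse mapping: recover an approximation of the real value.
float DequantizeFromInt8(int8_t q, float scale, int zero_point) {
  return scale * static_cast<float>(static_cast<int>(q) - zero_point);
}
```

With a scale of 1/128 and a zero point of 0, the value 0.5 becomes the code 64, and anything at or above 1.0 saturates at 127.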

The lookup table contains precomputed values of the activation function for a range of possible inputs. At runtime, the function value for a given input is retrieved from this table. Please refer to this document to learn more about lookup tables.

If the scaled values are too large for the target data type, they overflow. Rescaling ensures that the values stay within the representable limits.
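As a rough illustration of where rescaling happens: products of 8-bit values are typically accumulated in a 32-bit integer, and that accumulator is then rescaled and saturated back to 8 bits. The sketch below assumes the real-valued rescale factor has been pre-encoded as `multiplier / 2^shift`. The names `DotInt8` and `RequantizeToInt8` are hypothetical, and this is a simplified version of the idea rather than the library's actual fixed-point routine.

```cpp
#include <algorithm>
#include <cstdint>

// Accumulate int8 products in int32 so that the sum does not overflow
// for realistic vector lengths.
int32_t DotInt8(const int8_t* a, const int8_t* b, int n) {
  int32_t acc = 0;
  for (int i = 0; i < n; ++i) {
    acc += static_cast<int32_t>(a[i]) * static_cast<int32_t>(b[i]);
  }
  return acc;
}

// Hypothetical helper: rescale the accumulator by multiplier / 2^shift
// (assumes 0 <= shift < 32) and saturate the result to the int8 range.
int8_t RequantizeToInt8(int32_t acc, int32_t multiplier, int shift) {
  // Widen to 64 bits so the product itself cannot overflow.
  const int64_t product = static_cast<int64_t>(acc) * multiplier;
  // Adding half the divisor makes the shift round to nearest
  // (ties go toward +infinity).
  const int64_t rounding = shift > 0 ? (int64_t{1} << (shift - 1)) : 0;
  const int64_t scaled = (product + rounding) >> shift;
  return static_cast<int8_t>(
      std::min<int64_t>(127, std::max<int64_t>(-128, scaled)));
}
```

The saturation at the end is what keeps an out-of-range result from wrapping around when it is stored in the narrower type.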

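The floating-point input activations are represented in the range -1 to 127/128. With a scale of 1/128 and a zero point of 128, a uint8 code q stands for the real value (q - 128)/128, so code 0 is -1 and code 255 is 127/128. Thank you.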
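The interval [-1, 127/128] can be checked with a few lines of code. This sketch assumes a scale of 1/128 and a zero point of 128 for the uint8 activations; the function name `DequantizeUint8Activation` is made up for illustration.

```cpp
#include <cstdint>

// Assumed parameters: scale = 1/128, zero point = 128.
// Returns the real value represented by the uint8 code q.
float DequantizeUint8Activation(uint8_t q) {
  return static_cast<float>(static_cast<int>(q) - 128) / 128.0f;
}
```

Under these assumptions the smallest code, 0, maps to -1, the code 128 maps to 0, and the largest code, 255, maps to 127/128. The upper bound falls just short of 1 because 256 codes cover the range in steps of 1/128.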

Md_Sulaiman · July 16, 2024, 10:04pm

Example with LSTM Cell

In the context of an LSTM cell, the main operations (matrix multiplications, element-wise additions, and nonlinear functions) are all quantized. For the nonlinear functions, a lookup table can be precomputed, for example for the sigmoid:

#include <cmath>
#include <cstdint>
#include <vector>

// Precompute the sigmoid values for all 256 possible 8-bit codes.
// Note: the table stores floats for readability, so it is not yet integer-only.
std::vector<float> sigmoid_table(256);
for (int i = 0; i < 256; ++i) {
  // Map the uint8 code to [-1, 127/128] (zero point 128, scale 1/128)
  float x = (i - 128) / 128.0f;
  sigmoid_table[i] = 1.0f / (1.0f + std::exp(-x));
}

// During inference, use the quantized value to index into the lookup table
uint8_t quantized_value = ...; // Some quantized value
float sigmoid_result = sigmoid_table[quantized_value];
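The table above returns a float, so an integer-only kernel would need the outputs quantized as well. A minimal sketch of that variant follows. It assumes the input code q represents x = (q - 128) * input_scale and the output code represents y = code / 256. Both conventions and the function names are illustrative choices, not taken from the TensorFlow Lite source.

```cpp
#include <array>
#include <cmath>
#include <cstdint>

// Built once, offline or at initialization; floating point is only used here.
std::array<uint8_t, 256> BuildSigmoidTableU8(float input_scale) {
  std::array<uint8_t, 256> table{};
  for (int q = 0; q < 256; ++q) {
    const float x = static_cast<float>(q - 128) * input_scale;
    const float y = 1.0f / (1.0f + std::exp(-x));
    // Quantize the output with an assumed scale of 1/256, saturating at 255.
    const long code = std::lround(y * 256.0f);
    table[q] = static_cast<uint8_t>(code > 255 ? 255 : code);
  }
  return table;
}

// At inference time the sigmoid is a single indexed load on integers.
inline uint8_t SigmoidU8(const std::array<uint8_t, 256>& table, uint8_t q) {
  return table[q];
}
```

With an input scale of 1/16, for example, the table covers inputs from -8 to just under 8, where the sigmoid is already very close to 0 and 1. The same approach would work for tanh with a signed output type.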
