TFLite Micro Add Layer - INT8 models

The general approach in TFLite Micro (quantized) for layers such as Convolution and Multiplication is that the operation is performed first, and only the resulting tensor is scaled/saturated (SaturatingRoundingDoublingHighMul followed by RoundingDivideByPOT).
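(Roughly speaking, those two primitives together implement one fixed-point rescale: quantized_multiplier is a Q31 encoding of the fractional part of the real rescaling factor, and the remaining power of two becomes a rounding shift. The sketch below follows the gemmlowp fixed-point reference rather than being copied from the TFLite Micro sources.)

    #include <cstdint>
    #include <limits>

    // Returns the high 32 bits of (a * b * 2), rounded, saturating on the one overflow
    // case; in effect this multiplies a by a Q31 fixed-point value in [-1, 1).
    inline std::int32_t SaturatingRoundingDoublingHighMul(std::int32_t a, std::int32_t b) {
      const bool overflow = (a == b) && (a == std::numeric_limits<std::int32_t>::min());
      const std::int64_t ab_64 = static_cast<std::int64_t>(a) * static_cast<std::int64_t>(b);
      const std::int32_t nudge = ab_64 >= 0 ? (1 << 30) : (1 - (1 << 30));
      const std::int32_t ab_x2_high32 =
          static_cast<std::int32_t>((ab_64 + nudge) / (INT64_C(1) << 31));
      return overflow ? std::numeric_limits<std::int32_t>::max() : ab_x2_high32;
    }

    // Divides by 2^exponent, rounding to nearest with ties away from zero.
    inline std::int32_t RoundingDivideByPOT(std::int32_t x, int exponent) {
      const std::int32_t mask = static_cast<std::int32_t>((INT64_C(1) << exponent) - 1);
      const std::int32_t remainder = x & mask;
      const std::int32_t threshold = (mask >> 1) + (x < 0 ? 1 : 0);
      return (x >> exponent) + (remainder > threshold ? 1 : 0);
    }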

However, the Add operation seems to take a different approach:
the input tensors are first scaled/saturated, and the resulting tensor after the addition is scaled/saturated again.
Is there a mathematical reason why the Add layer needs this preprocessing of its inputs?
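To make the contrast concrete, below is a minimal sketch of both element-wise flows. The names (Requantize, in1_multiplier, left_shift, and so on) are illustrative placeholders rather than the exact TFLite Micro symbols, and left_shift = 20 is only the value typically used for int8; the requantize step reuses the two primitives sketched above.

    #include <algorithm>
    #include <cstdint>

    // One requantize step: rescale a 32-bit value by (multiplier * 2^-right_shift),
    // built from SaturatingRoundingDoublingHighMul / RoundingDivideByPOT above.
    inline std::int32_t Requantize(std::int32_t x, std::int32_t multiplier, int right_shift) {
      return RoundingDivideByPOT(SaturatingRoundingDoublingHighMul(x, multiplier), right_shift);
    }

    inline std::int8_t ClampToInt8(std::int32_t v) {
      return static_cast<std::int8_t>(std::max<std::int32_t>(-128, std::min<std::int32_t>(127, v)));
    }

    // MUL-style flow: form the raw integer product first; a single output multiplier
    // (encoding s1*s2/s3) is applied to the result only.
    std::int8_t MulElement(std::int8_t q1, std::int8_t q2,
                           std::int32_t in1_offset, std::int32_t in2_offset,
                           std::int32_t out_offset,
                           std::int32_t out_multiplier, int out_shift) {
      const std::int32_t raw = (q1 + in1_offset) * (q2 + in2_offset);
      return ClampToInt8(Requantize(raw, out_multiplier, out_shift) + out_offset);
    }

    // ADD-style flow: each input is first rescaled onto a shared intermediate scale
    // (after a left shift for precision headroom); only then are the integers summed,
    // and the sum is requantized once more to the output scale.
    std::int8_t AddElement(std::int8_t q1, std::int8_t q2,
                           std::int32_t in1_offset, std::int32_t in2_offset,
                           std::int32_t in1_multiplier, int in1_shift,
                           std::int32_t in2_multiplier, int in2_shift,
                           std::int32_t out_offset,
                           std::int32_t out_multiplier, int out_shift,
                           int left_shift /* typically 20 for int8 */) {
      const std::int32_t shifted1 = (q1 + in1_offset) * (1 << left_shift);
      const std::int32_t shifted2 = (q2 + in2_offset) * (1 << left_shift);
      const std::int32_t scaled1 = Requantize(shifted1, in1_multiplier, in1_shift);
      const std::int32_t scaled2 = Requantize(shifted2, in2_multiplier, in2_shift);
      const std::int32_t raw_sum = scaled1 + scaled2;
      return ClampToInt8(Requantize(raw_sum, out_multiplier, out_shift) + out_offset);
    }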

The same computational approach is also followed in CMSIS-NN for the Add layer
(the arm_nn_requantize function from arm_nnsupportfunctions.h).

Curious to know the mathematical reasoning behind this.

Hi @Sundari_Swathy_Meena,

Sorry for the delayed response. The main reason to rescale the input tensors before the add operation is to avoid numerical overflow and underflow in TFLite Micro and CMSIS-NN, especially when working with low-precision fixed-point arithmetic, and to bring both inputs onto a common scale so that their integer values can actually be added. In the case of multiplication this per-input step is not needed, because the combined scale is already folded into a single quantized_multiplier applied to the result:

    RoundingDivideByPOT(
        SaturatingRoundingDoublingHighMul(x * (1 << left_shift), quantized_multiplier),
        right_shift);
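Spelling out the algebra (standard uniform-quantization notation with scales s_i, zero points z_i and integer values q_i; this is my own summary rather than a quote from the TFLite documentation):

    % Add: the inputs carry different scales s1 and s2, so each integer value
    % needs its own rescaling before the sum is meaningful:
    \[
    s_3 (q_3 - z_3) = s_1 (q_1 - z_1) + s_2 (q_2 - z_2)
    \quad\Longrightarrow\quad
    q_3 = z_3 + \frac{s_1}{s_3}(q_1 - z_1) + \frac{s_2}{s_3}(q_2 - z_2)
    \]

    % Mul: the scales combine into one factor, so the raw integer product can be
    % formed first and a single multiplier applied to the result:
    \[
    s_3 (q_3 - z_3) = s_1 (q_1 - z_1)\, s_2 (q_2 - z_2)
    \quad\Longrightarrow\quad
    q_3 = z_3 + \frac{s_1 s_2}{s_3}(q_1 - z_1)(q_2 - z_2)
    \]

Unless s_1 = s_2, no single multiplier can be factored out of the sum, so each input gets its own quantized_multiplier and shift before the addition and the sum gets one more rescale to the output scale; the initial x * (1 << left_shift) just gives those per-input rescalings enough headroom that precision is not lost to rounding, which ties back to the overflow/precision point above.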

Thank You