TFLite Batchnorm layer: Quantized models

After conversion, the Batchnorm layer appears as Mul and Add nodes. However, when the Mul and Add functionality is replicated using the gamma, beta, and input tensors, there is a discrepancy for quantized models. Is there any additional computation beyond these two operations that could be causing this issue?
The FP32 model works fine when all parameters are extracted and used as a Batchnorm layer, but the same is not true for INT8/UINT8 models.
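For reference, here is a minimal sketch of how the converter typically folds an inference-time BatchNorm into the constants of the Mul and Add nodes. The parameter values below are made up for illustration; the point is that the folded constants include the moving mean, moving variance, and epsilon, not just gamma and beta:

```python
import numpy as np

# Hypothetical per-channel BatchNorm parameters extracted from a model.
gamma = np.array([1.2, 0.8], dtype=np.float32)        # scale
beta = np.array([0.1, -0.3], dtype=np.float32)        # offset
moving_mean = np.array([0.5, -0.2], dtype=np.float32)
moving_var = np.array([0.9, 1.1], dtype=np.float32)
eps = 1e-3

# At inference, BatchNorm reduces to an affine transform, which the converter
# expresses as a Mul followed by an Add:
#   y = gamma * (x - mean) / sqrt(var + eps) = mul_const * x + add_const
mul_const = gamma / np.sqrt(moving_var + eps)
add_const = beta - moving_mean * mul_const

x = np.random.randn(4, 2).astype(np.float32)          # dummy activations
y = mul_const * x + add_const                          # matches the FP32 Mul/Add output
```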


Hi @Sundari_Swathy_Meena,

The difference you are observing between your FP32 and int8/uint8 models after converting batch normalization layers is due to the quantization process used in TFLite conversion. FP32 models keep full floating-point precision, whereas quantized models (int8/uint8) scale and shift the floating-point values to fit within the 8-bit range, which introduces quantization error. Please refer to the quantization documentation. However, if the difference between the models is large, please share reproducible code so we can inspect the possible reasons.
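To make the error source concrete, here is a minimal sketch of the affine quantize/dequantize round trip TFLite-style quantization performs (the scale and zero-point values are illustrative). The same rounding and clamping applied to the inputs and outputs of the Mul/Add nodes is what moves the int8/uint8 result away from the FP32 reference:

```python
import numpy as np

def quantize(x, scale, zero_point):
    # Affine quantization: q = round(x / scale) + zero_point, clamped to int8.
    q = np.round(x / scale) + zero_point
    return np.clip(q, -128, 127).astype(np.int8)

def dequantize(q, scale, zero_point):
    return (q.astype(np.float32) - zero_point) * scale

# Hypothetical tensor values and quantization parameters.
x = np.array([0.12, -0.47, 0.90], dtype=np.float32)
scale, zero_point = 0.0075, 3

x_q = quantize(x, scale, zero_point)
x_hat = dequantize(x_q, scale, zero_point)

# The round trip is not exact; each value can be off by up to scale / 2.
print(x - x_hat)
```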

Thank you