The TensorFlow documentation says that during inference the weights and activations are int8 and the biases are int32. The result of the convolution is added to the bias, and the sum is then converted from int32 back to int8. This conversion is not clear to me. I can think of several options: use only the most significant or the least significant bits of the int32 value, or rescale the value before casting to int8. So how is this conversion actually done?
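To make the question concrete, here is a minimal C sketch of the rescaling option, assuming a per-layer floating-point multiplier derived from the input, weight, and output scales (the names `requantize`, `real_multiplier`, and `output_zero_point` are mine, not taken from TFLite, and the real runtime would use a fixed-point int32 multiplier plus a shift rather than a double):

```c
#include <stdint.h>
#include <math.h>

/* Hypothetical helper: requantize one int32 accumulator to int8.
 * real_multiplier would be input_scale * weight_scale / output_scale,
 * precomputed once per layer (or per output channel). */
static int8_t requantize(int32_t acc, double real_multiplier,
                         int32_t output_zero_point)
{
    /* rescale with round-to-nearest instead of truncating */
    int32_t scaled = (int32_t)lround((double)acc * real_multiplier);

    /* move the value into the output's quantized domain */
    scaled += output_zero_point;

    /* saturate to the int8 range instead of keeping only the low bits */
    if (scaled < -128) scaled = -128;
    if (scaled >  127) scaled =  127;
    return (int8_t)scaled;
}
```

Is this roughly the scheme, or does TFLite do something else entirely?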
I am developing an FPGA accelerator for CNNs as my graduation project. I used quantization-aware training to obtain 8-bit weights, extracted the quantized weights from the TFLite interpreter, and am validating them in C before implementing the design in VHDL. However, the results are not correct, and I think I am missing something when converting the output of the convolution layer from int32 to int8. This is what I currently do:
```c
char o = (char)(conv_o + l.bias[m]); // add bias, then truncating cast int32 -> int8
o = (o < 0) ? 0 : o;                 // ReLU
```
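In case it helps to show what I am aiming for, this is how I imagine the corrected step would look, reusing the hypothetical `requantize()` helper sketched above; `l.real_multiplier` and `l.out_zero_point` are assumed per-layer fields that my current struct does not have:

```c
/* bias add stays in int32; rescale instead of truncating */
int32_t acc = conv_o + l.bias[m];
int8_t  o   = requantize(acc, l.real_multiplier, l.out_zero_point);

/* ReLU in the quantized domain: clamp to the quantized value of
 * real 0, i.e. the output zero point, not to literal 0 */
if (o < l.out_zero_point) o = (int8_t)l.out_zero_point;
```

Is the clamping to the output zero point (rather than to 0) the right way to apply ReLU after quantization?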
I hope you can help me.
Thank you!
Best regards.