Custom quantization aware training with lround during int8 multiplications

Stshalson · October 26, 2022, 1:57am

Hello,
I have a simple fully connected model with the following architecture:
Input(10) → Dense(10) → Dense(10) → Dense(10) → Dense(10) → Dense(1)
I performed int8 quantization-aware training (QAT) on it and ran inference with TFLite.

I want to perform int8 QAT again, but with some changes to how inference is computed.

For example, the current 8-bit inference looks like this:

And I want to change it to this:

If I run inference in NumPy the way the second image shows, accuracy drops, so I want to do QAT that takes all of those changes into account. Is that possible?
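
As a rough illustration of the kind of computation in question (not the exact scheme from the images), a single int8 dense layer that applies lround-style rounding after the multiply-accumulate might look like the sketch below. The helper names `lround`, `saturate_int8`, and `dense_int8`, the float `multiplier` (taken here to be input scale × weight scale / output scale), and the zero-point handling are all assumptions made for illustration. It is written in plain Python rather than NumPy so it runs without dependencies.

```python
import math


def lround(x: float) -> int:
    """Round half away from zero, like C's lround.

    Python's built-in round() rounds halves to the nearest even
    integer, so it would give different results on exact ties.
    """
    return int(math.copysign(math.floor(abs(x) + 0.5), x))


def saturate_int8(v: int) -> int:
    """Clamp an integer to the int8 range [-128, 127]."""
    return max(-128, min(127, v))


def dense_int8(x_q, w_q, bias_q, x_zp, out_zp, multiplier):
    """Hypothetical int8 dense layer.

    x_q        : list of int8 inputs
    w_q        : list of weight rows (one row of int8 weights per output unit)
    bias_q     : list of integer biases, one per output unit
    x_zp       : input zero point
    out_zp     : output zero point
    multiplier : assumed float rescale factor for the accumulator
    """
    out = []
    for row, b in zip(w_q, bias_q):
        # Integer multiply-accumulate with the input zero point removed.
        acc = b + sum((xi - x_zp) * wi for xi, wi in zip(x_q, row))
        # Rescale, round with lround, shift by the output zero point, clamp.
        out.append(saturate_int8(lround(acc * multiplier) + out_zp))
    return out
```

The point of the sketch is only that the rounding rule is part of the inference arithmetic: on exact ties, `lround` and round-half-to-even give different int8 outputs, and those differences can accumulate across layers.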

Thank you
