Quantization of a CNN with 4 bit

claudio137 · February 12, 2025, 1:44pm

Hi, I need to quantize a small CNN. After training, I would like the weights and biases to be quantized with 4-bit precision. I'm using the TensorFlow Model Optimization toolkit, but the values I see at the end are always floating point, as with many other libraries. With TensorFlow Lite I get 8-bit precision for the weights, while the biases remain 32-bit.

Can you suggest a way to solve this problem? Any help is welcome.

Thank you so much for your attention.

Kiran_Sai_Ramineni · February 13, 2025, 6:54am

Hi @claudio137, at present 4-bit quantization is not available in TensorFlow. The lowest-precision quantization formats supported are int8 and float16. Thank you.
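If the goal is only to inspect what the trained weights would look like at 4-bit precision, one possible workaround is to quantize them by hand after training. The following is a minimal sketch in plain Python, not a TensorFlow feature: it assumes symmetric per-tensor quantization to signed integers in [-8, 7], and the helper names `quantize_int4` and `dequantize` as well as the sample weights are hypothetical. Biases are left out; they could be handled the same way or kept at higher precision.

```python
def quantize_int4(weights):
    """Symmetric per-tensor quantization of a flat list of floats.

    Returns the integer codes (each in the signed 4-bit range [-8, 7])
    and the scale needed to map them back to floats.
    """
    qmin, qmax = -8, 7
    max_abs = max(abs(w) for w in weights)
    # Map the largest magnitude to qmax; fall back to 1.0 for an all-zero tensor.
    scale = max_abs / qmax if max_abs > 0 else 1.0
    codes = [min(qmax, max(qmin, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Map 4-bit integer codes back to approximate float values."""
    return [c * scale for c in codes]


# Hypothetical weights, e.g. one flattened convolution kernel.
weights = [0.7, -0.3, 0.12, 0.0, -0.7]
codes, scale = quantize_int4(weights)
approx = dequantize(codes, scale)

print("codes:", codes)
print("scale:", scale)
print("dequantized:", approx)
```

The integer codes are what a 4-bit format would store; multiplying by `scale` shows the values the network would effectively use. Note that this only simulates 4-bit storage: the model still runs in floating point unless the target runtime supports 4-bit kernels.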
