Hey,
First, the post-training quantization works fine! Now I want to try quantization aware training to convert the model to int8, so I followed the basic quantization aware training TensorFlow tutorial. Basically, I have the following code:
import tensorflow as tf
from tensorflow import keras
import tensorflow_model_optimization as tfmot

# Plain float Keras model, as in the tutorial
model = keras.Sequential([
    keras.layers.InputLayer(input_shape=(28, 28)),
    keras.layers.Reshape(target_shape=(28, 28, 1)),
    keras.layers.Conv2D(filters=12, kernel_size=(3, 3), activation='relu'),
    keras.layers.MaxPooling2D(pool_size=(2, 2)),
    keras.layers.Flatten(),
    keras.layers.Dense(10)
])

# Wrap the model with fake-quantization nodes for quantization aware training
quantize_model = tfmot.quantization.keras.quantize_model
q_aware_model = quantize_model(model)

q_aware_model.compile(optimizer='adam',
                      loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
                      metrics=['accuracy'])

# Convert the quantization-aware model to TFLite
converter = tf.lite.TFLiteConverter.from_keras_model(q_aware_model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
quantized_tflite_model = converter.convert()

interpreter = tf.lite.Interpreter(model_content=quantized_tflite_model)
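Concretely, I am checking the converted model roughly like this (the values in the comments are what I see on my end; the prints are simplified):

# Inspect the converted model's input/output tensors
input_details = interpreter.get_input_details()[0]
output_details = interpreter.get_output_details()[0]
print(input_details['dtype'])         # <class 'numpy.float32'>
print(input_details['quantization'])  # (0.0, 0) -> no quantization parameters
print(output_details['dtype'])        # <class 'numpy.float32'>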
Looking at, e.g., interpreter.get_input_details()[0]['dtype'] gives me np.float32, and the 'quantization' entry is empty, whereas full-integer post-training quantization correctly shows dtype np.uint8 with the corresponding parameters. So the input and output tensors, and apparently the entire model, are still in float32. What is the problem?
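For comparison, my working full-integer post-training quantization conversion looks roughly like this (representative_images is just a placeholder for the calibration data I use, and the exact generator may differ):

# Full-integer post-training quantization of the same float model, for comparison
def representative_dataset():
    # representative_images is a placeholder for my calibration samples
    for image in representative_images[:100]:
        yield [image.reshape(1, 28, 28).astype('float32')]

ptq_converter = tf.lite.TFLiteConverter.from_keras_model(model)
ptq_converter.optimizations = [tf.lite.Optimize.DEFAULT]
ptq_converter.representative_dataset = representative_dataset
ptq_converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
ptq_converter.inference_input_type = tf.uint8
ptq_converter.inference_output_type = tf.uint8
ptq_tflite_model = ptq_converter.convert()

ptq_interpreter = tf.lite.Interpreter(model_content=ptq_tflite_model)
print(ptq_interpreter.get_input_details()[0]['dtype'])  # <class 'numpy.uint8'>, with quantization parameters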