Hi,
I am using TF v2.14 and tfmot v0.7.5
I am trying to apply QAT on my trained model and then convert the QAT model to TFLite. So for this, I wrote the following code:
import tensorflow as tf

converter = tf.lite.TFLiteConverter.from_keras_model(qat_model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8
quantized_tflite_model = converter.convert()
with open(self.output_model_path, 'wb') as f:
    f.write(quantized_tflite_model)
But I got the following error:
ValueError: The inference_input_type and inference_output_type must be tf.float32.
This is supposed to be a full-integer TFLite model.
Any explanation as to why I am getting this error?
Thanks,
The root cause of the error you're encountering is that TensorFlow Lite's converter does not support setting both inference_input_type and inference_output_type to tf.int8 for full-integer quantization directly through the TFLiteConverter API when converting a Quantization Aware Training (QAT) model. By default, the converter expects the input and output types to be floating point (tf.float32) for compatibility with most use cases, with quantization and dequantization handled internally at the model boundaries.
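For reference, here is a minimal sketch of both conversion paths, assuming qat_model is your QAT-annotated Keras model; the generator name representative_data_gen and the input shape are placeholders you would replace with your own data. The second variant follows the standard full-integer quantization recipe (restricting the op set to TFLITE_BUILTINS_INT8 and providing a small calibration dataset), which is the mode in which int8 inference_input_type/inference_output_type are accepted; whether the representative dataset is strictly required for a QAT model can depend on your TF version.
import numpy as np
import tensorflow as tf

# Variant A: keep the default float32 interface. The interior of the model
# runs in int8, and quantize/dequantize ops are added at the boundaries.
converter = tf.lite.TFLiteConverter.from_keras_model(qat_model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
float_io_tflite_model = converter.convert()

# Variant B: request an int8 interface as well (full-integer quantization).
def representative_data_gen():
    for _ in range(100):
        # Placeholder shape; use real calibration samples matching your model's input.
        yield [np.random.rand(1, 224, 224, 3).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_keras_model(qat_model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_data_gen
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8
int8_io_tflite_model = converter.convert()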
Hi @MLEnthusiastic, I have tried converting a QAT model to TFLite with the inference input and output types set to int8 and did not face any error. Could you please follow a similar approach for setting the input and output inference types to int8, as shown in this gist? If you still face the same error, please provide standalone code to reproduce the issue. Thank you.
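Once the conversion succeeds, you can quickly confirm that the interface is really int8 by inspecting the converted flatbuffer with the TFLite interpreter (a small check, assuming quantized_tflite_model holds the bytes returned by converter.convert()):
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_content=quantized_tflite_model)
interpreter.allocate_tensors()
# Both should report <class 'numpy.int8'> for a full-integer model with an int8 interface.
print("input dtype: ", interpreter.get_input_details()[0]["dtype"])
print("output dtype:", interpreter.get_output_details()[0]["dtype"])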