I am doing QAT followed by full integer conversion in TFLite. I noticed that, for some reason, TFLite requires 0.0 to always lie within the tensor/weight’s [min, max] range. This is noted in the comments in quant_ops.py and tf.quantization.
I wonder why TFLite forces min <= 0 <= max? I have encountered cases where the weights are all positive (or all negative) and observed a significant loss in quantization accuracy. Is there any way to work around it?
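To make the concern concrete, here is a minimal sketch of how widening the range to include 0.0 changes the quantization step size. It assumes a plain 8-bit affine quantizer with 255 steps; the function `affine_scale` and the example range [10.0, 11.0] are hypothetical illustrations, not TFLite code.

```python
def affine_scale(w_min, w_max, num_bits=8, force_zero=True):
    """Step size of an affine quantizer over [w_min, w_max] (illustrative only)."""
    if force_zero:
        # Extend the range so that 0.0 lies inside it (min <= 0 <= max).
        w_min = min(w_min, 0.0)
        w_max = max(w_max, 0.0)
    levels = 2 ** num_bits - 1  # 255 steps for 8 bits
    return (w_max - w_min) / levels


# Hypothetical all-positive weight tensor with values in [10.0, 11.0].
tight = affine_scale(10.0, 11.0, force_zero=False)
widened = affine_scale(10.0, 11.0, force_zero=True)
ratio = widened / tight

print(f"tight: {tight:.5f}, widened: {widened:.5f}, ratio: {ratio:.1f}x")
```

In this made-up case the forced range is [0.0, 11.0] instead of [10.0, 11.0], so each integer step covers about 11 times more of the real axis. A range that already straddles zero is unaffected.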
Hi @Wenjie_Lu, Generally, when converting a model to TFLite with integer quantization, for example int8 quantization, the range is expected to be from -128 to +127, because that makes it convenient to map signed floating-point weights to signed integer values. Negative floating-point values are converted to negative integers and positive floating-point values to positive integers, so the weight range stays the same after applying integer quantization. So there won’t be much reduction in accuracy.
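A rough sketch of the sign-preserving mapping described above, assuming symmetric int8 quantization with a zero-point of 0 and a clamped range of [-127, 127]. The helper names and the sample weights are hypothetical, and this is not the converter's actual code.

```python
def quantize_symmetric(values, num_bits=8):
    """Map floats to signed integers with zero-point 0 (illustrative only)."""
    qmax = 2 ** (num_bits - 1) - 1  # 127 for int8
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest integer step and clamp to [-qmax, qmax].
    q = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return q, scale


def dequantize(q, scale):
    """Recover approximate float values from the integer codes."""
    return [qi * scale for qi in q]


weights = [-0.5, -0.1, 0.0, 0.25, 1.0]  # hypothetical signed weights
q, scale = quantize_symmetric(weights)
restored = dequantize(q, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))

print(q, scale, max_error)
```

With this scheme 0.0 maps exactly to integer 0, and the round-trip error of each value stays within half a step (`scale / 2`).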
Even if your model has all-positive weights, it won’t change the accuracy much. If you do see a significant loss in quantization accuracy, could you share the difference in accuracy? Thank you.