Do very small weight values affect inference time on the i500 board?

Hi everyone.
I have an issue when working with a TFLite model. My original TensorFlow model has 2 layers with very small weights (in the range -1.3e-30 to 1.3e-30), and when I convert this model to TFLite, the quantization parameters for these layers come out like this: -1.3e-30 < scale * (q - 128) < 1.3e-30. The inference time of this TFLite model on the i500 board is abnormally large compared to other models. I think the problem is the very small values; maybe the i500 board has to handle underflow for them.

Could you give me any advice, please?
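For context, the per-tensor scale implied by that weight range can be computed directly. Here is a minimal sketch in plain NumPy (not the actual converter), assuming standard asymmetric uint8 quantization with zero point 128, matching the scale * (q - 128) form above. The resulting scale is representable in float32 but underflows to zero in float16, which is one possible explanation for a slow path if the accelerator evaluates anything in reduced precision (an assumption about the i500 internals):

```python
import numpy as np

# Weight range reported for the problematic layers.
w_min, w_max = -1.3e-30, 1.3e-30

# Standard asymmetric uint8 quantization: q in [0, 255], zero point 128.
scale = (w_max - w_min) / 255.0
print(scale)  # ~1.02e-32

# The scale is a normal float32 value, but it underflows to 0 in float16,
# so any hardware path using reduced precision would hit subnormals or
# flush-to-zero behavior for these layers.
print(np.float32(scale) > 0.0)   # True: representable in float32
print(np.float16(scale) == 0.0)  # True: flushed to zero in float16
```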

Hi @Hot_Ice,

As the weights of the model are very small, post-training quantization produces an extremely small scale for those layers. When inferencing on your hardware device (i500), this leads to underflow, which affects both accuracy and inference time. You can try Quantization Aware Training (QAT), where the weights are properly quantized during training.
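Besides QAT, a quick workaround worth trying is flushing the near-zero weights to exactly zero before conversion: values around 1e-30 contribute nothing to the layer output anyway, but they force the converter to pick a pathological scale. A hedged sketch in NumPy (the threshold value and how you access your layer weights are assumptions; adapt to your model):

```python
import numpy as np

def flush_tiny_weights(weights, threshold=1e-6):
    """Set weights with magnitude below `threshold` to exactly zero.

    Weights around 1e-30 are numerically irrelevant to the layer output,
    but they force the quantizer to choose an extremely small scale.
    """
    return np.where(np.abs(weights) < threshold, 0.0, weights)

# Example: a layer whose weights sit in the problematic range.
w = np.array([1.3e-30, -1.3e-30, 0.5, -0.2], dtype=np.float32)
print(flush_tiny_weights(w))  # [ 0.   0.   0.5 -0.2]
```

In a Keras model you could apply this per layer with `layer.get_weights()` / `layer.set_weights()` before running the TFLite converter.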

Thank You