Hi,
When I use CMSIS-NN with a float32 model, I do not see any performance improvement.
With the quantized model, however, the speedup is around 4x.
I would like to understand why the float32 model does not get faster.
Hi @Mohanish_Nehete,
The CMSIS-NN library follows the int8 and int16 quantization specification of TensorFlow Lite for Microcontrollers, and its optimized kernels for Arm Cortex-M processors target those integer types. That is why the int8 quantized model shows a significant performance improvement over the float32 model. When a float32 (unquantized) model is deployed on this hardware, its ops are not covered by the optimized CMSIS-NN kernels and run on the default reference implementations instead, so you see no speedup.
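To make the int8 point more concrete, here is a minimal Python sketch of the affine quantization scheme used by the TensorFlow Lite int8 specification, real_value = scale * (int8_value - zero_point). It is only an illustration, not CMSIS-NN code: the `scale`, `zero_point`, and weight values are made up, and the helper names are hypothetical. It shows that four int8 values fit in the same 32 bits as one float32, which is what lets integer SIMD-style kernels handle several values per instruction.

```python
import struct

def quantize(x, scale, zero_point):
    """Map a real value to int8 using the affine scheme, clamped to [-128, 127]."""
    q = round(x / scale) + zero_point
    return max(-128, min(127, q))

def dequantize(q, scale, zero_point):
    """Recover the approximate real value: scale * (q - zero_point)."""
    return scale * (q - zero_point)

# Hypothetical quantization parameters and weights, chosen for illustration only.
scale, zero_point = 0.0625, 0
weights = [0.5, -1.0, 0.25, 1.5]

q_weights = [quantize(w, scale, zero_point) for w in weights]
restored = [dequantize(q, scale, zero_point) for q in q_weights]

# Four int8 values occupy 4 bytes; the same four values as float32 occupy 16 bytes.
packed_int8 = struct.pack("<4b", *q_weights)
packed_f32 = struct.pack("<4f", *weights)

print("int8 weights:", q_weights)
print("restored:", restored)
print("bytes as int8:", len(packed_int8), "bytes as float32:", len(packed_f32))
```

The 4x smaller representation is only part of the story; the actual speedup you measure depends on the core (for example, whether it has the DSP or MVE extensions) and on which ops your model uses.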
Thank You