CMSIS-NN for float32 model

Hi,

When using CMSIS-NN with a float32 model, I cannot see any performance improvement.

But when using the quantized model, the speedup is around 4x.

I would like to understand why this happens with the float32 model.

Hi @Mohanish_Nehete,

Basically, the CMSIS-NN library implements the int8 and int16 quantization specification of TensorFlow Lite for Microcontrollers. Its optimized kernels target Arm Cortex-M processors and only cover quantized ops, which is why an int8 quantized model shows a significant speedup. When a float32 model is deployed, none of its ops match the CMSIS-NN kernels, so the interpreter falls back to the generic reference kernels and you see no performance improvement.
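In case it helps, here is a minimal sketch of full int8 post-training quantization with the TFLite converter, so that every op can map to a CMSIS-NN kernel. The model path and input shape are placeholders for your own model, and the random representative dataset should be replaced with a few hundred real input samples:

```python
import numpy as np
import tensorflow as tf

# Load your trained float32 Keras model (path is a placeholder).
model = tf.keras.models.load_model("my_model.h5")

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]

# Representative dataset used to calibrate quantization ranges.
# Replace the random data with real samples; shape is an assumption.
def representative_dataset():
    for _ in range(100):
        yield [np.random.rand(1, 96, 96, 1).astype(np.float32)]

converter.representative_dataset = representative_dataset

# Force full int8 quantization; conversion fails if any op
# cannot be quantized, instead of silently keeping float32 ops.
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8

tflite_model = converter.convert()
with open("model_int8.tflite", "wb") as f:
    f.write(tflite_model)
```

With a fully int8 model like this, all ops should dispatch to the CMSIS-NN kernels on a Cortex-M target, which is where the ~4x speedup you observed comes from.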

Thank You