Dynamic quantization

Hi all,

Judging by the output below, is dynamic quantization not supported in TFLite-Micro?

MCU log:
Input type: FLOAT32 with filter type : INT8 not supported.
Node FULLY_CONNECTED (number 6f) failed to prepare with status 1

Is it sensible to try to implement it in TFLite-Micro?

My MCU has hardware floating point, and models without any optimizations work just fine. Is there any reason to apply dynamic quantization when a floating-point unit is available?

PS. I know that there are three questions above, but I am actually trying to understand one thing: should I pursue dynamic quantization?

Hi @Sreten_Jovicic,

I apologize for the delay in my response, and thank you for bringing this issue to our attention. The error "Input type: FLOAT32 with filter type : INT8 not supported" confirms that TensorFlow Lite Micro does not support dynamic range quantization. This is an architectural limitation, not a configuration issue.

This is because the hardware targets that run TFLite-Micro rarely have hardware floating-point support, so running a floating-point model in production is often prohibitively expensive. For floating point, we currently only support FLOAT32 with reference kernels. We generally recommend full integer quantization, if it is applicable for your use case.

TensorFlow Lite Micro only supports specific quantization combinations:

  • Full integer quantization (INT8/UINT8 for both weights and activations)
  • Float32 operations with reference kernels
  • No mixed-precision operations (INT8 weights with FLOAT32 activations; see the sketch after this list)
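
For context, dynamic range quantization is what the TFLite converter produces when optimizations are enabled without a representative dataset. Here is a minimal sketch of that conversion path; the tiny Keras model is a hypothetical placeholder, not your actual network:

```python
import tensorflow as tf

# Placeholder model standing in for the real network (hypothetical).
model = tf.keras.Sequential([
    tf.keras.Input(shape=(10,)),
    tf.keras.layers.Dense(4, activation="relu"),
    tf.keras.layers.Dense(1),
])

converter = tf.lite.TFLiteConverter.from_keras_model(model)
# Enabling optimizations WITHOUT a representative dataset yields dynamic
# range quantization: INT8 weights with FLOAT32 activations, the exact
# mixed-precision combination that TFLite Micro rejects at prepare time.
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()
```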

Dynamic range quantization requires runtime conversion between INT8 weights and FLOAT32 activations, which TFLite-Micro deliberately excludes to maintain a minimal memory footprint and deterministic execution paths. The recommended alternatives are therefore either full integer quantization (for maximum memory efficiency) or optimized FLOAT32 inference (for maximum performance with FPU hardware). If I have missed something here, please let me know.
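
If full integer quantization fits your use case, the conversion would look roughly like the sketch below. The placeholder model and the random representative dataset are assumptions for illustration only; you would substitute your trained model and samples drawn from your real input distribution:

```python
import numpy as np
import tensorflow as tf

# Placeholder model standing in for the real network (hypothetical).
model = tf.keras.Sequential([
    tf.keras.Input(shape=(10,)),
    tf.keras.layers.Dense(4, activation="relu"),
    tf.keras.layers.Dense(1),
])

def representative_dataset():
    # Yield a few input samples so the converter can calibrate activation
    # ranges; random data here is only a stand-in for real samples.
    for _ in range(100):
        yield [np.random.rand(1, 10).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
# Restrict the model to the INT8 kernels that TFLite Micro supports, so
# the conversion fails loudly if any op cannot be fully quantized.
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8
tflite_model = converter.convert()
```

With this configuration every tensor in the model is quantized, so the FULLY_CONNECTED kernels see matching INT8 input and filter types instead of the FLOAT32/INT8 mix from your log.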

Thank you for your cooperation and patience.