My seq2seq model (with dynamic input sizes) runs fine in the TFLite benchmark tool and with the Python TFLite interpreter, but on mobile with the C API it fails with the following error once the input size becomes too large:
tflite/tensorflow/tensorflow/lite/kernels/batch_matmul.cc:466 scaling_factor_size >= num_batches_to_quantize was not true.
The model was converted from TensorFlow using dynamic range quantization (the default). The issue does not appear with no quantization or with strict int8 quantization.
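To pin down what "too large" means, it may help to sweep the input length and record the first size that trips the check. The sketch below is only a harness: `first_failing_length` and `fake_run` are hypothetical names, and the threshold of 64 in `fake_run` is made up purely so the example runs on its own. It is not the measured limit of my model.

```python
from typing import Callable, Iterable, Optional


def first_failing_length(run_once: Callable[[int], None],
                         lengths: Iterable[int]) -> Optional[int]:
    """Return the first length for which run_once raises, or None if all pass."""
    for n in lengths:
        try:
            run_once(n)
        except Exception as exc:
            # Report the failing size together with the runtime's message.
            print(f"length {n} failed: {exc}")
            return n
    return None


def fake_run(n: int) -> None:
    # Stand-in for a real inference call. The limit of 64 is an assumption
    # for illustration only, not a value taken from TFLite.
    if n > 64:
        raise RuntimeError(
            "scaling_factor_size >= num_batches_to_quantize was not true.")


if __name__ == "__main__":
    print(first_failing_length(fake_run, range(8, 129, 8)))
```

In a real run, `run_once` would wrap whichever runtime is under test. With the Python interpreter that means `resize_tensor_input`, `allocate_tensors`, `set_tensor` and `invoke`. For the C API path it would more likely launch a small test binary that calls `TfLiteInterpreterResizeInputTensor`, `TfLiteInterpreterAllocateTensors` and `TfLiteInterpreterInvoke`, and raise if that binary reports an error.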