I tried to convert my PyTorch models to TensorFlow Lite via ONNX, but the inference time of the TensorFlow Lite model is about twice that of the TensorFlow and PyTorch versions. I run the TensorFlow Lite model in Google Colab, and this is my first time using TensorFlow Lite.
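For context, this is roughly the export path I followed to get from PyTorch to a TensorFlow SavedModel (a sketch from memory; the ResNet-18 stand-in, input shape, and opset version are assumptions, and the ONNX-to-TensorFlow step uses the onnx-tf package):

import torch
import torchvision
import onnx
from onnx_tf.backend import prepare

# Stand-in network; my real model differs, but the export steps are the same
model = torchvision.models.resnet18(weights=None)
model.eval()
dummy_input = torch.randn(1, 3, 224, 224)  # assumed input shape

# PyTorch -> ONNX
torch.onnx.export(model, dummy_input, "model.onnx", opset_version=13)

# ONNX -> TensorFlow SavedModel (via the onnx-tf package)
tf_rep = prepare(onnx.load("model.onnx"))
tf_rep.export_graph("model/")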
Here is my code to convert from TensorFlow to TensorFlow Lite:
import tensorflow as tf

# Convert the SavedModel to TensorFlow Lite with float16 quantization
converter = tf.lite.TFLiteConverter.from_saved_model("model/")
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.target_spec.supported_types = [tf.float16]
model_lite = converter.convert()

# Write the converted flatbuffer to disk
with open('model.tflite', 'wb') as f:
    f.write(model_lite)
I used Python's time module to measure the latency of each framework. I don't know why the Lite version is slower than the others; any suggestions would help me a lot.
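For reference, this is roughly how I timed the TFLite model in Colab (a minimal sketch; the dummy input and the number of runs are placeholders, not my exact benchmark):

import time
import numpy as np
import tensorflow as tf

# Load the converted model and allocate tensors
interpreter = tf.lite.Interpreter(model_path="model.tflite")
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Dummy input matching the model's expected input shape
x = np.random.rand(*input_details[0]['shape']).astype(np.float32)

# Average latency over repeated runs
runs = 100
start = time.perf_counter()
for _ in range(runs):
    interpreter.set_tensor(input_details[0]['index'], x)
    interpreter.invoke()
    _ = interpreter.get_tensor(output_details[0]['index'])
elapsed = (time.perf_counter() - start) / runs
print(f"Average TFLite latency: {elapsed * 1000:.2f} ms")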