When I first ran the object detection example app with the provided model (detect.tflite), it had almost 90 classes and the inference time was about 30 ms…
but I trained on a small number of images, around 100~200, and I only have 3 classes.
I used MobileNet V2 640x640 and didn't fine-tune it.
What would make the inference time smaller?
I quantized it with float, so the model size went down from 11 MB to 4 MB…
but the inference time stayed the same.
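The conversion itself was roughly along these lines (a simplified sketch of float16 post-training quantization; the paths are placeholders):

```python
import tensorflow as tf

# Placeholder path to the SavedModel exported after training.
converter = tf.lite.TFLiteConverter.from_saved_model("exported_model/saved_model")

# Float16 post-training quantization: weights are stored as float16,
# which mainly shrinks the file size.
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.target_spec.supported_types = [tf.float16]

tflite_model = converter.convert()
with open("model_fp16.tflite", "wb") as f:
    f.write(tflite_model)
```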
This is hard to answer with only this information.
Are you running the model on a phone? Is it using the GPU? Are you using multiple threads?
Usually, what I did in the past to find out why a model was slow in my app was to run the benchmark tool: Performance measurement | TensorFlow Lite
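If you just want a rough number on a desktop before profiling on the device, a quick timing loop with the Python interpreter also works (a sketch with placeholder paths and values, not a replacement for the benchmark tool); varying `num_threads` here is an easy way to see how much multi-threading helps:

```python
import time
import numpy as np
import tensorflow as tf

# Placeholder model path; num_threads is worth varying to see its effect.
interpreter = tf.lite.Interpreter(model_path="model_fp16.tflite", num_threads=4)
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]

# Random input matching the model's expected shape and dtype.
dummy = np.random.random_sample(tuple(inp["shape"])).astype(inp["dtype"])

# Warm-up runs so one-time setup cost is not counted.
for _ in range(5):
    interpreter.set_tensor(inp["index"], dummy)
    interpreter.invoke()

runs = 50
start = time.perf_counter()
for _ in range(runs):
    interpreter.set_tensor(inp["index"], dummy)
    interpreter.invoke()
print(f"avg inference: {(time.perf_counter() - start) / runs * 1000:.1f} ms")
```

Numbers from a desktop CPU won't match the phone, but they are useful for comparing two versions of the same model.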
One thing that could help is decreasing the input image size; 640x640 is quite big, and object detection is a complex task by itself.
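If you trained with the TF2 Object Detection API, the input resolution is set by the `image_resizer` block in `pipeline.config`, so going down to, say, 320x320 means retraining and re-exporting with something like this (a sketch; the exact fields depend on your model):

```
model {
  ssd {
    image_resizer {
      fixed_shape_resizer {
        height: 320
        width: 320
      }
    }
    # ... rest of the model config unchanged
  }
}
```

In practice it can be simpler to start from the 320x320 variant of the same architecture in the TF2 Detection Model Zoo and fine-tune that instead.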