.tflite model: improve latency of object retrieval from a Google Cloud Storage bucket for inference

I'm following the Image Classification example code from MediaPipe.

My model runs successfully and is smaller than EfficientNet. However, the inference latency I get is 5 seconds.

In contrast, the MediaPipe .tflite object at:

[https://storage.googleapis.com/mediapipe-models/image_classifier/efficientnet_lite0/float32/1/effici...](https://storage.googleapis.com/mediapipe-models/image_classifier/efficientnet_lite0/float32/1/efficientnet_lite0.tflite)

… loads instantly, with an inference time of milliseconds.

I tried a regional bucket with fine-grained permissions, but that didn't solve the problem. I am also using the public URL https://storage.googleapis.com/xxxxxxxx/model.tflite, which didn't help either. The bucket's CORS file is configured as follows:

[
  {
    "origin": ["https://your-example-website.appspot.com"],
    "method": ["GET"],
    "responseHeader": ["Content-Type"],
    "maxAgeSeconds": 1
  }
]
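
To check whether the 5 seconds is spent downloading the object or running inference, the retrieval could be timed on its own. The following is only a minimal sketch in Python using the standard library. The helper names `time_call`, `download`, and `report` are made up for illustration, and it measures from wherever the script runs rather than from inside the browser, so the numbers are a rough comparison at best.

```python
import time
import urllib.request


def time_call(fn):
    """Run fn() once and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start


def download(url):
    """Fetch the whole object into memory and return its bytes."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def report(url):
    """Print the object size and how long the download took."""
    data, seconds = time_call(lambda: download(url))
    print(f"{url}: {len(data)} bytes in {seconds:.3f} s")
```

Calling `report(...)` once with the MediaPipe URL above and once with the bucket's public URL would show whether the two downloads actually differ. If they take about the same time, the delay is more likely in model loading or inference than in object retrieval.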

Do you have any ideas on how to bring object retrieval latency down to milliseconds?