Extreme latency on gemini-1.5-flash API

2cheeze4u · November 19, 2024, 7:00pm

Hey,
Starting 17th November, I’ve noticed extremely high latencies on my 1.5 Flash API calls. Response times have shot up from 14s to ~80s. Any pointers on why this is happening? Is the Pro model’s API also affected? I read that it is due to Google allocating resources to a new experimental model. If so, when will the latency go back to normal?

Santosh_R · November 20, 2024, 11:48am

Yes, I’ve noticed this too. Even simple text generations take a lot of time. Latency has increased almost 10x for me.

tull_wood · November 20, 2024, 12:55pm

I’m also having the same issue (free tier).
A few days ago, I began experiencing significant delays in Gemini-1.5-Flash API responses. Requests that previously took 2-3 seconds are now taking 60-90 seconds. These extended response times make my application unusable.

In contrast, similar requests on Google AI Studio continue to receive responses at their previous speed.

urvisism · January 6, 2025, 6:29am

I’ve also experienced higher latency with “gemini-1.5-flash-002” than with “gemini-1.5-flash-001”. The documentation says latency is reduced by 3x, but in practice it’s the opposite! Many times, “gemini-1.5-flash-002” doesn’t respond for 10 minutes and then fails with “500 Internal server error”, which is the worst thing anyone can expect in production.
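
For anyone hitting the long hangs followed by a 500, a client-side workaround (not a fix for the latency itself) is to time each call and retry transient failures with exponential backoff. Below is a minimal sketch under stated assumptions: `call_with_retries`, `TransientError`, and `send_request` are hypothetical names, not part of any Gemini SDK. `send_request` stands for whatever function wraps your actual API call; it should apply its own request timeout and raise `TransientError` on a 500 or a timeout.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical stand-in for a retryable failure (e.g. HTTP 500 or a timeout)."""


def call_with_retries(send_request, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call send_request(), retrying on TransientError with exponential backoff.

    Returns (result, elapsed_seconds, attempts_used), where elapsed_seconds is
    the duration of the successful attempt only. Re-raises TransientError once
    max_attempts is exhausted.
    """
    for attempt in range(1, max_attempts + 1):
        start = time.monotonic()
        try:
            result = send_request()
            return result, time.monotonic() - start, attempt
        except TransientError:
            if attempt == max_attempts:
                raise
            # Backoff doubles each attempt: base_delay, 2x, 4x, ...
            delay = base_delay * 2 ** (attempt - 1)
            # Add up to 50% random jitter so many clients do not retry in lockstep.
            sleep(delay + random.uniform(0, delay / 2))
```

Logging the returned elapsed time per request also makes it easier to show when the slowdown started and whether it differs between model versions.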
