Hey,
Starting 17th November, I’ve noticed extremely high latencies on the Gemini 1.5 Flash API. Response times have shot up from ~14s to ~80s. Any pointers on why this is happening? Is the Pro model’s API also affected? I read that it’s due to Google allocating resources to a new experimental model. If so, when will latency go back to normal?
Yes, I’ve noticed this too. Even simple text generations take a lot of time. Latency has increased almost 10x for me.
I’m also having the same issue (free tier).
A few days ago, I began experiencing significant delays in Gemini-1.5-Flash API responses. Requests that previously took 2-3 seconds are now taking 60-90 seconds. These extended response times make my application unusable.
In contrast, similar requests on Google AI Studio continue to receive responses at their previous speed.
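For reference, this is roughly how I’m timing the API calls. It’s a minimal sketch using the google-generativeai Python SDK; the model name and prompt are just placeholders for what my app actually sends:

```python
import os
import time

import google.generativeai as genai

# Assumes the API key is in the GOOGLE_API_KEY environment variable.
genai.configure(api_key=os.environ["GOOGLE_API_KEY"])

model = genai.GenerativeModel("gemini-1.5-flash")
prompt = "Summarize the plot of Hamlet in two sentences."  # placeholder prompt

# Time a single generate_content call end to end.
start = time.perf_counter()
response = model.generate_content(prompt)
elapsed = time.perf_counter() - start

print(f"Latency: {elapsed:.1f}s")
print(response.text[:200])
```

With the same prompt pasted into AI Studio, the answer comes back in a few seconds, so the slowdown really does seem to be on the API side.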
I’ve also experienced higher latency with “gemini-1.5-flash-002” than with the “gemini-1.5-flash-001” model. The documentation says latency is reduced by 3x, but in practice it’s the opposite. Many times, “gemini-1.5-flash-002” doesn’t return a response for 10 minutes and then fails with a “500 Internal Server Error”, which is the worst thing you can run into in production. The workaround I’m trying is sketched below.
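In case it helps anyone else, this is the kind of mitigation I’ve started sketching while this is broken: cap each request with an explicit timeout and fall back to “gemini-1.5-flash-001” when “-002” hangs or returns a 500. This assumes the google-generativeai Python SDK and that `request_options={"timeout": ...}` is honored on `generate_content`; the 60s cap and the fallback order are my own choices, not anything official:

```python
import os

import google.generativeai as genai
from google.api_core import exceptions as gexc

genai.configure(api_key=os.environ["GOOGLE_API_KEY"])


def generate_with_fallback(prompt: str, timeout_s: int = 60) -> str:
    """Try gemini-1.5-flash-002 with a hard timeout; fall back to -001 on 500s/timeouts."""
    for model_name in ("gemini-1.5-flash-002", "gemini-1.5-flash-001"):
        model = genai.GenerativeModel(model_name)
        try:
            # request_options caps how long the SDK waits instead of hanging for ~10 minutes.
            response = model.generate_content(
                prompt, request_options={"timeout": timeout_s}
            )
            return response.text
        except (gexc.InternalServerError, gexc.DeadlineExceeded, gexc.RetryError) as err:
            print(f"{model_name} failed ({type(err).__name__}), trying next model...")
    raise RuntimeError("All model versions failed")


print(generate_with_fallback("Write a one-line product tagline for a travel app."))
```

It doesn’t fix the underlying latency, but at least my app fails over in about a minute instead of timing out silently.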