And when requests do succeed, latency is noticeably worse than before — simple prompts that used to return in under a second are now taking 5–10s.
I’m well within rate limits so it’s not a quota thing. Retries help sometimes but not always. Is this a known issue on the backend? Anyone else seeing this?
I also got similar issue I used to check gemini 3.1 flash lite preview speed like month ago and it was comparable with gemini 2.5 flash lite, around 1.5s for my simple task. Now it’s 3s for gemini 3.1 flash lite preview
Gemini hasn’t yet upscaled the resource allocation for this model. It’s still in preview so all we can do is hope the team can work on this. I’ve been getting 503’s consistently for maybe 20 hours, specifically with 3.1 flash lite model.