I’m running Gemini 2.5 Flash in production (for multiple apps) and I keep hitting this error:
503 - The model is overloaded. Please try again later.
This makes my apps unusable for end users, since requests randomly fail. From what I’ve seen, this issue is not just on my side; many other developers are running into the same problem. AND IT’S NOT ABOUT RATE LIMITS. The problem appeared approximately one month ago.
It’s extremely disruptive for production workloads, and it would be great if this could be prioritized and fixed as soon as possible.
Is there any official update or timeline on when this will be resolved?
Any plans to look into this error? I’ve had the same error with Gemini 2.5 Pro for more than a week (not on the free tier, of course). I already reported it on GitHub and got no response °_°
No response at all. I tried this forum, the Google dev Discord, and tagging Google devs on X; nobody cares. And it’s getting worse since they are pushing new models that use more compute.
Hi, we are facing this issue: google.genai.errors.ServerError: 503 UNAVAILABLE. {'error': {'code': 503, 'message': 'The model is overloaded. Please try again later.', 'status': 'UNAVAILABLE'}}. Kindly advise on further steps.
I am also facing the same issue, even for small requests. If a response fails when it is half complete, the tokens are still consumed and the remaining half is lost. It is way too frustrating. I have retried several times, but the result is the same. Please look into it.
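For anyone retrying by hand, here is a minimal sketch of retrying with exponential backoff and random jitter, which spaces the attempts out instead of hitting the overloaded endpoint again immediately. It is not an official fix and will not help while the model stays overloaded. The names `is_overloaded` and `call_with_backoff` are hypothetical, and the check assumes the raised exception exposes the HTTP status as a `code` attribute; adjust it to whatever your client actually raises.

```python
import random
import time


def is_overloaded(exc):
    # Assumption: the exception carries the HTTP status in a `code` attribute.
    return getattr(exc, "code", None) == 503


def call_with_backoff(fn, max_attempts=5, base_delay=1.0, max_delay=30.0,
                      sleep=time.sleep):
    """Call fn(); on a 503, wait with exponential backoff plus jitter and retry."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception as exc:
            # Re-raise anything that is not a 503, or the final failed attempt.
            if not is_overloaded(exc) or attempt == max_attempts - 1:
                raise
            # The delay cap doubles each attempt: 1s, 2s, 4s, ... up to max_delay.
            cap = min(max_delay, base_delay * 2 ** attempt)
            # Full jitter: sleep a random time in [0, cap] so clients do not retry in sync.
            sleep(random.uniform(0, cap))
```

To use it, wrap whatever call currently raises the 503 in a zero-argument function and pass that as `fn`. This retries the whole request, so it does not recover a response that failed halfway through.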
I’m facing the same issue with AI Studio trying to use 2.5 Flash.
It’s quite sad really, because my tests were great and now, when I want to use it with real data, it just 503s itself…
Should I just wait?