Hi everyone,
I’ve been experiencing persistent 503 “Server Overloaded” errors when calling the gemini-3.1-flash-image-preview model via the Google AI Studio API. This is significantly impacting a production SaaS application.
Setup:
- Model: gemini-3.1-flash-image-preview
- Access: Google AI Studio, Tier 1 with active billing account
- Error: 503 – The model is overloaded. Please try again later.
Impact:
- End users of our SaaS are experiencing failures and degraded responses
- The errors are not occasional; they are frequent and ongoing, making the service unreliable for production use
- We already implemented exponential backoff, but requests still fail consistently
What I’ve tried:
- Exponential backoff with up to 5 retries
- Reducing request concurrency
- Checking the API status page
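
For anyone comparing notes, here is a minimal sketch of what "exponential backoff with up to 5 retries" could look like. It is not the exact code from the application described above. `OverloadedError`, `call_with_backoff`, and the delay values are hypothetical names and numbers chosen for illustration; in a real client you would catch whatever exception your SDK or HTTP library raises for a 503.

```python
import random
import time


class OverloadedError(Exception):
    """Hypothetical stand-in for whatever the client raises on HTTP 503."""


def call_with_backoff(request_fn, max_retries=5, base_delay=1.0,
                      max_delay=32.0, sleep=time.sleep):
    """Call request_fn, retrying on OverloadedError.

    Uses exponential backoff with full jitter: before retry n (0-based),
    wait a random time in [0, min(max_delay, base_delay * 2**n)].
    The delay values are assumptions, not recommended settings.
    """
    for attempt in range(max_retries + 1):
        try:
            return request_fn()
        except OverloadedError:
            if attempt == max_retries:
                # Retries exhausted: surface the error to the caller.
                raise
            cap = min(max_delay, base_delay * (2 ** attempt))
            sleep(random.uniform(0, cap))
```

The jitter spreads retries out so that many clients hitting the same overloaded backend do not all retry at the same instant. If the model stays overloaded for longer than the total backoff window, requests will still fail, which matches the behaviour described in this post.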
Questions:
- Is there a known incident affecting gemini-3.1-flash-image-preview capacity right now?
- Is there an ETA for stabilization?
- For Tier 1 paid users, is there any priority queue or escalation path?
- Would migrating to Vertex AI improve availability for this specific model?
Any guidance from the Google team or community would be greatly appreciated.