Persistent 503 Server Overloaded errors on gemini-3.1-flash-image-preview – Tier 1 Paid Account

Hi everyone,

I’ve been experiencing persistent 503 “Server Overloaded” errors when calling the gemini-3.1-flash-image-preview model via the Google AI Studio API. This is significantly impacting a production SaaS application.

Setup:

  • Model: gemini-3.1-flash-image-preview
  • Access: Google AI Studio, Tier 1 with active billing account
  • Error: 503 – The model is overloaded. Please try again later.

Impact:

  • End users of our SaaS are seeing failed requests and degraded responses
  • The errors are frequent and ongoing rather than occasional, which makes the service unreliable for production use
  • Even with exponential backoff in place, requests still fail consistently

What I’ve tried:

  • Exponential backoff with up to 5 retries
  • Reducing request concurrency
  • Checking the API status page
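
For reference, here is a minimal sketch of the retry pattern listed above: exponential backoff with jitter, up to 5 retries. It is illustrative only, not the exact production code. The names `call_with_backoff` and `OverloadedError` are hypothetical, the delay values are assumptions, and the actual API call is abstracted as a plain callable, so no specific SDK behavior is implied.

```python
import random
import time


class OverloadedError(Exception):
    """Hypothetical stand-in for whatever the client raises on HTTP 503."""


def call_with_backoff(request_fn, max_retries=5, base_delay=1.0,
                      max_delay=32.0, sleep=time.sleep):
    """Call request_fn, retrying on OverloadedError.

    Uses exponential backoff with full jitter: the wait before each retry
    is drawn uniformly from [0, cap], where cap doubles per attempt.
    """
    for attempt in range(max_retries + 1):
        try:
            return request_fn()
        except OverloadedError:
            if attempt == max_retries:
                raise  # retries exhausted; surface the error to the caller
            # Cap grows as 1s, 2s, 4s, ... and is bounded by max_delay (assumed values)
            cap = min(max_delay, base_delay * (2 ** attempt))
            sleep(random.uniform(0, cap))
```

Usage would look like `call_with_backoff(lambda: send_image_request(prompt))`, where `send_image_request` is a placeholder for the real call and is expected to raise `OverloadedError` on a 503. The jitter spreads retries from concurrent workers so they do not all hit the endpoint at the same moment, but it cannot help if the model stays overloaded for longer than the total retry window.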

Questions:

  1. Is there a known incident affecting gemini-3.1-flash-image-preview capacity right now?
  2. Is there an ETA for stabilization?
  3. For Tier 1 paid users, is there any priority queue or escalation path?
  4. Would migrating to Vertex AI improve availability for this specific model?

Any guidance from the Google team or community would be greatly appreciated.