Gemini Image API: Frequent 503 “model is overloaded” errors for 4K image generation in production

We are using the Gemini Image API in a production SaaS platform.

When generating high-resolution images (2K–4K), we frequently receive:

503 – “The model is overloaded. Please try again later.”

This happens even at low concurrency (1 request at a time).
The same request sometimes works locally or at different times of day, which suggests capacity variability.

Here is a screenshot from our production logs:

Before redesigning our pipeline, we would like official clarification on:

  1. Is 4K image generation considered best-effort rather than guaranteed?
  2. What resolution is recommended for stable production usage (e.g., 1024px / 1536px)?
  3. Does upgrading the service tier improve queue priority for high-resolution image requests?

Any guidance from the Gemini engineering team would be greatly appreciated.

Hi @reedo, welcome to the community!

  1. 4K generation is a supported capability, but its availability is governed by Dynamic Capacity Management. Because 4K output requires significantly more compute and longer reasoning cycles (part of the Gemini 3 “Thinking” process), the API may return a 503 Service Unavailable error during peak periods even if you have not exceeded your RPM/TPM limits. See the Gemini Error Codes documentation.
  2. For SaaS platforms that need consistent low-latency responses, Standard High Definition is recommended over Ultra HD (4K). See the Media Resolution documentation.
  3. Upgrading from Tier 1 to Tier 3 increases your rate limits (RPM/TPM), but more importantly, it grants access to Provisioned Throughput options.

Based on this, consider retrying 503 errors with exponential backoff, and falling back to a lower resolution (for example, 2K) if the request still fails after several attempts.
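A minimal sketch of that retry-plus-fallback pattern, in Python. It is deliberately independent of any SDK: `generate` is a hypothetical callable that you would wrap around your own Gemini request, and `OverloadedError` is a hypothetical exception that your wrapper would raise when the API returns a 503. The resolution labels, attempt count, and delays are illustrative assumptions, not recommended values.

```python
import random
import time


class OverloadedError(Exception):
    """Hypothetical stand-in for an HTTP 503 'model is overloaded' response."""


def generate_with_fallback(generate, resolutions=("4K", "2K"), max_attempts=4,
                           base_delay=1.0, max_delay=30.0, sleep=time.sleep):
    """Try each resolution in order, retrying 503s with exponential backoff.

    `generate(resolution)` is assumed to return the image on success and to
    raise OverloadedError on a 503. Returns (resolution_used, image).
    """
    last_error = None
    for resolution in resolutions:
        for attempt in range(max_attempts):
            try:
                return resolution, generate(resolution)
            except OverloadedError as exc:
                last_error = exc
                if attempt == max_attempts - 1:
                    break  # retries exhausted here; fall back to next resolution
                # Capped exponential delay with full jitter, so that many
                # clients do not retry in lockstep.
                delay = min(max_delay, base_delay * 2 ** attempt)
                sleep(random.uniform(0, delay))
    # Every resolution failed (or none was given).
    raise last_error or ValueError("no resolutions given")
```

Only the overload error is retried here; other failures (invalid request, quota exceeded) propagate immediately, since retrying them would not help. The `sleep` parameter exists so the waiting can be replaced in tests.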

Thank you!