Tier 1 Postpay account getting 100% 429 RESOURCE_EXHAUSTED with no error.details[] for 24h+

Paid Tier 1 Postpay user on the Gemini Developer API — every generateContent call has returned 429
RESOURCE_EXHAUSTED for 24+ hours, and I’ve exhausted every user-side fix. Posting here since Google Cloud Basic
Support can’t take technical cases, and the evidence strongly suggests this is server-side.

Key evidence this is NOT user quota:

  • Response header: x-gemini-service-tier: standard — the paid tier IS being recognized at the service layer.
  • Response body has no error.details[] array naming a quota metric. Normal quota 429s always name the
    exhausted metric; this one doesn’t, which points to an anti-abuse or routing-state throttle.
  • models.list on the same key returns 200 OK — auth and project are fully functional.
  • AI Studio Rate Limit dashboard shows peak usage nowhere near limits: 15 / 2000 RPM, 33.8K / 4M TPM, 134 /
    Unlimited RPD.
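
The "no error.details[]" check above can be automated. The following is a minimal sketch, not an official client: it assumes that a quota-driven 429 carries a details entry of type google.rpc.QuotaFailure with a non-empty violations list, and the function name classify_429 and its return labels are hypothetical.

```python
import json

# Assumption: quota-driven 429s include a detail entry of this type.
QUOTA_FAILURE_TYPE = "type.googleapis.com/google.rpc.QuotaFailure"


def classify_429(body):
    """Label a 429 response body.

    Returns "quota" if error.details[] names at least one quota violation,
    "unattributed" if the error carries no such detail, and "unparseable"
    if the body is not a JSON object with an "error" object in it.
    """
    try:
        error = json.loads(body)["error"]
        details = error.get("details") or []
    except (ValueError, KeyError, TypeError, AttributeError):
        return "unparseable"
    for detail in details:
        if (
            isinstance(detail, dict)
            and detail.get("@type") == QUOTA_FAILURE_TYPE
            and detail.get("violations")
        ):
            return "quota"
    return "unattributed"
```

Run against the response body shown under "Returns (HTTP 429)" in this post, this sketch returns "unattributed", which matches the observation that no quota metric is named.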

Account details:

  • Billing: Tier 1 Postpay (Active, budget configured, health checks cleared, linked to project ~24h before
    symptoms started)
  • Generative Language API is enabled on the project
  • API key prefix: AIzaSyBn... (last 4 chars happy to share privately)

Reproduction:

curl -sS -H "Content-Type: application/json" -X POST \
  "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash:generateContent?key=$KEY" \
  -d '{"contents":[{"parts":[{"text":"OK"}]}]}'

Returns (HTTP 429):

{
  "error": {
    "code": 429,
    "message": "Resource exhausted. Please try again later.",
    "status": "RESOURCE_EXHAUSTED"
  }
}

Already verified:

  • Fresh curl tests spaced 70 s apart — still 429
  • Minted a brand-new API key in the same project — still 429
  • Tried gemini-2.5-flash — still 429
  • Request rate during testing < 1 RPM

Suspected cause: a lag between the billing promotion and quota enforcement, or an account-level anti-abuse
throttle that was applied before my Tier 1 upgrade fully propagated and is now stuck.

Can someone on the Gemini team check whether this project has a hold on it, or force-propagate the Tier 1 quota
state? Happy to share the project ID and last 4 of the API key via DM.

Thanks!

Update 2026-04-21 — narrowed to gemini-2.0-flash specifically. Same API key, same project, same minute.
gemini-2.0-flash is still 429 while every other Gemini model I tried returns 200 on the same key:

| Model | HTTP Status | Notes |
|---|---|---|
| gemini-2.0-flash | 429 RESOURCE_EXHAUSTED (still, re-confirmed just now) | Same no-error.details[] body as before |
| gemini-2.5-flash | 200 OK | `modelVersion: gemini-2.5-flash` |
| gemini-flash-latest | 200 OK | Server resolved to `modelVersion: gemini-3-flash-preview` |
Representative gemini-2.0-flash failure (just now):

{
  "error": {
    "code": 429,
    "message": "Resource exhausted. Please try again later.",
    "status": "RESOURCE_EXHAUSTED"
  }
}

Representative gemini-flash-latest success on the same key seconds later:

{
  "candidates": [
    {
      "content": { "parts": [{ "text": "How can I help you today?" }], "role": "model" },
      "finishReason": "STOP"
    }
  ],
  "usageMetadata": { "promptTokenCount": 1, "candidatesTokenCount": 7,
                     "totalTokenCount": 131, "thoughtsTokenCount": 123 },
  "modelVersion": "gemini-3-flash-preview"
}

This matrix strongly suggests a stuck or orphaned quota counter scoped to gemini-2.0-flash on project
gen-lang-client-0754187837. It is not an account-wide hold, not a user-quota issue (other models served by the
same key, project, and billing tier work fine), and not a request-shape issue (all three curls are byte-identical
aside from the model path segment).
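
For anyone who wants to reproduce the per-model comparison, here is a rough sketch of a probe that sends the same payload to several models and records status and modelVersion. It reuses the endpoint and request body from the curl in this post; the helper names http_post and probe_models are hypothetical, and the send parameter exists only so the loop can be exercised without network access.

```python
import json
import urllib.error
import urllib.request

BASE = "https://generativelanguage.googleapis.com/v1beta/models"


def http_post(url, payload):
    """Send one JSON POST and return (status_code, body_text)."""
    req = urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(req, timeout=30) as resp:
            return resp.status, resp.read().decode("utf-8")
    except urllib.error.HTTPError as err:
        # Non-2xx responses arrive as exceptions; keep their status and body.
        return err.code, err.read().decode("utf-8")


def probe_models(models, key, send=http_post):
    """Send an identical request to each model; return (model, status, modelVersion) rows."""
    payload = json.dumps({"contents": [{"parts": [{"text": "OK"}]}]}).encode("utf-8")
    rows = []
    for model in models:
        status, body = send(f"{BASE}/{model}:generateContent?key={key}", payload)
        try:
            version = json.loads(body).get("modelVersion", "")
        except (ValueError, AttributeError):
            version = ""
        rows.append((model, status, version))
    return rows
```

Calling probe_models(["gemini-2.0-flash", "gemini-2.5-flash", "gemini-flash-latest"], key) with a real key would issue three live requests, so the only thing that differs between rows is the model path segment.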

Can someone on the Gemini team look at the gemini-2.0-flash quota bucket for this specific project and clear or
re-sync it? Happy to run any additional variants (regions, -001 pinned version, gemini-2.0-flash-lite, etc.)
that would help.

In the meantime I’ve switched my app to gemini-2.5-flash to unblock. Leaving this open because the stuck
2.0-flash lane will presumably hit other Tier 1 users the same way.
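
The workaround of switching models can be generalized into an ordered fallback. This is only a sketch of one possible approach, not what my app literally runs: generate_with_fallback and the call parameter are hypothetical, and the set of status codes treated as retryable is an assumption.

```python
def generate_with_fallback(models, call, retryable=(429, 500, 503)):
    """Try each model in order and return (model, body) for the first HTTP 200.

    `call` is any function taking a model name and returning (status, body).
    A status outside `retryable` stops the chain early, since a different
    model is unlikely to fix something like a malformed request or bad key.
    """
    failures = []
    for model in models:
        status, body = call(model)
        if status == 200:
            return model, body
        failures.append((model, status))
        if status not in retryable:
            break
    raise RuntimeError(f"all models failed: {failures}")
```

With models set to ["gemini-2.0-flash", "gemini-2.5-flash"], a 429 on the first model falls through to the second, which is the same switch I made by hand.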

We had this on Flash 2.0 and 2.5 on Tier 3. They are having an outage (see Google AI Studio). Our failure rate was about 50%, so watch out. The API should communicate this better instead of misleading us with 429 errors.

Google’s API has always been unreliable. We just moved to OpenAI on prod in the meantime while we decide what models and providers to settle on. Lesson is to have a fallback that’s not from the same provider.

I suggest going through this link; it shows which method causes the error in Flash 2.0. While digging into this, I came across the Gemini method “content generation”.