We are at Tier 3.
Model: gemini-3-flash-preview
Billing: AI Studio shows Tier 3; billing was recently moved from Europe to US.
Workload: explicit cachedContents, response_schema, thinking_level=MINIMAL, media_resolution=LOW, PDF inline_data.
Observed issue: visible quotas appear not exhausted, but calls hit 429 RESOURCE_EXHAUSTED.
Error body: {"code":429,"message":"Resource has been exhausted (e.g. check quota).","status":"RESOURCE_EXHAUSTED"}
Could you please help us with this issue?
Hello @Enes_Yalcin ,
To help us investigate and resolve this Tier 3 rate limiting issue:
- Please send us a Direct Message (DM) on the forum with your Google Cloud Project Number or Billing Account ID.
- (To send a DM: click our profile icon on the forum post and select the Message button.)
A quick debugging note on your workload: because you are using cachedContents (Context Caching) with PDF inline data, please note that Context Caching has its own independent token and concurrency limits. Even if your standard model RPM/TPM usage is well below its quota, exceeding the caching limits will trigger 429 RESOURCE_EXHAUSTED errors.
Once we have your Project Number, we will inspect your billing migration logs and verify if your sync was affected, or if you are hitting the Context Caching limits!
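In the meantime, a client-side retry with exponential backoff can soften intermittent 429 responses while the root cause is investigated. The following is only a minimal sketch, not the SDK's own retry mechanism: `QuotaError`, `is_resource_exhausted`, and `call_with_backoff` are hypothetical names, and it assumes the exception raised by your client includes the `RESOURCE_EXHAUSTED` status or the `429` code in its message. Adjust the check to match the exception type your client library actually raises.

```python
import random
import time


class QuotaError(Exception):
    """Hypothetical stand-in for the exception your client raises on a 429."""


def is_resource_exhausted(exc):
    # Assumption: the error text carries the status or code from the response body.
    text = str(exc)
    return "RESOURCE_EXHAUSTED" in text or "429" in text


def call_with_backoff(fn, max_attempts=5, base_delay=1.0, max_delay=60.0,
                      sleep=time.sleep):
    """Call fn(); on a 429-style failure, wait and retry with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception as exc:
            # Re-raise anything that is not a quota error, or the final failure.
            if not is_resource_exhausted(exc) or attempt == max_attempts:
                raise
            # Double the wait on each attempt, capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Add up to 50% jitter so parallel workers do not retry in lockstep.
            delay += random.uniform(0, delay / 2)
            print(f"attempt {attempt}: quota error, retrying in {delay:.1f}s")
            sleep(delay)
```

To use it, wrap each request in a zero-argument callable, for example `call_with_backoff(lambda: send_request(payload))`, where `send_request` is whatever function issues your generate call. Logging the timestamp of each 429 alongside the number of in-flight requests may also help distinguish a caching limit from a model RPM/TPM limit.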