Problem
Our project (Paid Tier 2, gemini-2.0-flash) consistently returns HTTP 429 RESOURCE_EXHAUSTED on the very first API request of each execution, starting 2 days ago. GCP Console quota page shows all relevant quotas at 0% usage, but actual calls fail with 429.
What I’ve tested
| Setup | Result |
|---|---|
| Original GCP project (Paid Tier 2) + original API key | 429 |
| Same project + brand new API key (no restrictions) | 429 |
| Brand new GCP project (Paid Tier 1) + brand new key | 429 |
All 3 combinations fail with 429 on the first request, which strongly suggests the limit is applied at the account level, not the project level.
Quota status (verified on the project)
For gemini-2.0-flash:
- RPM Paid Tier 2: 10,000 (0% usage)
- TPM input Paid Tier 2: 3,000,000 (0% usage)
- RPD: unlimited (0% usage)
API keys: no IP / Referrer / Application restrictions set.
Reproduction details
- Caller: Google Apps Script via `UrlFetchApp`
- Model: `gemini-2.0-flash`
- Per-request token size: ~30,000-50,000
- First request fails immediately with 429
- Retries (3x with 20s delay) are inconsistent: sometimes one of them succeeds, sometimes all 3 fail
- Same key from local `curl` works fine (10x burst at 40k tokens, all 200)
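For anyone who wants to reproduce this, here is a minimal sketch of the kind of call and retry loop involved, written to capture the status code, response headers, and raw body of every attempt. It is not my exact script. The function name `callGeminiWithDiagnostics` and the injected `fetchFn` / `sleepFn` parameters are hypothetical, introduced only so the logic can be run outside Apps Script, and the delay doubles on each retry rather than staying at the fixed 20s I used originally.

```javascript
// Sketch only: fetchFn and sleepFn are injected (hypothetical parameters) so the
// same logic runs in Apps Script and in a plain JavaScript runtime.
// In Apps Script, call it like this:
//   callGeminiWithDiagnostics(
//     apiKey, 'ping',
//     function (u, o) { return UrlFetchApp.fetch(u, o); },
//     function (ms) { Utilities.sleep(ms); },
//     3);
function callGeminiWithDiagnostics(apiKey, prompt, fetchFn, sleepFn, maxAttempts) {
  var url = 'https://generativelanguage.googleapis.com/v1beta/models/' +
            'gemini-2.0-flash:generateContent';
  var options = {
    method: 'post',
    contentType: 'application/json',
    headers: { 'x-goog-api-key': apiKey },
    payload: JSON.stringify({ contents: [{ parts: [{ text: prompt }] }] }),
    // Without this, UrlFetchApp throws on 4xx/5xx and the body is harder to inspect.
    muteHttpExceptions: true
  };

  var attempts = [];
  for (var i = 0; i < maxAttempts; i++) {
    var res = fetchFn(url, options);
    var code = res.getResponseCode();
    attempts.push({
      attempt: i + 1,
      code: code,
      headers: res.getAllHeaders(),
      body: res.getContentText()
    });
    if (code !== 429) break;               // stop on success or on a non-quota error
    if (i < maxAttempts - 1) {
      sleepFn(20000 * Math.pow(2, i));     // 20s, 40s, ... (assumed backoff schedule)
    }
  }
  return attempts;
}
```

Logging the returned array (for example with `console.log(JSON.stringify(attempts))`) gives one record per attempt, which makes it easier to compare the failing first request against a later retry that succeeds.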
What I’d like to know
- Why does a Tier 2 paid project hit 429 immediately when GCP Console shows 0% usage for the corresponding project/model?
- Are there account-level quotas that aren’t visible in the per-project quota page?
- Why does the same key work from local `curl` but fail from GAS `UrlFetchApp`? Is there a different quota bucket or rate limit when called from Google’s own infrastructure?
- The 429 response body doesn’t include `quotaMetric` details. How can I identify the exact quota being violated?
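In case it helps with the last question, this is a rough sketch of how the 429 body could be checked for structured error details. It assumes the body follows the standard Google API error shape (`error.status`, `error.message`, and an `error.details` array whose entries carry an `@type` such as `google.rpc.QuotaFailure` or `google.rpc.RetryInfo`). Whether the Gemini API actually populates those entries for this failure is exactly what I can't confirm. `summarize429` is a hypothetical helper name.

```javascript
// Sketch only: summarizes whatever structured details a 429 body contains.
// Assumes the standard Google API JSON error envelope; nothing here is
// specific to a confirmed Gemini response.
function summarize429(bodyText) {
  var parsed;
  try {
    parsed = JSON.parse(bodyText);
  } catch (e) {
    // Not JSON (for example an HTML error page from an intermediary).
    return { parseable: false, raw: bodyText };
  }

  var err = (parsed && parsed.error) || {};
  var details = Array.isArray(err.details) ? err.details : [];
  var summary = {
    parseable: true,
    status: err.status || null,
    message: err.message || null,
    detailTypes: [],   // every @type present, so missing ones are obvious
    violations: [],    // raw QuotaFailure violations, copied unchanged
    retryDelay: null   // RetryInfo.retryDelay, e.g. "20s", if present
  };

  details.forEach(function (d) {
    var type = String(d['@type'] || '');
    summary.detailTypes.push(type);
    if (/QuotaFailure$/.test(type) && Array.isArray(d.violations)) {
      d.violations.forEach(function (v) { summary.violations.push(v); });
    }
    if (/RetryInfo$/.test(type) && d.retryDelay) {
      summary.retryDelay = d.retryDelay;
    }
  });
  return summary;
}
```

An empty `detailTypes` array would confirm that the response carries no machine-readable quota information at all, while `parseable: false` would suggest the 429 is not coming from the API's usual JSON error path.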
Any guidance appreciated.