To the Google AI Team,
I am experiencing a severe backend quota synchronization bug on my Paid Tier 1 account that is completely halting my production environment.
My script is making calls to the gemini-2.5-flash model. My project is linked to an active, paid billing account. According to my Google Cloud API Quota Dashboard, I am well under all limits:
- Peak RPM: 4 / 1,000
- Peak TPM: 389.74K / 1,000,000
Despite my being at 0.4% of my RPM limit and 39% of my TPM limit, the API is repeatedly returning 429 RESOURCE_EXHAUSTED errors. The server rejects even small bursts (as few as 2-3 rapid requests) and then refuses every new request for 60 to 120 seconds.
This matches the known “Ghost 429” dynamic quota bug affecting Tier 1 accounts, where the edge servers enforce an invisible, far more restrictive rate limit that ignores the 1,000 RPM / 1M TPM quota displayed on my dashboard.
My Details:
- Model: gemini-2.5-flash
- Error: 429 RESOURCE_EXHAUSTED
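For reference, here is a minimal sketch of how the 429 responses and forced wait times could be recorded while retrying with exponential backoff. It is illustrative only and is not my production script: `RateLimited`, `call_with_backoff`, and `send_request` are hypothetical names, and `send_request` stands in for whatever function actually issues the gemini-2.5-flash call and raises on a 429.

```python
import time
from typing import Callable, List, Tuple


class RateLimited(Exception):
    """Hypothetical stand-in for a 429 RESOURCE_EXHAUSTED response."""


def call_with_backoff(
    send_request: Callable[[], str],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[str, List[Tuple[int, str, float]]]:
    """Retry send_request on RateLimited, doubling the delay each time.

    Returns the result plus a log of (attempt, outcome, delay_before_retry)
    entries that can be attached to a support ticket.
    """
    log: List[Tuple[int, str, float]] = []
    delay = base_delay
    for attempt in range(1, max_attempts + 1):
        try:
            result = send_request()
            log.append((attempt, "ok", 0.0))
            return result, log
        except RateLimited:
            if attempt == max_attempts:
                # Out of attempts: surface the error to the caller.
                raise
            log.append((attempt, "429", delay))
            sleep(delay)
            delay *= 2  # exponential backoff
    raise RuntimeError("max_attempts must be at least 1")
```

A log like this, with timestamps added, would show exactly how many requests preceded each 429 and how long the API kept rejecting requests afterwards.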
Please manually investigate my Project ID, lift the probationary/burst throttles on the backend, and sync my actual server allocation to match my Tier 1 dashboard limits so I can resume processing.
Thank you.