BUG: 429 RESOURCE_EXHAUSTED on Paid Tier 1 (Gemini 2.5 Flash) despite usage well under quota (<1% RPM, 39% TPM)

To the Google AI Team,

I am experiencing a severe backend quota synchronization bug on my Paid Tier 1 account that is completely halting my production environment.

My script is making calls to the gemini-2.5-flash model. My project is linked to an active, paid billing account. According to my Google Cloud API Quota Dashboard, I am well under all limits:

  • Peak RPM: 4 / 1,000

  • Peak TPM: 389.74K / 1,000,000

Despite being at less than 1% of my RPM limit and about 39% of my TPM limit, the API keeps returning 429 RESOURCE_EXHAUSTED errors. Bursts as small as 2-3 rapid requests are rejected, and I then have to wait 60 to 120 seconds before a single new request is accepted.
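For anyone who wants to reproduce or log this behavior, here is a minimal sketch of a retry wrapper with exponential backoff. It is not tied to any particular SDK: `RateLimited`, `send_request`, and `call_with_backoff` are hypothetical names, and you would raise `RateLimited` from your own code wherever your client reports a 429 RESOURCE_EXHAUSTED. The delays are capped at 120 seconds only because that matches the wait described above.

```python
import random
import time
from typing import Callable, List, Optional, Tuple


class RateLimited(Exception):
    """Hypothetical stand-in for whatever your client raises on HTTP 429."""


def call_with_backoff(
    send_request: Callable[[], str],
    max_attempts: int = 5,        # must be >= 1
    base_delay: float = 2.0,      # seconds before the first retry
    max_delay: float = 120.0,     # cap, chosen to match the observed 60-120 s wait
    sleep: Callable[[float], None] = time.sleep,
    log: Optional[List[Tuple[int, float]]] = None,
) -> str:
    """Call send_request, retrying on RateLimited with exponential backoff.

    Each (attempt number, delay in seconds) pair is appended to `log`
    if one is supplied, so the retry pattern can be attached to a report.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return send_request()
        except RateLimited:
            if attempt == max_attempts:
                raise  # give up and surface the error to the caller
            # Double the delay on every failed attempt, up to max_delay.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Add up to 10% random jitter so parallel workers do not retry in lockstep.
            delay += random.uniform(0, delay * 0.1)
            if log is not None:
                log.append((attempt, delay))
            sleep(delay)
    raise RuntimeError("max_attempts must be at least 1")
```

Passing a list as `log` records how long each retry waited, which gives support concrete timings to compare against the dashboard numbers.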

This looks like the “Ghost 429” dynamic quota bug reported for Tier 1 accounts, where the edge servers appear to enforce an invisible, far more restrictive rate limit that ignores the 1,000 RPM / 1M TPM quota displayed on my dashboard.

My Details:

  • Model: gemini-2.5-flash

  • Error: 429 RESOURCE_EXHAUSTED

Please manually investigate my Project ID, lift the probationary/burst throttles on the backend, and sync my actual server allocation to match my Tier 1 dashboard limits so I can resume processing.

Thank you.
