Gemini API 429 Error Despite Low Quota Usage on Paid Tier (gemini-2.5-flash)

Hello,

I’m experiencing persistent 429 errors when calling the Gemini API despite having very low quota usage on a paid tier account. This issue prevents my application from working properly.

Error Details:

{"error":{"code":429,"message":"Resource has been exhausted (e.g. check quota).","status":"RESOURCE_EXHAUSTED"}}

Environment:

  • Model: gemini-2.5-flash
  • Project ID: gen-lang-client-0351666068
  • API Endpoint: https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent
  • Billing: Paid Tier 1 account

Quota Status (from Cloud Console):
According to the IAM & Admin → Quotas dashboard:

  • GenerateContent input token count limit per model per minute: 1,000,000
    • Current usage: 0.56% (5,637 tokens)
  • Request limit per model per minute (paid tier 1): 1,000
    • Current usage: 0.05% (0.5 requests)

Problem:
Despite extremely low quota utilization (<1%), API calls consistently return 429 errors. The issue persists even after:

  1. Waiting several minutes between requests
  2. Waiting until the next day
  3. Reducing context size and request frequency

Question:
Could this be a known issue where paid tier projects are incorrectly throttled at free-tier limits? Or is there a bug with quota enforcement for gemini-2.5-flash?

Request:
Please investigate if my project is properly configured for paid tier quotas, or if there’s an internal issue causing this behavior.

Thank you!


CRITICAL UPDATE - Detailed Rate Limits Analysis:

I’ve thoroughly investigated the rate limits in the AI Studio dashboard:

Current usage over 28 days (gemini-2.5-flash):

  • Peak RPM: 11 / 1,000 (1.1%)
  • Peak TPM: 67.2K / 1M (6.7%)
  • Peak RPD: 131 / 10,000 (1.31%)

Current usage (last 24 hours):

  • RPM: 3 / 1,000 (0.3%)
  • TPM: 48.41K / 1M (4.8%)
  • RPD: 43

The Problem:
Despite using only 1-7% of all rate limits, I continue receiving persistent 429 errors starting from December 14th.

Timeline:

  • December 11: Spike in token usage (~1.8M tokens in one day visible in Overview graphs)
  • December 14: Started receiving 429 errors
  • December 15 (today): Still getting 429 errors despite near-zero usage (0.3-4%)

Hypotheses:

  1. My paid account may be incorrectly throttled at free-tier limits
  2. There’s a cooldown/penalty period after the Dec 11 spike
  3. There’s a hidden daily token quota (not just daily request quota)
  4. System bug marking my account for manual review

What I’ve confirmed:

  • Billing account is active (Paid tier 1)
  • No budget limits configured
  • All costs covered by Free Trial credits (£205 remaining)
  • Current quota usage is extremely low (<10% on all metrics)

Request to Google team:
Please investigate:

  1. Is my project (gen-lang-client-0351666068) correctly configured for paid tier quotas?
  2. Was it flagged after the December 11 spike?
  3. What is the exact DAILY token quota for gemini-2.5-flash on Paid Tier 1?
  4. How long will this restriction last?

This is blocking my production application. Any guidance would be greatly appreciated.

Same error, and I’m way below the quota limit.

I am getting the same issue. My entire platform is down and my client reputation is down the drain. On top of that, no one is responding about why the issue is happening. Didn’t expect this from Google. Same paid tier 1, gemini-2.5-flash, 429 errors.

Hey All,

Thank you for flagging this issue. We apologize for the inconvenience and have escalated it to our internal team for investigation. We will update you as soon as we have more information. Could you please provide the project number (not the project ID) via direct message if you have not yet done so?

@chunduriv

Thank you for looking into this issue. I appreciate your help in investigating why I’m experiencing 429 errors despite low quota usage on my paid tier account.

Looking forward to your findings.

Hi there,

We have been having the same issue with 2.5-flash-preview-09-2025. We stopped requests for hours and checked the rate limits in both AI Studio and Cloud Console (we are nowhere near any limits).

This is affecting two of our projects "Modified by moderator". Can you help us look into this? Thanks!

Hey All,

We’ve pushed a fix that should resolve the problem. Please let us know if you are still experiencing any issues.

Thanks for your patience while we sorted this out!

@chunduriv

Hello, I have the same issue with gemini-3-flash-preview: the TPM limit has not reset after one hour, and I can’t use the API.

We are seeing the same today when using gemini 2.5 flash via the API. Our utilization is far below the rate limit maximums, but we’re still getting 429s. If relevant, we recently (within the last 48 hours) upgraded to Tier 3, but these errors persist.

@chunduriv Seeing the same with pro 3.0 and flash 3.0. We are VERY FAR from hitting our level 3 quotas, yet it sometimes fails after a few requests, sometimes even on the very first request.

Same here, getting 429s since last week.

Hi @Andre_Langhorst @CCRyan, @Petr_Bolotin,

Could you please send the full 429 response to help us understand what’s happening?
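
For anyone unsure how to get the full response, here is a minimal sketch that captures the status, headers, and raw body of a failed call using only the Python standard library. It assumes the endpoint from the original post and API-key authentication via the `x-goog-api-key` header; the function names `build_request` and `capture_response` are made up for illustration, and an SDK or vendor integration may expose the raw body differently.

```python
import json
import urllib.error
import urllib.request

# Endpoint taken from the original post; the helper names are illustrative.
ENDPOINT = (
    "https://generativelanguage.googleapis.com/v1beta/"
    "models/gemini-2.5-flash:generateContent"
)


def build_request(api_key: str, prompt: str) -> urllib.request.Request:
    """Build a generateContent POST request carrying a single text prompt."""
    payload = {"contents": [{"parts": [{"text": prompt}]}]}
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-goog-api-key": api_key},
        method="POST",
    )


def capture_response(request: urllib.request.Request) -> dict:
    """Send the request and return status, response headers, and raw body.

    On HTTP errors such as 429 the body is read from the exception, so the
    full error payload is preserved instead of a client-side summary.
    """
    try:
        with urllib.request.urlopen(request, timeout=60) as response:
            status, headers, raw = response.status, response.headers, response.read()
    except urllib.error.HTTPError as err:
        status, headers, raw = err.code, err.headers, err.read()
    return {
        "status": status,
        "headers": dict(headers.items()),
        "body": raw.decode("utf-8", errors="replace"),
    }
```

Calling `capture_response(build_request(api_key, "ping"))` and printing the result with `json.dumps(..., indent=2)` gives something that can be pasted into a reply. The API key is only sent in the request, not echoed in this output, but it is still worth checking the body for project identifiers before posting publicly.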

Thank you!

@chunduriv {"error":{"type":"provider","reason":"provider_error","message":"Provider returned 429","retryable":true,"provider":{"status":429,"body":"[{\n \"error\": {\n \"code\": 429,\n \"message\": \"Resource has been exhausted (e.g. check quota).\",\n \"status\": \"RESOURCE_EXHAUSTED\"\n }\n}\n]"}}}

The console at the same time showed usage not even close to quota; sometimes a single request would fail.
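
Since the wrapper above marks the error as `"retryable":true`, a client-side retry with exponential backoff is a common stopgap. The sketch below is only that: it would not fix a server-side quota problem like the one reported in this thread, and `send` is a hypothetical callable (not part of any SDK) that returns a `(status, body)` pair.

```python
import random
import time
from typing import Callable, Tuple


def call_with_backoff(
    send: Callable[[], Tuple[int, str]],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call send() until it returns a non-429 status or attempts run out.

    send is a hypothetical zero-argument callable returning (status, body).
    The last response is returned either way, so the caller can still log
    the final 429 body.
    """
    status, body = send()
    for attempt in range(1, max_attempts):
        if status != 429:
            break
        # Exponential backoff with full jitter, capped at max_delay seconds.
        delay = random.uniform(0, min(max_delay, base_delay * 2 ** (attempt - 1)))
        sleep(delay)
        status, body = send()
    return status, body
```

The `sleep` parameter is injectable so the loop can be exercised without actually waiting. If every attempt still returns 429 at near-zero usage, that points away from ordinary rate limiting and is worth reporting with the full response body.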

Getting the same error with every request I make. Haven’t been able to make a single successful request over the last few days. Always the same error. No actual rate limits. Tried switching accounts, same result.

ApiError: {"error":{"code":429,"message":"You exceeded your current quota, please check your plan and billing details. For more information on this error, head to: https://ai.google.dev/gemini-api/docs/rate-limits. To monitor your current usage, head to: https://ai.dev/usage?tab=rate-limit. \n* Quota exceeded for metric: generativelanguage.googleapis.com/generate_content_free_tier_requests, limit: 0, model: gemini-2.5-pro\n* Quota exceeded for metric: generativelanguage.googleapis.com/generate_content_free_tier_requests, limit: 0, model: gemini-2.5-pro\n* Quota exceeded for metric: generativelanguage.googleapis.com/generate_content_free_tier_input_token_count, limit: 0, model: gemini-2.5-pro\n* Quota exceeded for metric: generativelanguage.googleapis.com/generate_content_free_tier_input_token_count, limit: 0, model: gemini-2.5-pro\nPlease retry in 21.54931579s.","status":"RESOURCE_EXHAUSTED","details":[{"@type":"type.googleapis.com/google.rpc.Help","links":[{"description":"Learn more about Gemini API quotas","url":"https://ai.google.dev/gemini-api/docs/rate-limits"}]},{"@type":"type.googleapis.com/google.rpc.QuotaFailure","violations":[{"quotaMetric":"generativelanguage.googleapis.com/generate_content_free_tier_requests","quotaId":"GenerateRequestsPerDayPerProjectPerModel-FreeTier","quotaDimensions":{"location":"global","model":"gemini-2.5-pro"}},{"quotaMetric":"generativelanguage.googleapis.com/generate_content_free_tier_requests","quotaId":"GenerateRequestsPerMinutePerProjectPerModel-FreeTier","quotaDimensions":{"location":"global","model":"gemini-2.5-pro"}},{"quotaMetric":"generativelanguage.googleapis.com/generate_content_free_tier_input_token_count","quotaId":"GenerateContentInputTokensPerModelPerMinute-FreeTier","quotaDimensions":{"location":"global","model":"gemini-2.5-pro"}},{"quotaMetric":"generativelanguage.googleapis.com/generate_content_free_tier_input_token_count","quotaId":"GenerateContentInputTokensPerModelPerDay-FreeTier","quotaDimensions":{"location":"global","model":"gemini-2.5-pro"}}]},{"@type":"type.googleapis.com/google.rpc.RetryInfo","retryDelay":"21s"}]}}
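
A response like the one above is easier to read once the `details` array is pulled apart. The following is a rough sketch that extracts the violated quota IDs and the suggested retry delay from a 429 body. The function name `summarize_429` is made up, and treating a `-FreeTier` suffix on `quotaId` as a sign of free-tier limits is an assumption based only on the IDs visible in this thread, not a documented contract.

```python
import json
from typing import Any, Dict, List, Optional


def summarize_429(body: str) -> Dict[str, Any]:
    """Extract status, violated quota IDs, and retry delay from a 429 body."""
    parsed = json.loads(body)
    # Some bodies posted in this thread wrap the error object in a list.
    if isinstance(parsed, list):
        parsed = parsed[0] if parsed else {}
    error = parsed.get("error", {})

    quota_ids: List[str] = []
    retry_delay: Optional[str] = None
    for detail in error.get("details", []):
        type_url = detail.get("@type", "")
        if type_url.endswith("google.rpc.QuotaFailure"):
            quota_ids += [v.get("quotaId", "") for v in detail.get("violations", [])]
        elif type_url.endswith("google.rpc.RetryInfo"):
            retry_delay = detail.get("retryDelay")

    return {
        "status": error.get("status"),
        "quota_ids": quota_ids,
        "retry_delay": retry_delay,
        # Assumption: IDs ending in "-FreeTier" indicate free-tier limits.
        "free_tier": any(q.endswith("-FreeTier") for q in quota_ids),
    }
```

The shorter "Resource has been exhausted" bodies posted earlier carry no `details` array, so for those the summary only contains the status, with an empty quota list and no retry delay.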

Hi @Iliannnn,

This error is expected. The Free Tier does not have any quota allocated for the gemini-2.5-pro model. To resolve this and start using the model, you must switch to a paid account (enable billing). This will unlock the necessary quota.

Thank you!

@chunduriv

I don’t think that’s true. I used the model with the Free Tier for quite a while and it’s only recently that I started getting this.

And if you look here in the documentation, it is stated under input and output: “Free of charge”.

@chunduriv we are receiving: “Resource has been exhausted (e.g. check quota)”. Again, we’re on a tier 3 account, so I’m not sure why we’d be seeing this. Looking at the quotas & systems limits pane in GCC, we don’t have any instances of usage above 90% of our quota (nor are we anywhere remotely close to the same). Thanks!

Hi @CCRyan,

Could you please share the full 429 response to help us understand what’s happening?

Thank you!

Hi @chunduriv this particular integration is through a vendor, and I don’t have access to the raw logs. I’ll ask them (again) if it’s possible to get them.

The last response I have from them says: “RESOURCE_EXHAUSTED is shown in the response body, along with code: 429 as I mentioned.”

Hi @Iliannnn,

Actually, Google recently implemented significant updates to the Gemini API quotas. As noted in this community thread, free tier access for several models, specifically Gemini 2.5 Pro, has been removed. The documentation you’re seeing likely hasn’t been updated.

Thank you!