Over 300k context tokens leads to a 429 error for both Gemini Pro and Flash on the free tier

I've noticed quite weird behavior on the free tier with both the gemini-pro and gemini-flash models. Everything works well until the context size exceeds 300k tokens. After that point I constantly get errors like this:

google.genai.errors.ClientError: 429 RESOURCE_EXHAUSTED. {'error': {'code': 429, 'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, head to: https://ai.google.dev/gemini-api/docs/rate-limits.', 'status': 'RESOURCE_EXHAUSTED', 'details': [{'@type': 'type.googleapis.com/google.rpc.QuotaFailure', 'violations': [{'quotaMetric': 'generativelanguage.googleapis.com/generate_content_free_tier_input_token_count', 'quotaId': 'GenerateContentInputTokensPerModelPerMinute-FreeTier', 'quotaDimensions': {'location': 'global', 'model': 'gemini-2.5-flash'}, 'quotaValue': '250000'}]}, {'@type': 'type.googleapis.com/google.rpc.Help', 'links': [{'description': 'Learn more about Gemini API quotas', 'url': 'https://ai.google.dev/gemini-api/docs/rate-limits'}]}, {'@type': 'type.googleapis.com/google.rpc.RetryInfo', 'retryDelay': '3s'}]}}

While for gemini-pro this error might seem reasonable, as it has a 250,000 TPM limit, gemini-flash is supposed to have a 1,000,000 TPM limit, so everything should work fine, yet it doesn't.
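For anyone hitting the same wall, the 429 body above carries structured `QuotaFailure` and `RetryInfo` details that can be inspected programmatically. This is a minimal sketch of a parser for that payload; the field names are taken directly from the JSON shown in this thread, and `parse_429` itself is a hypothetical helper, not part of the SDK:

```python
# Hypothetical helper: extract the violated quota and the server-suggested
# retry delay from a 429 error body shaped like the one quoted above.

def parse_429(error_body: dict):
    """Return (quota_id, quota_value, retry_delay_seconds) from a 429 body."""
    quota_id = None
    quota_value = None
    retry_delay = None
    for detail in error_body["error"].get("details", []):
        kind = detail.get("@type", "")
        if kind.endswith("QuotaFailure"):
            # First violation carries the metric and its limit.
            violation = detail["violations"][0]
            quota_id = violation["quotaId"]
            quota_value = int(violation["quotaValue"])
        elif kind.endswith("RetryInfo"):
            # retryDelay arrives as a string like "3s".
            retry_delay = float(detail["retryDelay"].rstrip("s"))
    return quota_id, quota_value, retry_delay
```

Feeding it the body from the first error in this thread returns the per-minute input-token quota ID, the 250,000 limit, and the suggested 3-second delay.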

Just wondering, how are you building up the context? Is it through a multi-turn text conversation or are you filling the context by uploading large files or documents?

@GUNAND_MAYANGLAMBAM
I'm building projects through conversations with code, using another project of my own: GitHub - volotat/InsightCoder

It tries to fit all the code into the context, as well as a compressed history of the past conversation, so I know for sure that the model has everything I need in the current context. This gives me continuity and precise control over the generated code. There is no automatic applying of changes; everything goes through my eyes after careful consideration. This lets me avoid the main downfall of "vibe coding": error accumulation. It is actually quite common for the model to try to do something really undesirable, and this tool lets me keep it on the right track. For me this approach works exceptionally well.
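Since the quota in question is input tokens per minute, one workaround when packing a whole project plus history into the context is to trim the oldest history until the request fits the budget. A rough sketch, assuming a crude 4-characters-per-token estimate (for real counts the google-genai SDK's `count_tokens` method should be used instead):

```python
# Sketch: keep a conversation under a per-minute input-token budget by
# dropping the oldest messages first. The 4-chars-per-token ratio is a
# crude assumption, not the real tokenizer.

def trim_history(messages: list[str], budget_tokens: int) -> list[str]:
    """Drop the oldest messages until the estimated token count fits."""
    def estimate(text: str) -> int:
        return max(1, len(text) // 4)  # heuristic estimate only

    kept: list[str] = []
    total = 0
    for msg in reversed(messages):  # walk newest-first; recent turns matter most
        cost = estimate(msg)
        if total + cost > budget_tokens:
            break
        kept.append(msg)
        total += cost
    return list(reversed(kept))  # restore chronological order
```

This doesn't help if the codebase alone exceeds the quota, but it keeps long-running conversations from tipping a request over the limit.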

Hey, based on the JSON response body you shared, you are using 2.5 Flash, which has a TPM limit of 250,000. Please try 2.0 Flash, since it has a TPM limit of 1,000,000, and let us know if you are still facing the issue.
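Besides switching models, the `RetryInfo` detail in the error body suggests how long to wait before retrying. A minimal retry wrapper sketch; the `RateLimited` exception class here is an illustration stand-in, not the SDK's actual error type (the real one is `google.genai.errors.ClientError`, from which the delay would have to be parsed out of the response body):

```python
import time

# Illustrative exception carrying a parsed retry delay; an assumption for
# this sketch, not the google-genai SDK's real error class.
class RateLimited(Exception):
    def __init__(self, retry_delay: float):
        super().__init__(f"429, retry in {retry_delay}s")
        self.retry_delay = retry_delay

def call_with_retry(call, max_attempts: int = 3):
    """Invoke `call`, sleeping for the server-suggested delay on rate limits."""
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimited as err:
            if attempt == max_attempts - 1:
                raise  # out of attempts, surface the error
            time.sleep(err.retry_delay)
```

Note this only helps with transient per-minute throttling; if a single request's input already exceeds the per-minute token quota, retrying will fail every time.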

Hello. I am now suddenly encountering errors like this:

429 RESOURCE_EXHAUSTED. {'error': {'code': 429, 'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, head to: https://ai.google.dev/gemini-api/docs/rate-limits.', 'status': 'RESOURCE_EXHAUSTED', 'details': [{'@type': 'type.googleapis.com/google.rpc.QuotaFailure', 'violations': [{'quotaMetric': 'generativelanguage.googleapis.com/generate_content_free_tier_input_token_count', 'quotaId': 'GenerateContentInputTokensPerModelPerMinute-FreeTier', 'quotaDimensions': {'location': 'global', 'model': 'gemini-2.5-pro'}, 'quotaValue': '125000'}]}, {'@type': 'type.googleapis.com/google.rpc.Help', 'links': [{'description': 'Learn more about Gemini API quotas', 'url': 'https://ai.google.dev/gemini-api/docs/rate-limits'}]}, {'@type': 'type.googleapis.com/google.rpc.RetryInfo', 'retryDelay': '8s'}]}}

If I am understanding it right, the free tier now has even deeper cuts, down to a 125,000-token maximum? The Rate limits  |  Gemini API  |  Google AI for Developers page still shows 250k for all three 2.5 models: Pro, Flash, and Flash-Lite. Has it not been updated yet, or is this a bug on the provider's side?