In Gemini's rate limiting, are input tokens counted toward the TPM limit as soon as a request is initiated, or are they tallied asynchronously only after the request completes?

hh_yu · December 11, 2025, 5:33am

Documentation on Gemini’s rate limiting is scarce. I’m currently developing a rate limiter for my project, which integrates with numerous models. I need to know: for Gemini, are input tokens counted toward the TPM (tokens per minute) rate limit at the moment a request is initiated?
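
To make the question concrete, here is a minimal sketch of the conservative client-side approach: reserve the input tokens when the request starts, not when it completes. It is an illustration only and says nothing about how Gemini itself counts tokens. The `TpmLimiter` class, its method names, and the 60-second sliding window are all hypothetical choices, not part of any Gemini SDK.

```python
import time
from collections import deque


class TpmLimiter:
    """Hypothetical client-side sliding-window tokens-per-minute limiter.

    Assumption: tokens are charged at request initiation. Whether the
    server does the same is exactly the open question in this thread.
    """

    def __init__(self, tpm_limit, window_seconds=60.0, clock=time.monotonic):
        self.tpm_limit = tpm_limit
        self.window_seconds = window_seconds
        self.clock = clock              # injectable so the limiter can be tested
        self._events = deque()          # (timestamp, tokens) per reservation
        self._used = 0                  # tokens currently inside the window

    def _expire(self, now):
        # Drop reservations that have aged out of the sliding window.
        while self._events and now - self._events[0][0] >= self.window_seconds:
            _, tokens = self._events.popleft()
            self._used -= tokens

    def try_reserve(self, input_tokens):
        """Reserve tokens before sending; return False if over budget."""
        now = self.clock()
        self._expire(now)
        if self._used + input_tokens > self.tpm_limit:
            return False
        self._events.append((now, input_tokens))
        self._used += input_tokens
        return True

    def used(self):
        """Tokens currently counted against the window."""
        self._expire(self.clock())
        return self._used
```

If the server turns out to tally tokens only after completion, reserving up front is still safe; it just makes the client slightly more conservative than necessary.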

Nireeksha_K_A · December 26, 2025, 5:48am

Hi @hh_yu ,

Apologies for the delayed response. Could you please share whether you’ve noticed any behavior (for example, delayed 429s) that might indicate tokens are accounted for after completion?

Thanks
