Documentation on Gemini’s rate limiting is scarce. I’m currently developing a rate limiter for my project, which integrates with numerous models. I need to know: for Gemini, are input tokens counted toward the TPM (tokens per minute) rate limit at the moment a request is initiated?
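For context, here is a minimal sketch of the kind of client-side limiter the question is about. It assumes a 60-second sliding window and that input tokens are charged when the request starts, which is exactly the behavior that is unconfirmed for Gemini. `TpmLimiter`, `try_reserve`, and `reconcile` are hypothetical names for illustration, not part of any Gemini SDK.

```python
import time
from collections import deque


class TpmLimiter:
    """Client-side sliding-window token budget (hypothetical helper)."""

    def __init__(self, tpm_limit, window_s=60.0, clock=time.monotonic):
        self.tpm_limit = tpm_limit
        self.window_s = window_s
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self.events = deque()  # [timestamp, tokens] entries, oldest first

    def _used(self):
        # Drop entries that have aged out of the window, then sum the rest.
        cutoff = self.clock() - self.window_s
        while self.events and self.events[0][0] <= cutoff:
            self.events.popleft()
        return sum(tokens for _, tokens in self.events)

    def try_reserve(self, input_tokens):
        """Charge input tokens at request start (assumption).

        Returns the window entry on success, or None if the budget is exceeded.
        """
        if self._used() + input_tokens > self.tpm_limit:
            return None
        entry = [self.clock(), input_tokens]
        self.events.append(entry)
        return entry

    def reconcile(self, entry, actual_tokens):
        """After completion, replace the estimate with the usage the response reported."""
        entry[1] = actual_tokens
```

If the server only accounts for tokens after completion, the reservation in `try_reserve` would be more conservative than necessary; if it counts at initiation, skipping the reservation risks 429s under concurrency. That is why the answer matters for the design.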
Hi @hh_yu,
Apologies for the delayed response. Could you please share whether you’ve noticed any behavior (for example, delayed 429s) that might indicate tokens are accounted for after completion?
Thanks