Transparency regarding rate limits

Currently we can only see our current rate limits in AI Studio. Please disclose how rate limits are calculated, or at least the base or average rate limits for each usage tier. We need to know them to plan our projects: whether to use synchronous requests or queuing, what limits to set for our own users, how to price, or whether to use another service entirely. We don't want to spend time, money, and effort building a whole system only to find out the rate limits are too low to be usable.


Hi @nick5
Welcome to the Google AI Forum!

Thank you for your feedback on Gemini API rate limits and for taking the time to share your thoughts with us. We truly appreciate your input, as it helps us continuously improve the Gemini API experience.

Thanks!


Can you explain why the TPM limit isn't resetting after 1 minute? I keep hitting 429 errors even though I'm on Tier 1 and well below my RPM and RPD limits.

It looks like you're hitting the tokens-per-minute (TPM) limit before you get anywhere near your request limits (RPM/RPD). Since these limits work on a sliding 60-second window, a few large prompts in a row can quickly exhaust your quota.

A couple of ways to fix this:

  • Reduce the input prompt size where possible.

  • Add a small delay between requests to give your TPM quota a chance to reset.

  • Add basic retry logic with exponential backoff so your app pauses requests after hitting the quota.
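The last point can be sketched in a few lines. This is a minimal, client-agnostic example: `RateLimitError` is a stand-in for whatever exception your SDK raises on an HTTP 429, and the delay constants are illustrative, not official values.

```python
import random
import time


class RateLimitError(Exception):
    """Stand-in for an HTTP 429 response from the API client."""


def call_with_backoff(fn, max_retries=5, base_delay=1.0):
    """Call fn(), retrying with exponential backoff plus jitter on rate-limit errors."""
    for attempt in range(max_retries):
        try:
            return fn()
        except RateLimitError:
            # Give up once we have exhausted our retries.
            if attempt == max_retries - 1:
                raise
            # Wait base_delay * 2^attempt seconds, plus up to 1s of random
            # jitter so concurrent clients don't retry in lockstep.
            delay = base_delay * (2 ** attempt) + random.random()
            time.sleep(delay)
```

You would wrap each API call in `call_with_backoff(lambda: client.generate(...))`; after a few doubling waits, the sliding 60-second TPM window has usually moved on enough for the request to succeed.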

Yeah, the input was around 25k tokens when the limit hit. I left it for 2 hours and got the same thing; I'm going to wait a couple more hours and see.

It was an agent running, so that makes sense, and honestly its token accounting isn't great (Cline).

We'll see what happens.