Gemini 2.0 flash multimodal rate limits

batman2212 · December 13, 2024, 2:49pm

Hello Gemini team,

I want to use the Gemini 2.0 Flash Multimodal Live API in production and launch my SaaS, but I came across these two limitations:

Rate limits
The following rate limits apply:
3 concurrent sessions per API key

and

Maximum session duration
Session duration is limited to up to 15 minutes for audio. When the session duration exceeds the limit, the connection is terminated.

Source: Multimodal Live API | Gemini API | Google AI for Developers

These are very restrictive for a production launch, especially the limit of 3 concurrent sessions. Do you have a higher plan I can upgrade to, or is a solution planned for this?

Thanks!

OrangiaNebula · December 13, 2024, 3:19pm

Welcome to the forum. The full name of the model is Gemini 2.0 Flash Experimental. That last word means it is not intended for production, and by clicking the “I agree” button you agreed not to use it in production.

It will eventually get promoted to non-experimental status. Then you can go ahead and use it.

Hope that helps.

SamRahimi420 · December 18, 2024, 8:56pm

If you’re working with Node.js, I made a little “Key Mixer” library specifically for extending Gemini rate limits. It works as a substitute for normal .env environment variables and rotates multiple keys on a round-robin basis.

Essentially, what you do is:

obtain several API keys, one for each of your Google accounts
add them to a keystore.json file (see the example in the npm docs)
npm install key-mixer and then use the package as per the docs… each time you get the key for a particular service (such as Gemini), it will give you a different key from your keystore…

The result? Let’s say you have 5 Gmail accounts and get a free API key for each in AI Studio. By using the key mixer, your Multimodal Live rate limits are increased 5x, so instead of 3 concurrent conversations you can now have 15.
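For anyone who wants to see the idea in isolation, here is a minimal sketch of round-robin rotation in plain Node.js. It is not the key-mixer package's API; the class name, method name, and placeholder key strings are all made up for illustration, and the per-key limit of 3 is simply the figure quoted earlier in this thread.

```javascript
// Minimal round-robin key rotation sketch.
// All names here are hypothetical and are NOT the key-mixer package's API.
class RoundRobinKeys {
  constructor(keys) {
    if (!Array.isArray(keys) || keys.length === 0) {
      throw new Error("at least one API key is required");
    }
    this.keys = [...keys];
    this.next = 0; // index of the key to hand out next
  }

  // Return the next key in order, wrapping around at the end of the list.
  getKey() {
    const key = this.keys[this.next];
    this.next = (this.next + 1) % this.keys.length;
    return key;
  }
}

// Placeholder strings stand in for real keys loaded from a keystore file.
const rotation = new RoundRobinKeys(["key-A", "key-B", "key-C", "key-D", "key-E"]);

const SESSIONS_PER_KEY = 3; // concurrent-session limit quoted earlier in the thread
const maxConcurrent = rotation.keys.length * SESSIONS_PER_KEY; // 5 keys * 3 sessions
```

Each call to `rotation.getKey()` hands back the next key in the list and starts over after the last one, which is where the 5 x 3 = 15 figure comes from.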

Note: I have only tested with small numbers of keys, 5 or fewer. If you require, say, 150 concurrent connections, DO NOT simply rotate 50 Gemini keys and expect it to work for more than a short time. You would also need your own infrastructure to spread those connections across different servers with different IP addresses, so that they do not trip any automated security mechanisms. You could also originate the connections from the user’s browser, client side, which would avoid this issue altogether if it suits your use case. Just be prepared to invalidate keys frequently, because they won’t be secure and others will start using them.
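Plain rotation does not know how many sessions are currently open on each key. As a rough sketch only, a small tracker like the one below could hand out the least-loaded key and refuse once every key is at the quoted limit of 3 concurrent sessions. The `SessionPool` class and its methods are hypothetical and not part of key-mixer or any Google SDK.

```javascript
// Hypothetical per-key session tracker; not part of any real library.
const MAX_SESSIONS_PER_KEY = 3; // limit quoted from the Multimodal Live API docs

class SessionPool {
  constructor(keys) {
    // Map each key to the number of sessions currently open on it.
    this.active = new Map(keys.map((k) => [k, 0]));
  }

  // Pick the least-loaded key that still has room, or null if all are full.
  acquire() {
    let best = null;
    for (const [key, count] of this.active) {
      const hasRoom = count < MAX_SESSIONS_PER_KEY;
      if (hasRoom && (best === null || count < this.active.get(best))) {
        best = key;
      }
    }
    if (best !== null) {
      this.active.set(best, this.active.get(best) + 1);
    }
    return best;
  }

  // Call when a session closes so the slot can be reused.
  release(key) {
    const count = this.active.get(key) ?? 0;
    if (count > 0) {
      this.active.set(key, count - 1);
    }
  }
}
```

The caller would `acquire()` a key before opening a session and `release(key)` when it ends; a `null` result means every key is already at its limit and the new session has to wait.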

Alibou99 · December 19, 2024, 1:07pm

Thank you for sharing this brilliant solution and your Key Mixer library! It’s a smart and practical approach to extend API rate limits. I really appreciate you taking the time to share it with the community.
