Scaling Gemini Flash

Jason_C · August 11, 2024, 10:06pm

Flash is a very generous model. My question is, how can we scale this? I haven’t yet exhausted the API’s throughput limit of 1000 invocations per minute, but assuming I get past that point, what are the options for doing more than 1000/min?

OrangiaNebula · August 11, 2024, 10:18pm

That’s where the two asterisks come into play on the pricing page (Gemini API pricing | Google AI for Developers).
You talk to the sales people, and they increase your limits. Usually, a product at launch will not immediately reach that level of demand, so this is a very good problem to have (it means the product is successful).
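
Until a higher limit is granted, one way to stay under the cap is to throttle on the client side. Below is a minimal sketch of a sliding-window limiter, assuming the 1000 invocations per minute figure mentioned above. The class name `SlidingWindowLimiter` and its parameters are made up for illustration and are not part of any Gemini SDK.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `max_calls` calls in any rolling `period` seconds.

    Hypothetical helper for illustration; not part of the Gemini API.
    Not thread-safe: guard acquire() with a lock if used from several threads.
    """

    def __init__(self, max_calls=1000, period=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self._clock = clock      # injectable so the limiter can be tested
        self._sleep = sleep
        self._stamps = deque()   # timestamps of calls inside the window

    def acquire(self):
        """Block until a call is allowed, then record it."""
        while True:
            now = self._clock()
            # Drop timestamps that have aged out of the window.
            while self._stamps and now - self._stamps[0] >= self.period:
                self._stamps.popleft()
            if len(self._stamps) < self.max_calls:
                self._stamps.append(now)
                return
            # Wait until the oldest recorded call leaves the window.
            self._sleep(self.period - (now - self._stamps[0]))
```

You would call `limiter.acquire()` right before each request to the API. This only smooths out your own traffic. It does not raise the quota, so anything beyond 1000/min still needs the limit increase.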

Hope that helps!

Logan_Kilpatrick · August 12, 2024, 1:20pm

We are also working to raise the limits!
