Scaling Gemini Flash

Flash is a very generous model. My question is: how can we scale this? I haven’t yet exhausted the API's throughput limit, which is 1000 invocations per minute. Assuming I get past that point, what are the options for doing more than 1000/min?


That’s where the two asterisks on the pricing page come into play: Gemini API Pricing  |  Google AI for Developers.
You talk to the sales team, and they increase your limits. A product at launch usually does not reach that level of demand right away, so this is a very good problem to have (it means the product is successful).
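
Until the limit is raised, one option is to throttle on the client side so you never exceed it. Below is a minimal sketch of a sliding-window limiter, assuming the 1000 calls per minute figure from the question. The class name `SlidingWindowLimiter` and its parameters are made up for illustration and are not part of any Gemini SDK; the actual quota is enforced server-side and may be counted differently.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `limit` calls in any `window`-second span (client-side only)."""

    def __init__(self, limit=1000, window=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        # limit=1000 and window=60.0 are assumptions taken from the question.
        self.limit = limit
        self.window = window
        self.clock = clock      # injectable so the limiter can be tested
        self.sleep = sleep
        self.calls = deque()    # timestamps of calls still inside the window

    def acquire(self):
        """Block until a slot is free, record the call, return seconds waited."""
        waited = 0.0
        while True:
            now = self.clock()
            # Drop timestamps that have aged out of the window.
            while self.calls and now - self.calls[0] >= self.window:
                self.calls.popleft()
            if len(self.calls) < self.limit:
                self.calls.append(now)
                return waited
            # Window is full: wait until the oldest call expires.
            delay = self.window - (now - self.calls[0])
            self.sleep(delay)
            waited += delay
```

You would create one limiter and call `limiter.acquire()` right before each API request. This only keeps a single process under the limit; if several workers share one key, they would need a shared counter instead.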

Hope that helps!


We are also working to raise the limits!
