Flash is a very generous model. My question is, how can we scale this? I haven’t yet exhausted the API’s throughput limit of 1,000 requests per minute. Assuming I get to that point, what are the options for doing more than 1,000/min?
That’s where the two asterisks come into play on the Gemini API Pricing | Google AI for Developers page.
You talk to the sales team, and they increase your limits. A product at launch usually doesn’t reach that level of demand right away, so this is a very good problem to have (it means the product is successful).
Hope that helps!
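In the meantime, if you want to stay just under the current cap on the client side, something like the sketch below might help. It is only an illustration: the class name `SlidingWindowLimiter` and its parameters are made up for this post and are not part of any Google SDK, the 1,000-per-minute default is simply the number quoted above, and it is not thread-safe as written.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `max_calls` calls in any `period`-second window.

    Hypothetical helper for illustration; not part of any official SDK.
    """

    def __init__(self, max_calls=1000, period=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self._clock = clock      # injectable so the limiter can be tested
        self._sleep = sleep
        self._stamps = deque()   # timestamps of calls still inside the window

    def acquire(self):
        """Block until one more call fits in the window, then record it."""
        while True:
            now = self._clock()
            # Drop timestamps that have aged out of the window.
            while self._stamps and now - self._stamps[0] >= self.period:
                self._stamps.popleft()
            if len(self._stamps) < self.max_calls:
                self._stamps.append(now)
                return
            # Window is full: wait until the oldest call leaves it.
            self._sleep(self.period - (now - self._stamps[0]))
```

You would call `limiter.acquire()` right before each API request. This only smooths your own traffic so you don't hit the quota error; it doesn't raise the limit itself.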
We are also working to raise the limits!