5 RPM - Will that be increased in future?

Just 5 requests per minute is very little. Will this increase in the near future?

1 Like

Hi Ana, thanks for your post!

Which model are you referring to? Gemini 1.5 Pro has 2 RPM at the moment and Gemini 1.0 Pro has 15 RPM.

We have a pay-as-you-go option launching very soon that provides higher RPMs :slight_smile:

5 Likes

I am a bit confused about the rate limits too. As far as I can see, the limits for the latest Gemini models are:

Free tier: 2 RPM/50 RPD
Pay-as-you-go: 10 RPM/2000 RPD

The same site also states that you can request higher rate limits, but by how much?

It also mentions that I can migrate to the Vertex AI platform on Google Cloud, which may offer higher limits. But according to the following table, the rate limits are the same:

| Features | Google AI Gemini API | Google Cloud Vertex AI Gemini API |
| --- | --- | --- |
| Latest Gemini models | Gemini Pro and Gemini Ultra | Gemini Pro and Gemini Ultra |
| Sign up | Google account | Google Cloud account (with terms agreement and billing) |
| Authentication | API key | Google Cloud service account |
| User interface playground | Google AI Studio | Vertex AI Studio |
| API & SDK | Python, Node.js, Android (Kotlin/Java), Swift, Go | SDK supports Python, Node.js, Java, Go |
| Free tier | Yes | $300 Google Cloud credit for new users |
| Quota (requests per minute) | 60 (can request increase) | Increase upon request (default: 60) |
| Enterprise support | No | Customer encryption key, Virtual private cloud, Data residency, Access transparency, Scalable infrastructure for application hosting, Databases and data storage |
| MLOps | No | Full MLOps on Vertex AI (Examples: model evaluation, Model Monitoring, Model Registry) |

(Ref: link)

2 Likes

If I’m not mistaken, until yesterday it was even less: 5 RPM on pay-as-you-go.

Another thing: the release date for pay-as-you-go access was May 2nd, and now it is May 14th.
In a production application this limit would certainly be a problem.
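
For anyone who has to live within a per-minute cap in the meantime, here is a minimal sketch of a client-side throttle. It is not part of any Gemini SDK. `SlidingWindowLimiter` and `acquire` are names made up for this example, and the limit you pass in is whatever your tier currently allows. The idea is to call `acquire()` right before each API request so the client waits instead of hitting the limit.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Hypothetical helper: allow at most `max_calls` calls in any `period`-second window."""

    def __init__(self, max_calls, period=60.0, clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self.clock = clock    # injectable so the limiter can be tested without real waiting
        self.sleep = sleep
        self.calls = deque()  # timestamps of calls still inside the window

    def acquire(self):
        """Block until another call is allowed, then record it."""
        while True:
            now = self.clock()
            # Forget calls that have already left the window.
            while self.calls and now - self.calls[0] >= self.period:
                self.calls.popleft()
            if len(self.calls) < self.max_calls:
                self.calls.append(now)
                return
            # Window is full: wait until the oldest call expires, then re-check.
            self.sleep(self.period - (now - self.calls[0]))


# Example: stay under 10 requests per minute (the pay-as-you-go figure quoted above).
limiter = SlidingWindowLimiter(max_calls=10, period=60.0)
limiter.acquire()  # call this immediately before each API request
```

This sketch is single-threaded. A multi-threaded or multi-process application would need a lock or a shared counter, and it only smooths traffic on the client side without raising the quota itself.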

1 Like

I’m planning to use pay-as-you-go, but 10 RPM would not be enough for my application, because for each user request I need to make 2 requests to Gemini to refine the response. That works out to a maximum of 5 user requests per minute.
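
To spell out the arithmetic, here is a small sketch using the pay-as-you-go figures quoted earlier in this thread (10 RPM, 2000 RPD) and 2 Gemini calls per user request. The function name is made up for this example, and the limits may well change.

```python
def user_requests_supported(api_limit, calls_per_user_request):
    """How many whole user requests fit inside an API call limit."""
    return api_limit // calls_per_user_request


CALLS_PER_USER_REQUEST = 2  # one call for the draft, one to refine it

per_minute = user_requests_supported(10, CALLS_PER_USER_REQUEST)
per_day = user_requests_supported(2000, CALLS_PER_USER_REQUEST)

print(per_minute)  # 5
print(per_day)     # 1000
```

So the cap is on throughput (5 user requests per minute, 1000 per day) rather than on the number of users who are logged in at the same time.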

1 Like