Gemini-1.5-pro-002 quotas lower than 001

seidtgeist · September 25, 2024, 1:46pm

Edit: gemini-1.5-pro-002 does work in europe-west3 now, but:

My quota for gemini-1.5-pro in europe-west3 is 50 requests / minute.
With gemini-1.5-pro-001, I can actually reach that rate.
With gemini-1.5-pro-002, however, I get closer to 1 request / minute.

The billing account is active but still has free usage credits. Could this be the reason for the lower limits with 002?

Apart from that, where can I see the quota and usage?

Original post:

I can’t use gemini-1.5-pro-002 in europe-west3, but it works in all other regions.

This code fails:

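import vertexai
import vertexai.generative_models as genai

# Select the region explicitly; the project is taken from the environment.
vertexai.init(location="europe-west3")

model = genai.GenerativeModel("gemini-1.5-pro-002")

model.count_tokens('test')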

with this exception:

_InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
	status = StatusCode.FAILED_PRECONDITION
	details = "Project `********` is not allowed to use Publisher Model `projects/horama-dev/locations/europe-west3/publishers/google/models/gemini-1.5-pro-002`"
	debug_error_string = "UNKNOWN:Error received from peer ipv4:********:443 {created_time:"2024-09-25T15:43:53.563304941+02:00", grpc_status:9, grpc_message:"Project `********` is not allowed to use Publisher Model `projects/********/locations/europe-west3/publishers/google/models/gemini-1.5-pro-002`"}"

I’ve tried using the same model from different projects/accounts/billing accounts, but I’m getting the same error. All other regions seem to work. What might be the issue here?

seidtgeist · September 25, 2024, 8:19pm

Okay, this resolved itself. However, now I’m getting 429 errors when performing more than 5 requests per minute:

ResourceExhausted: 429 Online prediction request quota exceeded for gemini-1.5-pro. Please try again later with backoff.

This only happens with gemini-1.5-pro-002. I can’t find a service/consumer/… quota that applies. Any leads?
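
Since the 429 message itself asks for retries "with backoff", here is a minimal sketch of exponential backoff with jitter. It does not raise the quota; it only makes a client tolerate the limit. The helper name `call_with_backoff` and all of its parameters are made up for illustration and are not part of the Vertex AI SDK. The exception type to retry on is passed in, so the sketch runs without any Google libraries installed.

```python
import random
import time


def call_with_backoff(fn, retry_on, max_attempts=5, base_delay=1.0,
                      max_delay=60.0, sleep=time.sleep):
    """Call fn(); if it raises retry_on, wait and try again.

    The delay doubles on every attempt (capped at max_delay) and is
    scaled by a random factor in [0.5, 1.0] so that parallel clients
    do not all retry at the same moment.
    """
    for attempt in range(max_attempts):
        try:
            return fn()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the original error
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay * random.uniform(0.5, 1.0))


# Hypothetical usage with the model from the first post (assumes the
# google-api-core package, which provides the ResourceExhausted class
# seen in the error above):
#
#   from google.api_core.exceptions import ResourceExhausted
#   call_with_backoff(lambda: model.count_tokens("test"), ResourceExhausted)
```

With the defaults, the waits before retries are roughly 1, 2, 4 and 8 seconds, so a limit that resets every minute may need a larger `base_delay` or more attempts.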

klinok64 · September 26, 2024, 12:18am

Here are the limits for free users. However, it is not clear to me exactly how these limits work. This afternoon, at approximately 12:30 (UTC+4), I was chatting with the model without any problems, but now, at night, I periodically get the error “You reached your rate limit”. I use AI Studio rather than Vertex AI, but the limits should in fact be the same for all free users.

Jay · September 26, 2024, 10:54pm

Vitaliy_Mysnyk · October 4, 2024, 9:27am

chihiro_koexuka · October 11, 2024, 8:48am

I have the same issue.
I’m using “gemini-1.5-flash-002” on Vertex AI.
If I send consecutive requests (5-10 requests per minute), I get the following error:

{
  "error": {
    "code": 429,
    "message": "Online prediction request quota exceeded for gemini-1.5-flash. Please try again later with backoff.",
    "status": "RESOURCE_EXHAUSTED"
  }
}

I’ve used up my free Google Cloud credits.
How can I fix this?
Thanks.

srellm4 · November 7, 2024, 9:41am

I’m facing the same issue, but I still have credits in my account:

Online prediction request quota exceeded for gemini-1.5-pro

stefan.adelbert · November 19, 2024, 6:29am

I’m seeing the same problem since moving from 001 to 002 recently (a week ago). I’m seeing something like 10 requests / min on 002 before getting 429’d. Certainly my increased quota of 30 / min is not being applied to 002 usage.
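
Until the effective limit for 002 is clarified, one possible workaround is to pace requests on the client so they stay under whatever rate is actually observed. The following is a minimal sketch, assuming a single-threaded client; the class name `MinIntervalThrottle` and its parameters are hypothetical, and the value 10 is taken only from the rate reported in this post.

```python
import time


class MinIntervalThrottle:
    """Space calls at least 60 / per_minute seconds apart."""

    def __init__(self, per_minute, clock=time.monotonic, sleep=time.sleep):
        self.interval = 60.0 / per_minute
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None  # no call has been made yet

    def wait(self):
        """Block until the next request may be sent."""
        now = self._clock()
        if self._next_allowed is not None and now < self._next_allowed:
            self._sleep(self._next_allowed - now)
            now = self._next_allowed
        self._next_allowed = now + self.interval


# Hypothetical usage: call throttle.wait() before every request.
#
#   throttle = MinIntervalThrottle(per_minute=10)
#   for prompt in prompts:
#       throttle.wait()
#       model.generate_content(prompt)
```

This spreads requests evenly instead of sending them in bursts, which may or may not match how the server-side quota is counted.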
