Edit: gemini-1.5-pro-002 does work in europe-west3 now, but:
My quota for gemini-1.5-pro in europe-west3 is 50 requests / minute.
For gemini-1.5-pro-001, I can actually reach that limit.
However, for gemini-1.5-pro-002 it’s closer to 1 request / minute.
The billing account is active but still has free usage credits. Could this be the reason for the lower limits with 002?
Apart from that, where can I see the quota and usage?
Original post:
I can’t use gemini-1.5-pro-002 in europe-west3, but it works in all other regions.
This code fails:
import vertexai.generative_models as genai
model = genai.GenerativeModel("gemini-1.5-pro-002")
model.count_tokens('test')
with this exception:
_InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
status = StatusCode.FAILED_PRECONDITION
details = "Project `********` is not allowed to use Publisher Model `projects/horama-dev/locations/europe-west3/publishers/google/models/gemini-1.5-pro-002`"
debug_error_string = "UNKNOWN:Error received from peer ipv4:********:443 {created_time:"2024-09-25T15:43:53.563304941+02:00", grpc_status:9, grpc_message:"Project `********` is not allowed to use Publisher Model `projects/********/locations/europe-west3/publishers/google/models/gemini-1.5-pro-002`"}"
I’ve tried using the same model from different projects/accounts/billing accounts, but I’m getting the same error. All other regions seem to work. What might be the issue here?
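One way to narrow this down is to repeat the same count_tokens call once per region and record which ones fail. The following is only a sketch: probe_regions and count_tokens_probe are hypothetical helper names, the region list is just an example, and it assumes the project is picked up from the default credentials when vertexai.init is called with only a location.

```python
def probe_regions(regions, probe):
    """Run probe(region) for each region.

    Returns a dict mapping region -> None on success,
    or a short error description if the probe raised.
    """
    results = {}
    for region in regions:
        try:
            probe(region)
            results[region] = None
        except Exception as exc:  # keep going so every region gets checked
            results[region] = f"{type(exc).__name__}: {exc}"
    return results


def count_tokens_probe(region, model_name="gemini-1.5-pro-002"):
    """Hypothetical probe: the count_tokens call from the post, pinned to one region."""
    # Imported lazily so probe_regions can be used without the SDK installed.
    import vertexai
    from vertexai.generative_models import GenerativeModel

    # Assumption: the project comes from the environment's default credentials.
    vertexai.init(location=region)
    GenerativeModel(model_name).count_tokens("test")


if __name__ == "__main__":
    # Example regions only; replace with the ones you care about.
    for region, error in probe_regions(
        ["europe-west3", "europe-west1", "us-central1"], count_tokens_probe
    ).items():
        print(region, "OK" if error is None else error)
```

If only europe-west3 reports FAILED_PRECONDITION while the others succeed, that points to a regional availability issue rather than a project or billing setting.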
Here are the limits for free users. However, it is not clear to me exactly how these limits work. This afternoon, at approximately 12:30 (UTC+4), I was chatting with the model without any problems, but now, at night, I periodically get the error “You reached your rate limit”. I use AI Studio rather than Vertex AI, but the limits should be the same for all free users.
I have the same issue.
I’m using “gemini-1.5-flash-002” on Vertex AI.
If I send consecutive requests (5-10 per minute), I get the following error:
{
  "error": {
    "code": 429,
    "message": "Online prediction request quota exceeded for gemini-1.5-flash. Please try again later with backoff.",
    "status": "RESOURCE_EXHAUSTED"
  }
}
I’ve used up my free Google Cloud credits.
How can I fix this?
Thanks.
I’ve been seeing the same problem since moving from 001 to 002 about a week ago. I get roughly 10 requests / min on 002 before getting 429’d. My increased quota of 30 / min is clearly not being applied to 002 usage.
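The 429 message quoted above asks callers to "try again later with backoff". While the quota question is unresolved, a client-side retry with exponential backoff and jitter can at least smooth over the errors. This is a minimal sketch, not an official recommendation: call_with_backoff and its parameters are hypothetical names, and the delays are arbitrary starting values.

```python
import random
import time


def call_with_backoff(fn, is_retryable, max_attempts=5,
                      base_delay=1.0, max_delay=60.0, sleep=time.sleep):
    """Call fn() and retry on retryable errors with exponential backoff.

    fn           -- zero-argument callable performing the request
    is_retryable -- predicate deciding whether an exception should be retried
    sleep        -- injectable for testing; defaults to time.sleep
    """
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception as exc:
            # Give up on non-retryable errors or once attempts are exhausted.
            if not is_retryable(exc) or attempt == max_attempts - 1:
                raise
            # 1s, 2s, 4s, ... capped at max_delay, plus up to 50% random jitter.
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay + random.uniform(0, delay / 2))


# Usage with Vertex AI, assuming the SDK surfaces the 429 as
# google.api_core.exceptions.ResourceExhausted:
#
#   from google.api_core.exceptions import ResourceExhausted
#   response = call_with_backoff(
#       lambda: model.generate_content("test"),
#       is_retryable=lambda exc: isinstance(exc, ResourceExhausted),
#   )
```

Backoff only spreads requests out over time. It does not raise the effective limit, so if 002 really is capped far below the configured quota, that still has to be resolved on the quota side.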