For the past few days, I’m always getting 429 errors returned.
“code”: 429,
“message”: “Quota exceeded for quota metric ‘Generate Content API requests per minute’ and limit ‘GenerateContent request limit per minute for a region’ of service ‘generativelanguage.googleapis.com’ for consumer ‘project_number:xyz’.”,
But when I go look at my usage details, it’s showing next-to-no traffic, just the 429s.
I have no idea where to go from here aside from making a new account. Anyone have any ideas?
Your traffic by response code will not be a good indicator of your quota limits being reached at the API level. You will want to look specifically at your Gemini quotas. In google cloud console, select the project you have tied to the gemini API key, go to IAM & Admin, click on quotas & system limits, and once on this page, use the filter to enter in the word gemini. You will see your specific limits and usage. Note that the dimensions column will display the specific model for which those limits occur (so you will see every single model Google has, what the limits are for that model, and your usage of those limits). Next to the entry that is for the model you are using, is a little graph icon that allows you to see historical usage charts. Click on this and you should see something like the following:
Hey, I think I’m experiencing the same issue, since today. As you can see, my peak usage is lower than 1%. I can easily reproduce the error by sending 2 consecutive requests containing an image, and a prompt to extract JSON from the image.
Can you post the error code you are receiving back from the API call that’s failing? The following is a good resource on understanding the return errors but would be very strange if you are getting the same errors as OP under such little utilization.
That’s very strange. Are you certain this is a paid account? That’s very low usage for a paid account but would certainly be happening for a free account. I would just triple check this. Does it happen consistently or just randomly. I suspect Google is seeing your API requests as a free account tier with the low rate limits. Might need to reach out to them to verify and switch if you’re certain it’s a paid and you have a working CC in the billing section for this project in GCP.
As you can see the usage on this screenshot appears under paid tier one, and i didn’t see any usage for the free charts. Also I’ve been using those API keys both locally and in prod for months at this point. These issues started appearing on 2 unrelated GCP projects today.
Also, you mention that the screenshot says it’s the paid tier one. Your screenshot shows it’s using flash-1.5. flash 1.5 has a free tier and a paid tier. So just seeing flash 1.5 there doesn’t mean it’s paid. You will know if it’s paid by going to ai studio, click settings then click plan information. You will see something like this screenshot.
We encountered the same issue. Our workaround involves using different prompts—placing some text in the system prompt and some in the user prompt. I’m not sure if this solution will work for every error case, but it’s definitely worth trying.
I am also having an issue with this, using flash-2.0 on paid. It says peak usage is about 30/2000 request and 600k/4m tokens. I’m certain its a paid account, as I can see a of couple cents being spent. Getting 429 all over.
Yes, we got the same error too. We are using Google’s Gemini API through their OpenAI compatible endpoint, but has been getting 429 errors even if we send just 10 concurrent completions request. 100% sure our API keys and the gcp project associated with it has billing activate.
Same here. I’m getting a lot of Error 429 with model gemini-2.0-flash-thinking-exp-01-21 but googles dashboard shows my API key as getting a lot of error 503