I’ve been using the gemini-2.0-pro-exp
API for several months with no problems.
It’s the “Gemini AI” module of Make, so I’m not sure exactly which endpoint it calls, but the request involves a Role of “User” and Parts with a Message Type of “Text” and a “Text” value, like a chat completion.
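For context, my best guess at what that maps to under the hood is a standard generateContent call to the Generative Language API; the endpoint, field names, and placeholder key below are assumptions on my part based on the public Gemini API docs, since Make doesn’t expose the raw request:

```python
# Sketch of what I believe Make's "Gemini AI" module sends (assumed, not confirmed):
# a generateContent request with one user message containing a single text part.
import requests

API_KEY = "YOUR_GEMINI_API_KEY"  # placeholder key
MODEL = "gemini-2.0-pro-exp"     # the model the scenario has been using

url = (
    "https://generativelanguage.googleapis.com/v1beta/"
    f"models/{MODEL}:generateContent?key={API_KEY}"
)
body = {
    "contents": [
        {
            "role": "user",                        # Role: User
            "parts": [{"text": "Hello, Gemini."}]  # Message Type "Text" with a "Text" value
        }
    ]
}

resp = requests.post(url, json=body, timeout=60)
print(resp.status_code)
print(resp.text)
```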
Today, I suddenly started getting:
[429] You exceeded your current quota. Please migrate to Gemini 2.5 Pro Preview (models/gemini-2.5-pro-preview-03-25) for higher quota limits. For more information on this error, head to: https://ai.google.dev/gemini-api/docs/rate-limits.
Even after switching it to gemini-2.5-pro-preview-03-25 as advised, I continue to get exactly the same 429 error message.
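To check whether this is a Make issue or tied to the API key itself, I can point the same call at the preview model directly; this is just a sketch (same assumed endpoint as above), and printing the full error body should show which quota metric the 429 actually refers to:

```python
# Same assumed generateContent endpoint, but targeting the preview model the
# error message recommends, and dumping the 429 error details if it fails.
import json
import requests

API_KEY = "YOUR_GEMINI_API_KEY"  # placeholder key
MODEL = "gemini-2.5-pro-preview-03-25"

url = (
    "https://generativelanguage.googleapis.com/v1beta/"
    f"models/{MODEL}:generateContent?key={API_KEY}"
)
body = {"contents": [{"role": "user", "parts": [{"text": "Quick quota test."}]}]}

resp = requests.post(url, json=body, timeout=60)
if resp.status_code == 429:
    # In my experience with Google APIs, the error body usually names the exact
    # quota metric and model that were exhausted, which the Make module hides.
    print(json.dumps(resp.json(), indent=2))
else:
    print(resp.status_code, resp.text[:500])
```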
In Google Cloud Console, where should I check?
In https://console.cloud.google.com/iam-admin/quotas, for the project "Generative Language Client", I see a range of duplicate entries for Generative Language API and Gemini for Google Cloud API across different regions, but none of them shows high usage; the numbers are all tiny.
ChatGPT tells me:
- “Google has begun strictly enforcing usage limits on older Gemini models, such as gemini-2.0-pro. Even minimal usage can now trigger quota errors.”
- “Some users report that requests made to newer models like gemini-2.5-pro-exp-03-25 are being incorrectly attributed to older models in the usage dashboard, leading to premature quota exhaustion.”
- “The token-per-minute (TPM) limits for certain models have been lowered. For instance, gemini-2.5-pro-exp-03-25 previously allowed 1 million TPM but has been reduced to 250k TPM, causing 429 errors for larger payloads.”
What’s going on, please?