I haven’t been able to use Gemini for a few days; previously, there were no quota issues at all. I am currently in Tier 1, and it is giving me these error messages. I checked the available quotas, and they are nowhere near full. [429 Too Many Requests] Resource exhausted. Please try again later.
Following, also getting a lot of 429 errors, not based on usage
I’m also getting a bunch of 429s even though I’m well under the limits. I have a Tier 1 project.
It seems Google has restricted the use of Gemini 2.0 Flash for previously created “open” API keys. Therefore, to use Gemini with my API key created on January 19th, that API key would need to be bound to a service account, for which there is no longer an option in the console. However, this model cannot be used with a new key. Solution: A new key must be created in AI Studio, and this new key can be used with the 2.5 Flash model. This model operates with stricter limits than 2.0 Flash, but at least it works.
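For anyone who wants to check a freshly created AI Studio key against 2.5 Flash without an SDK in the way, here is a minimal sketch using only the Python standard library. It assumes the public REST `generateContent` endpoint and the `x-goog-api-key` header; the function names `build_request` and `send` are just illustrative, not part of any library.

```python
import json
import urllib.request

API_ROOT = "https://generativelanguage.googleapis.com/v1beta/models"


def build_request(api_key, prompt, model="gemini-2.5-flash"):
    """Build (but do not send) a generateContent request for the given model."""
    body = {"contents": [{"parts": [{"text": prompt}]}]}
    return urllib.request.Request(
        f"{API_ROOT}/{model}:generateContent",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-goog-api-key": api_key},
        method="POST",
    )


def send(request, timeout=30):
    """Send the request and return the text of the first candidate.

    urllib raises urllib.error.HTTPError on 4xx/5xx responses,
    which includes the 429 and 503 cases discussed in this thread.
    """
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        payload = json.load(resp)
    return payload["candidates"][0]["content"]["parts"][0]["text"]
```

Calling `send(build_request(my_key, "ping"))` with the old key and then the new one should show quickly whether the key itself is the problem: an `HTTPError` carries the status in its `.code` attribute.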
Getting 429 resource exhausted errors for Gemini 3.1 Pro for the past week
503 - with priority inference optimization.
Gemini has `Priority inference` - https://ai.google.dev/gemini-api/docs/priority-inference#how-to-use
We recently enabled `priority` for all of our AI usage.
Sadly, I can't get it to perform; we still get:
```
503 UNAVAILABLE. {'error': {'code': 503, 'message': 'This model is currently experiencing high demand. Spikes in demand are usually temporary. Please try again later.', 'status': 'UNAVAILABLE'}}
```
The service_tier flag is set and the response acknowledges the setting - we are running `Tier 2` on the API key:
> Priority inference is available to Tier 2 - Tier 3 users across the GenerateContent API and Interactions API endpoints.
And still nothing.
I know that it's not 100% guaranteed to give a response at peak times, but in the few weeks it has been out it has been completely useless: it never helps avoid the 503s or speed up response times like they promise.
Am I possibly missing something?
Did anyone else use this and actually get something out of it?
Git issue - GenAI service_tier - is it even working?? · Issue #2448 · googleapis/python-genai · GitHub
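One way to put numbers on "is it even working" is to run the same request with and without the priority setting and compare the 503 rate and latency. Below is a rough sketch of such a harness. It deliberately takes a plain callable, so it makes no claims about how the setting is passed to the SDK. The name `measure` is made up, and the default check assumes the client's exception exposes the HTTP status as a `.code` attribute, so adjust `is_overloaded` if your client reports it differently.

```python
import statistics
import time


def measure(call, attempts=20, pause=0.0,
            is_overloaded=lambda exc: getattr(exc, "code", None) == 503):
    """Call `call()` repeatedly.

    Returns (overload_rate, median_latency_seconds_of_successful_calls).
    Exceptions that are not classified as overload are re-raised.
    """
    latencies = []
    overloaded = 0
    for _ in range(attempts):
        start = time.perf_counter()
        try:
            call()
        except Exception as exc:
            if not is_overloaded(exc):
                raise
            overloaded += 1
        else:
            latencies.append(time.perf_counter() - start)
        time.sleep(pause)
    median = statistics.median(latencies) if latencies else None
    return overloaded / attempts, median
```

Run it once with a callable that sends the request with the priority setting and once with a callable that sends it without, ideally interleaved over the same time window. If the two overload rates are indistinguishable, that is concrete data to attach to the GitHub issue.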
Maybe version 3.1 pro doesn’t work, but 3.0 pro does
Http response at 400 or 500 level, error; http status code: 500. Why is it like this? And why is the number of tokens in red if I have a paid Tier 1?
As an Android developer actively working on my projects, the new ‘compute credit’ limit management model and the 5-hour restrictions are resulting in a significant degradation of productivity.
Not only is access becoming erratic, but the logical quality in code generation and debugging has decreased, forcing me to spend more time correcting errors introduced by the model itself. The reality is that, in this state, Gemini’s efficiency is not up to the standards of a professional development environment, forcing many of us to consider migrating to alternatives like Claude, where the workflow is much more consistent and the logical precision is superior.
Hi,
The 503 errors from Gemini 2.5 Flash are impacting production usage severely. Please see screenshots.
Model Name: Gemini 2.5 Flash
Billing: Tier 1
My app has been receiving 503 Service Unavailable responses on a daily basis for the last month. At this stage the API is impacting all production workloads and users, and I need to replace functionality with Anthropic. This is irrespective of whether I switch to a new model. Please help prioritize the fix for this. I don’t believe the quality exists for me to upgrade usage.
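```python
import time

# cache, rate_limiter, queue, circuit_breaker, gemini, and flash_model are
# placeholders that the application is expected to provide.

def ask_gemini(prompt):
    # 1. Cache check
    if prompt in cache:
        return cache[prompt]
    # 2. Rate limiter
    rate_limiter.wait()
    # 3. Queue
    queue.add(prompt)
    # 4. Circuit breaker
    if circuit_breaker.is_open():
        return "Service temporarily unavailable"
    for retry in range(5):
        try:
            response = gemini.generate(prompt)
            cache[prompt] = response
            return response
        except Exception as err:
            # Retry only on 429/503; assumes the error exposes the HTTP status as `.code`
            if getattr(err, "code", None) not in (429, 503):
                raise
            time.sleep(2 ** retry)  # exponential backoff: 1, 2, 4, 8, 16 s
    # 5. Fallback model
    return flash_model.generate(prompt)
```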
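The snippet above leans on `rate_limiter.wait()` and `circuit_breaker.is_open()` without showing them. As a rough sketch of what those helpers could look like (the class names, defaults, and the `record_failure` / `record_success` methods are assumptions of mine, not something from the post above or from any Google SDK):

```python
import time


class RateLimiter:
    """Allow at most one call every `min_interval` seconds."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = 0.0

    def wait(self):
        now = self._clock()
        if now < self._next_allowed:
            # Too early: block until the next slot opens.
            self._sleep(self._next_allowed - now)
            now = self._next_allowed
        self._next_allowed = now + self.min_interval


class CircuitBreaker:
    """Open after `threshold` consecutive failures; retry after `cooldown` seconds."""

    def __init__(self, threshold=3, cooldown=30.0, clock=time.monotonic):
        self.threshold = threshold
        self.cooldown = cooldown
        self._clock = clock
        self._failures = 0
        self._opened_at = None

    def is_open(self):
        if self._opened_at is None:
            return False
        # After the cooldown, let a trial call through (half-open state).
        return self._clock() - self._opened_at < self.cooldown

    def record_failure(self):
        self._failures += 1
        if self._failures >= self.threshold:
            self._opened_at = self._clock()

    def record_success(self):
        self._failures = 0
        self._opened_at = None
```

Note that a breaker only does anything if the retry loop reports outcomes to it: call `record_failure()` in the 429/503 branch and `record_success()` after a good response, otherwise `is_open()` never changes. The clock and sleep functions are injectable purely to make the classes easy to test.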
I use the RooCode plugin in the international version of Trae to call GCP Vertex AI using the free trial credits provided by Google. Sometimes the call succeeds, sometimes it doesn’t. After using it for a while, I either get a 429 error or the AI doesn’t respond at all. Can you help adjust the quota? The plugin might make calls at a relatively high frequency. I have $300 in trial credits, but with this quota limit, I won’t be able to use them up within the 3-month period, which somewhat affects the user experience.
Hi all. I’m as affected as most users here (I’m getting no API tokens available on any Gemini 2.5 models for the last 4 hours). But I do want to add for those who are getting frustrated with 2.0 models not working, please see https://ai.google.dev/gemini-api/docs/pricing#gemini-2.0-flash
The 2.0 models were deprecated on June 1st, so I would not be surprised that inference from them is spotty at best.
If your applications have agents or processes tied to specific low-cost models, you could have an agent run a weekly check for deprecation notices (not just for Google) on official API pricing pages.
Hope that helps. I know it really doesn’t.
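To make the weekly deprecation check a bit more concrete, here is a rough standard-library sketch. It is keyword matching, nothing smarter: it flags any model name that appears near words like "deprecated" in a page's text. The function names, the keyword list, and the 200-character window are all arbitrary choices of mine, and the tag stripping is crude, so treat the hits as a prompt to go read the page rather than as a verdict.

```python
import re
import urllib.request

KEYWORDS = re.compile(r"deprecat|shut ?down|retire|discontinu", re.IGNORECASE)


def find_deprecation_mentions(page_text, model_names, window=200):
    """Return {model: [snippets]} where a keyword appears within
    `window` characters of a mention of the model name."""
    hits = {}
    for model in model_names:
        for match in re.finditer(re.escape(model), page_text, re.IGNORECASE):
            start = max(0, match.start() - window)
            snippet = page_text[start:match.end() + window]
            if KEYWORDS.search(snippet):
                # Collapse whitespace so the snippet is readable in a report.
                hits.setdefault(model, []).append(" ".join(snippet.split()))
    return hits


def fetch_text(url, timeout=30):
    """Download a page and crudely strip HTML tags."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        html = resp.read().decode("utf-8", errors="replace")
    return re.sub(r"<[^>]+>", " ", html)
```

Scheduling is left to whatever you already use (cron, a CI job, etc.): fetch each provider's pricing or deprecations page with `fetch_text`, pass the text and the list of model names your app depends on to `find_deprecation_mentions`, and alert if the result is non-empty.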

