429s in Vertex AI for Gemini-2.5-Flash-Lite in Europe

Hi.

As other posts have noted, there seems to be a persistent bug that leads to 429 errors on the Europe endpoints. In our case, the affected model is gemini-2.5-flash-lite.

It makes Vertex AI extremely unreliable. Despite exponential backoff, we constantly get ‘too many requests’ and ‘resource exhausted’ errors for periods of a few hours, after which they go away.
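
For readers unfamiliar with the term, here is a minimal sketch of exponential backoff with full jitter. It is not the poster's code: `RetryableError`, `backoff_delays`, and `call_with_backoff` are hypothetical names, and the exception is a stand-in for whatever your client raises on a 429.

```python
import random
import time


class RetryableError(Exception):
    """Hypothetical stand-in for the 429 error raised by the client in use."""


def backoff_delays(max_attempts, base=1.0, cap=60.0):
    """Yield one randomized delay (in seconds) per attempt.

    The ceiling doubles each attempt (base, 2*base, 4*base, ...) up to cap,
    and the actual delay is drawn uniformly below that ceiling (full jitter).
    """
    for attempt in range(max_attempts):
        ceiling = min(cap, base * (2 ** attempt))
        yield random.random() * ceiling


def call_with_backoff(call, max_attempts=5, sleep=time.sleep):
    """Run call() until it succeeds or max_attempts is reached."""
    for attempt, delay in enumerate(backoff_delays(max_attempts)):
        try:
            return call()
        except RetryableError:
            # Re-raise on the final attempt instead of sleeping again.
            if attempt == max_attempts - 1:
                raise
            sleep(delay)
```

Backoff only helps when the throttling is short-lived. It cannot work around a shortage that lasts for hours, which matches what is described above.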

Our code is configured to try all the Europe endpoints, yet we get this error more or less regardless of which one we try.
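
A rough sketch of that kind of regional failover, for anyone who wants to compare setups. The names are hypothetical: `call` is whatever function sends the request to a given region, `RetryableError` stands in for the 429 raised by your client, and the region list is only an example, not a statement about where the model is served.

```python
class RetryableError(Exception):
    """Hypothetical stand-in for the 429 error raised by the client in use."""


# Example region names only; check which regions actually serve your model.
EU_REGIONS = ["europe-west1", "europe-west4", "europe-west9"]


def call_with_failover(call, regions):
    """Try call(region) for each region in order.

    Returns (region, result) for the first region that succeeds.
    Raises RuntimeError listing the failed regions if all of them fail.
    """
    failed = []
    for region in regions:
        try:
            return region, call(region)
        except RetryableError:
            failed.append(region)
    raise RuntimeError("all regions failed: " + ", ".join(failed))
```

This only helps when at least one region has spare capacity. If every region returns 429 at the same time, the loop simply runs out of regions.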

Are there any SLAs in place for this, and is this a known issue for which a fix is being deployed? Our customers are unhappy with the latency and we will ultimately switch to another provider if this persists.

Yes, I have exactly the same issue and no idea what is going on. The status page shows that everything is okay, but I'm getting 429 on most requests.

How can we get someone to look into this?

Same here. 98% of our Gemini requests ended up with status 429 in a period over 12 hours. We have failovers to all EU regions, but that did not help.

And here. I don’t know how you are supposed to use this service in production; it’s a roll of the dice whether it is going to work or not.

Same problem here! I keep getting 429 while trying to use gemini-2.5-flash-lite via Vertex AI on paid tier 3. I am using the same failover approach as @mkaloer, but it doesn’t seem to work.
Any update from the technical team would be appreciated!

Hi all,
Got this from Google Cloud support. I’ll keep you updated if I hear more.

Hello,

Thank you for reaching out. I have taken a closer look and it appears that your issue is related to a product outage that has been resolved as of 2026-02-24 05:37 PST. Our team is still working on investigating the root cause of the issue. I will keep you posted once hearing from our team.

Same here, we are facing this even with US servers. Extremely unreliable.

Does anyone from Google care to comment on this? Would be nice to understand when this can be resolved.

I’m getting 503s now for gemini-2.5-flash-lite:

ApiError: {"error":{"code":503,"message":"This model is currently experiencing high demand. Spikes in demand are usually temporary. Please try again later.","status":"UNAVAILABLE"}}
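
If your retry logic only looks for 429, a payload like this will slip past it. Below is a minimal sketch of classifying the error body, assuming the JSON shape shown above. `is_retryable` and `RETRYABLE_CODES` are hypothetical names, and treating 429 and 503 as retryable is a common convention rather than anything Google has confirmed in this thread.

```python
import json

# Assumption: 429 (too many requests) and 503 (unavailable) are worth retrying.
RETRYABLE_CODES = {429, 503}


def is_retryable(payload):
    """Return True if the JSON error body carries a retryable status code.

    Anything that is not JSON, or does not have the {"error": {"code": ...}}
    shape, is treated as not retryable.
    """
    try:
        code = json.loads(payload).get("error", {}).get("code")
    except (json.JSONDecodeError, AttributeError):
        return False
    return code in RETRYABLE_CODES


body = '{"error":{"code":503,"message":"Please try again later.","status":"UNAVAILABLE"}}'
print(is_retryable(body))
```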

It would be nice to have a proper status page that shows the true error status, not just the incidents they choose to acknowledge.