Stuck in Tier 1 of the Gemini API, tried Vertex AI, still hitting 429s. What to do?

I’m desperately trying to raise the rate limits on Gemini 3.1 Pro and Gemini 3 Pro Image.

I’ve tried pretty much every option: the Gemini API (AI Studio) and Vertex AI (flex/standard/priority PayGo). I’m hitting limits on all of them while barely doing anything, like 2-3 requests per minute.

This is crazy. I can’t run my app in production like this.

We’ve spent more than enough to meet the $250 requirement for Tier 2, and we passed it more than 30 days ago. Yet we’re still stuck, and there’s no button anywhere to request the upgrade.

What do I do? Does anyone have a solution for this?

@chunduriv maybe you’re aware of how to fix this?

1 Like

Same here. My account has already met the $250 spend and 30-day requirement, but it’s still stuck in Tier 1. This is blocking our production environment. I can provide my Project ID—could someone please help look into this?

2 Likes

Fought this for 4 weeks. Went through billing support, they sent me all around. Finally got referred to sales.

Here is the TLDR. Contact your Project Rep. Schedule a meeting. Sign up with a partner. I think this is Google’s way of ensuring they get your money. They aren’t going to increase your limits (and effectively what they are letting you “spend”) without knowing you are good for the bill.

They recommended I sign with one of their google cloud premier partners. I did this, and SAME day was granted a custom limit on Vertex and resolved my issues. Still hit 429s because google is clearly having internal problems and not telling us.

Gemini 3 Pro is screwed. 2.5 Pro is the only model where I don’t get rate limited currently. Hopefully they announce something soon, but they seem to be silent about this.

The worst part is they have ABSOLUTELY ZERO documentation about this.

But seriously. Contact your cloud rep and schedule a meeting. The rep can help with everything. I couldn’t even file a f*** ticket with support. Still can’t. But I can email the rep/now my partner on this and get stuff sorted so much quicker.

I fought this for a month. Finally have some sort of a path forward.

My understanding of why they do this: like I said about billing, they aren’t going to give you the moon in resources and risk you not paying the bill. When you go through a partner, the partner is “on the hook” for the bill, so Google lets them hand out tons of resources. I’m sure there are other reasons, yada yada, but it comes down to money.

Hope this helps and I hope people can see this.

Tags: Vertex AI API 429 errors, Vertex AI API rate limits, Vertex AI API rate limit increase, Vertex AI rate limit increase, Resource Exhausted, Unable to request increase on base models Vertex AI API, Gemini 3, Gemini 2.5 Pro

5 Likes

Thanks so much, yes this is helpful and a huge pain in the ass lmao.

I don’t understand why, if money is the issue, they can’t just let us prepay or something to get to the next tier.

None of the other API providers have this issue.

This isn’t a commercial project but we’d use like up to 20,000 to 40,000 nano banana pro requests a month for internal use.

I switched to Vertex AI flex because it doesn’t have those limits, but since everyone is getting 429 errors that’s a dud too.

I don’t know if it’s worth doing all you’re saying for this much consumption because I am not the owner of the project (I guess if I become the admin I could do it maybe?). If there was another nano banana pro provider I’d already switch lol

1 Like

I was literally in billing support offering to throw them money. I said I would prepay for thousands in credits if it meant I could unlock it, and they wouldn’t budge. I really don’t know the internal workings, but I did find that this solution worked for me. We are a startup and not fully commercial, so we needed the “hybrid approach”.

One thing to note: the “partner” program is actually super dope. IT’S THE SAME COST AS GOING THROUGH GOOGLE FOR TOKENS. I pay nothing extra. I just get the normal bill that I would get if it were my own billing account. Instead, it’s a billing account under a “partner setting”, so by default it unlocks the project and has way higher limits.

It’s the same cost. Absolutely crazy. Contact your admin and have them initiate the process.

3 Likes

You’re not going to believe the roller coaster I’ve been on with GCP. I was so hyped after getting the $300 credit, fantasizing about having access to the enterprise-grade Gemini preview series models on Vertex AI. But that fantasy was shattered by a constant barrage of 429 errors. The preview models are practically unusable—frequent rate-limiting and slow first-token responses have completely killed the ‘premium’ experience for me. I’m at the point of giving up on the preview series entirely. Even worse, I’m still hitting endless 429s when calling Gemini 2.5 Pro on Vertex! The whole thing is driving me crazy. Ironically, the free Google AI Studio is running 2.5 Pro perfectly smoothly. At this rate, I won’t even be able to burn through the $300 credit in three months. So much for ‘enterprise-grade’.

1 Like

Yep, same experience over here. I had to go to Vertex for full ZDR for my platform. For ZDR and compliance it’s a great box to check, but we have been fighting errors ever since. It worked well for like a week.

Totally agree, preview models are shit. 2.5 Pro is the way. I think they just released this silently: Priority PayGo  |  Generative AI on Vertex AI  |  Google Cloud Documentation

However, I am not sure this is going to fix any problems. It might be worth a try if you can afford it. I haven’t implemented it yet personally.

Maybe they are moving towards this to get more $$$ out of folks: it’s 1.8x the token cost.

I did see they recently updated the docs on 429 errors. They recommend a jitter strategy. We found that we were running multiple calls in parallel, and those calls all started at exactly the same time. Even though they were technically under the rate limits, they still triggered the 429 error. So we added a priority queue with a randomness factor that releases the async calls in close succession instead of all at once. We also added jitter as the docs recommend. Before, we were using plain exponential backoff and it was complete shit. Didn’t do much.
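For anyone who wants to try the staggering idea, here is a minimal sketch of what a priority queue with a randomness factor could look like. This is not our production code. The names `plan_start_times` and `run_staggered`, plus the `gap_s` and `jitter_s` defaults, are made up for illustration, and `calls` stands in for whatever async wrappers you have around your API client.

```python
import asyncio
import random


def plan_start_times(priorities, gap_s=0.5, jitter_s=0.25, rng=random.random):
    """Return one start offset (seconds) per call.

    Calls with a lower priority value start earlier. Each call gets its own
    slot (slot * gap_s) plus a random offset in [0, jitter_s), so no two
    calls fire at exactly the same instant.
    """
    order = sorted(range(len(priorities)), key=lambda i: priorities[i])
    offsets = [0.0] * len(priorities)
    for slot, i in enumerate(order):
        offsets[i] = slot * gap_s + rng() * jitter_s
    return offsets


async def run_staggered(calls, priorities, **plan_kwargs):
    """Run zero-argument async callables, each delayed by its planned offset.

    Results come back in the same order as `calls`.
    """
    offsets = plan_start_times(priorities, **plan_kwargs)

    async def delayed(call, offset):
        await asyncio.sleep(offset)
        return await call()

    return await asyncio.gather(*(delayed(c, o) for c, o in zip(calls, offsets)))
```

Usage would be something like `asyncio.run(run_staggered([call_a, call_b], priorities=[1, 0]))`, where `call_a` and `call_b` are your own coroutine functions. Tune `gap_s` to your quota; the values here are guesses, not recommendations from the docs.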

Retry strategy  |  Generative AI on Vertex AI  |  Google Cloud Documentation.

I am not sure if they just recently updated this, but I was going through this same 429 error about 3 weeks ago (due to billing account factors) and DIDN’T see these docs, so I have to assume it’s somewhat newer. Who knows. The jitter retry seems to work. We still get 429s but can often get the calls to go through on 2.5 Pro on Vertex.
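If it helps, this is roughly the shape of a jittered retry, as a sketch only. It uses the common "full jitter" variant of exponential backoff, which may not match the exact parameters in the Retry strategy docs. `RateLimitError` is a hypothetical stand-in for whatever exception your client raises on a 429, and the `base_s`, `cap_s`, and `max_attempts` defaults are assumptions.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for the exception your client raises on HTTP 429."""


def backoff_delay(attempt, base_s=1.0, cap_s=60.0, rng=random.random):
    """Full jitter: a random delay in [0, min(cap_s, base_s * 2**attempt))."""
    return rng() * min(cap_s, base_s * (2 ** attempt))


def call_with_retry(call, max_attempts=5, sleep=time.sleep, **delay_kwargs):
    """Call `call()`; on RateLimitError wait a jittered delay and try again.

    The last failure is re-raised once max_attempts is exhausted.
    """
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise
            sleep(backoff_delay(attempt, **delay_kwargs))
```

The point of the random factor is that parallel workers which fail together do not all retry at the same moment, which is what plain exponential backoff tends to do.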

Give it a try. 3 Pro Preview still no bueno as you mentioned.

Happy to provide more insight if you need it. This is just my experience. I fought them for a month, finally got a glimpse of green grass, and now have a somewhat reasonable path forward.

Cheers-

B

2 Likes

Yeah, if there was another SOTA image generation + edit model I’d swap already. But unfortunately for my use case Nano Banana Pro is the absolute best model.

Vertex AI flex doesn’t have the rate limit issues, but it still has either network issues or 429s.

I do have jitter implemented, but I’ll try to space out the calls over time instead of bursting them. I’ve also tried Priority PayGo, and it seemed to have the same 429 issues.
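For the spacing part, a minimal sketch of what I have in mind is a pacer that enforces a minimum gap between calls. The `Pacer` class and its `min_interval_s` parameter are names I made up for illustration; a sensible interval would be something like 60 divided by the requests per minute you think you are actually allowed.

```python
import time


class Pacer:
    """Enforce a minimum gap between consecutive calls (single-threaded use)."""

    def __init__(self, min_interval_s, clock=time.monotonic, sleep=time.sleep):
        self.min_interval_s = min_interval_s
        self._clock = clock
        self._sleep = sleep
        self._next_ok = None  # earliest time the next call may start

    def wait(self):
        """Block until the next call is allowed, then reserve the following slot."""
        now = self._clock()
        if self._next_ok is not None and now < self._next_ok:
            self._sleep(self._next_ok - now)
            now = self._next_ok
        self._next_ok = now + self.min_interval_s
```

Calling `pacer.wait()` right before each request turns a burst into an even stream. It only smooths my own traffic, so it would not help with 429s that come from the service side.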

But at least I understand these are separate problems.

1 Like