Gemini API 429 RESOURCE_EXHAUSTED Error on Tier 1

I have the exact same issue. I am on Paid Tier 3, using gemini-3-flash-preview, and after ~5000 requests in January (each involving the Google Search grounding tool) all I got was 429 errors. Google’s documentation says “5,000 prompts per month (free), then (Coming soon**) $14 / 1,000 search queries” without elaborating on what “soon” means. I checked the Quotas & System Limits settings in Google Cloud Console and they say “unlimited”. @Gregory_Jordan Did your issue get resolved somehow?

same error
ResourceExhausted: 429 Resource exhausted. Please try again later. Please refer to Error code 429  |  Generative AI on Vertex AI  |  Google Cloud Documentation for more details.

My issue got resolved by applying a workaround. I switched to Vertex AI: according to the docs, “Billing will start January 5, 2026” for search queries beyond 5K, so I gave it a spin, and the 429 errors have disappeared.

Having this issue right now. It shows my RPM has surpassed the 1M tokens, but it seems permanently broken. It keeps responding with 429 errors, even if I wait beyond 60 seconds.

Hey Lexura! This specific issue is that Grounding with Google Search has lower limits, we are working to fix this and also show the limits in the dashboard. Stay tuned!

I’m also getting this 429 error. Nowhere near my limits. Paid Tier 1.

I’m getting the same error. I’m on Tier 1 and haven’t reached my limits, but it keeps throwing the 429 error.

The complete error is:

LLM error: { "error": { "code": 429, "message": "Resource has been exhausted (e.g. check quota).", "status": "RESOURCE_EXHAUSTED" } }
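
Since the 429 can show up in different places depending on the client (a top-level `status`, a nested `error.code` as in the payload above, or only in the message text), a small detection helper may be handy before deciding whether to retry. This is only a sketch: `isResourceExhausted` is a hypothetical name, and the error shapes it checks are assumptions based on the messages quoted in this thread, not a documented SDK contract.

```typescript
// Hypothetical helper: returns true if a thrown value looks like a
// 429 / RESOURCE_EXHAUSTED error. The shapes checked are assumptions
// based on the error messages quoted in this thread.
function looksLike429(obj: Record<string, unknown>): boolean {
  // Numeric HTTP status or code, or the gRPC-style status string.
  if (obj.status === 429 || obj.code === 429) return true;
  if (obj.status === "RESOURCE_EXHAUSTED") return true;
  // Fall back to the message text when no structured field is present.
  return (
    typeof obj.message === "string" &&
    /\b429\b|RESOURCE_EXHAUSTED/.test(obj.message)
  );
}

function isResourceExhausted(err: unknown): boolean {
  if (typeof err !== "object" || err === null) return false;
  const outer = err as Record<string, unknown>;
  if (looksLike429(outer)) return true;
  // Some clients wrap the API payload under an `error` key (one level only).
  const inner = outer.error;
  if (typeof inner === "object" && inner !== null) {
    return looksLike429(inner as Record<string, unknown>);
  }
  return false;
}
```

In a retry loop, this could replace a bare `e.status === 429` check if the client surfaces the error in one of the other shapes.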

I too am facing this issue. I was able to submit one batch job, and since then I get 429 no matter the size of the batch. Tier 1, nowhere near any limits. I’m using gemini-2.5-flash-lite and even tried gemini-2.5-flash. It has become impossible to use the batch interface.

Hi folks,

It looks like the solution is actually to implement the retry strategy here if you’re on the Pay as You Go plan.

In my code, though, the default retry import from genkit/actions didn’t work directly, so I implemented the algorithm myself so that it doesn’t depend on external dependencies. Here is the algorithm from my brand analysis project. I hope this helps:

async (input: BrandAnalysisInput) => {
  const maxAttempts = 5;
  let delay = 2000; // start with 2 seconds

  for (let i = 0; i < maxAttempts; i++) {
    try {
      const {output} = await prompt(input);
      return output!;
    } catch (e: any) {
      // Check for a 429 status code and retry if it's not the last attempt
      if (e.status === 429 && i < maxAttempts - 1) {
        await new Promise((resolve) => setTimeout(resolve, delay));
        delay *= 2; // Exponential backoff
        if (delay > 30000) delay = 30000; // Cap delay at 30 seconds
      } else {
        // If it's the last attempt or not a 429 error, re-throw.
        throw e;
      }
    }
  }

  // This line is for TypeScript's benefit and should not be reached.
  throw new Error("Flow failed after multiple retries.");
}
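
A common variation on the snippet above is to add random jitter to each delay, so that many clients retrying at once don’t all hit the API at the same instant. The following is a minimal sketch of that idea as a reusable wrapper, not the poster’s code and not a Genkit API: `retryWithBackoff`, `RetryOptions`, and all option names are made up for illustration, and the default `shouldRetry` assumes the error exposes `status === 429` as in the post above.

```typescript
// Hypothetical options for the sketch below; none of these names come from an SDK.
interface RetryOptions {
  maxAttempts?: number; // total attempts, including the first call
  baseDelayMs?: number; // ceiling for the first retry delay
  maxDelayMs?: number; // upper bound for any single delay
  shouldRetry?: (err: unknown) => boolean;
  sleep?: (ms: number) => Promise<void>; // injectable so it can be stubbed in tests
}

async function retryWithBackoff<T>(
  fn: () => Promise<T>,
  opts: RetryOptions = {},
): Promise<T> {
  const {
    maxAttempts = 5,
    baseDelayMs = 2000,
    maxDelayMs = 30000,
    // Assumption: the thrown error carries a numeric `status` field.
    shouldRetry = (err: unknown) =>
      (err as { status?: unknown } | null)?.status === 429,
    sleep = (ms: number) =>
      new Promise<void>((resolve) => setTimeout(resolve, ms)),
  } = opts;

  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      // Re-throw on the last attempt or for errors that should not be retried.
      if (attempt === maxAttempts - 1 || !shouldRetry(err)) throw err;
      // "Full jitter": wait a random time between 0 and the capped exponential ceiling.
      const ceiling = Math.min(maxDelayMs, baseDelayMs * 2 ** attempt);
      await sleep(Math.random() * ceiling);
    }
  }
  // Only reachable when maxAttempts is zero or negative.
  throw new Error("retryWithBackoff: no attempts were made.");
}
```

With this in place, the call from the post would become something like `retryWithBackoff(() => prompt(input))`. Note that backoff only helps with short-lived rate limits; it will not get past a quota that stays exhausted, which is what several posts in this thread describe.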

So it’s been over a month since the Gemini paid API became useless and we have no official solution or at least a workaround from Google. @chunduriv ?


Same issue here since last Saturday. I’m on paid Tier 1 and under my quotas.

Hi, I’m on Vertex AI, facing constant 429 with gemini-3-pro-image-preview model, I have semaphores, retries (with delays of 10+ secs), and it still mostly returns 429s. No way to increase the quota. What is happening?
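
For readers unfamiliar with the setup described above, a semaphore here simply caps how many requests are in flight at once. Below is a minimal sketch of one, assuming nothing about the poster’s actual implementation; the `Semaphore` class and its method names are hypothetical. As the post itself reports, limiting concurrency on the client does not help when the server keeps returning 429 anyway.

```typescript
// Hypothetical minimal semaphore: at most `permits` tasks run concurrently.
class Semaphore {
  private available: number;
  private waiters: Array<() => void> = [];

  constructor(permits: number) {
    this.available = permits;
  }

  private async acquire(): Promise<void> {
    if (this.available > 0) {
      this.available--;
      return;
    }
    // No permit free: park until release() hands one over.
    await new Promise<void>((resolve) => this.waiters.push(resolve));
  }

  private release(): void {
    const next = this.waiters.shift();
    if (next) {
      next(); // pass the permit directly to the next waiter
    } else {
      this.available++;
    }
  }

  // Runs the task once a permit is available and always frees the permit afterwards.
  async run<T>(task: () => Promise<T>): Promise<T> {
    await this.acquire();
    try {
      return await task();
    } finally {
      this.release();
    }
  }
}
```

Usage would look like `const sem = new Semaphore(2); await sem.run(() => callModel(request));`, where `callModel` stands in for whatever function actually sends the request.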

Experiencing the same 429 resource exhausted issue here.
I recently started a video analysis project using Vertex AI paid Tier 1 and gemini-3-pro-preview, uploading from GCS. During the day on Sunday/yesterday (first attempts) everything I did worked fine. Overnight I was getting intermittent 429 error messages, and now all morning I can no longer query anything at all, no matter how long I wait.
It’s not clear what rate limit I hit (if any) or why. Very frustrating.

It seems there’s no clear answer in this thread and the Google devs aren’t offering solutions. Is there a higher, paid tier of Google support that could help with this? If so, I’ll report back to this thread on how I fixed it.

Persistent issue with gemini-3.1-pro-preview and gemini-3-pro-preview on my self-hosted chatbot. Yesterday I thought it was due to long context, but the error continues today. Note that gemini-3-flash-preview and smaller models are unaffected. As a Paid Tier 1 user, I’d like to understand why these specific models are failing and if there are any account-level restrictions in place.

I swear I was just about to give up, and then I saw your comment. I had already changed models once, but I decided to give it another shot and switched to gemini-2.5-flash, and now it works! I guess there was a problem with 2.0 or something.

I’m on Tier 3 and hitting the same issues on gemini-3-flash when using search grounding. High-severity issue. Where’s the support?

Same issue on gemini-2.5-flash-image. I’m getting constant 429s. I’m on a paid tier and my usage is very low, around 1%.

Exact same issue here using gemini-2.5-flash-image. It was working fine previously; today, constant 429 errors: “Resource has been exhausted (e.g. check quota).” My quotas are fine, well under. Billing is all set up. On a paid plan.


Same issue on my side: getting 429 every few requests although I am far below my quotas.