Gemini API 429 RESOURCE_EXHAUSTED Error on Tier 1

I have the exact same issue. I am on Paid Tier 3, using gemini-3-flash-preview, and after ~5000 requests in January (each involved the Google Search grounding tool) all I got was 429 errors. Google’s documentation says “5,000 prompts per month (free), then (Coming soon**) $14 / 1,000 search queries” without elaborating on what “soon” means. I checked the Quotas & System Limits settings in Google Cloud Console and it says “unlimited”. @Gregory_Jordan Did your issue get resolved somehow?

same error
ResourceExhausted: 429 Resource exhausted. Please try again later. Please refer to Error code 429  |  Generative AI on Vertex AI  |  Google Cloud Documentation for more details.

My issue got resolved by applying a workaround: I switched to Vertex AI. According to the docs, “Billing will start January 5, 2026” for search queries beyond 5K, so I gave it a spin, and the 429 errors have disappeared.

Having this issue right now. It shows my RPM has surpassed the 1M tokens, but it seems permanently broken. It keeps responding with 429 errors, even if I wait beyond 60 seconds.

Hey Lexura! The specific issue here is that Grounding with Google Search has lower limits. We are working to fix this and also to show the limits in the dashboard. Stay tuned!

Also getting this 429 error. Nowhere near the limits. Paid Tier 1.

I’m getting the same error. I’m on Tier 1 and haven’t reached my limits, but it keeps throwing the 429 error.

The complete error is:

LLM error: { "error": { "code": 429, "message": "Resource has been exhausted (e.g. check quota).", "status": "RESOURCE_EXHAUSTED" } }
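For anyone who wants to detect this case in code: here is a minimal sketch of a check for the payload quoted above. The function name `isResourceExhausted` is hypothetical, and the assumption that some client libraries put the status directly on the thrown error (rather than nesting it under `error`) is mine, so adjust it to whatever your SDK actually throws.

```typescript
// Hypothetical helper (not from any SDK): returns true if a thrown value or a
// parsed response body looks like a 429 / RESOURCE_EXHAUSTED error.
function isResourceExhausted(err: unknown): boolean {
  if (typeof err !== "object" || err === null) return false;
  const e = err as Record<string, unknown>;

  // Assumed shape A: the status sits directly on the error object.
  if (e.status === 429 || e.code === 429 || e.status === "RESOURCE_EXHAUSTED") {
    return true;
  }

  // Shape B: the raw payload quoted above, nested under an "error" key.
  const inner = e.error;
  if (typeof inner === "object" && inner !== null) {
    const i = inner as Record<string, unknown>;
    return i.code === 429 || i.status === "RESOURCE_EXHAUSTED";
  }
  return false;
}

// Usage with the exact payload from the error message above.
const payload: unknown = JSON.parse(
  '{ "error": { "code": 429, "message": "Resource has been exhausted (e.g. check quota).", "status": "RESOURCE_EXHAUSTED" } }'
);
console.log(isResourceExhausted(payload)); // true
```

Checking both shapes is deliberate: a check that only looks at one field can silently miss the error if the library wraps it differently.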

I too am facing this issue. I was able to submit one batch job, and since then I get 429 no matter the size of the batch. Tier 1, nowhere near any limits. I’m using gemini-2.5-flash-lite and even tried gemini-2.5-flash. It has become impossible to use the batch interface.

Hi folks,

It looks like the solution is actually to implement the retry strategy here if you’re on the Pay as You Go plan.

Though in my code, the default retry import from genkit/actions didn’t work directly, so I had to implement the algorithm in the code itself so that it doesn’t depend on external dependencies. Here is the algorithm from a brand analysis project I have. I hope this helps:

    async (input: BrandAnalysisInput) => {
      const maxAttempts = 5;
      let delay = 2000; // start with 2 seconds

      for (let i = 0; i < maxAttempts; i++) {
        try {
          const {output} = await prompt(input);
          return output!;
        } catch (e: any) {
          // Check for a 429 status code and retry if it's not the last attempt
          if (e.status === 429 && i < maxAttempts - 1) {
            await new Promise((resolve) => setTimeout(resolve, delay));
            delay *= 2; // Exponential backoff
            if (delay > 30000) delay = 30000; // Cap delay at 30 seconds
          } else {
            // If it's the last attempt or not a 429 error, re-throw.
            throw e;
          }
        }
      }

      // This line is for TypeScript's benefit and should not be reached.
      throw new Error("Flow failed after multiple retries.");
    }
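To make the timing of the retry loop above concrete, here is a small sketch that lists the waits it produces. `backoffDelays` is a hypothetical helper written only for illustration; it mirrors the doubling-with-cap arithmetic of the loop and makes no API calls.

```typescript
// Hypothetical helper (illustration only): lists the waits, in milliseconds,
// that a doubling backoff with a cap inserts between consecutive attempts.
function backoffDelays(maxAttempts: number, baseMs: number, capMs: number): number[] {
  const delays: number[] = [];
  let delay = baseMs;
  // There is one wait between each pair of consecutive attempts,
  // so maxAttempts attempts give maxAttempts - 1 waits.
  for (let i = 0; i < maxAttempts - 1; i++) {
    delays.push(delay);
    delay = Math.min(delay * 2, capMs); // double, but never exceed the cap
  }
  return delays;
}

// Same settings as the loop above: 5 attempts, 2 s start, 30 s cap.
const delays = backoffDelays(5, 2000, 30000);
const totalMs = delays.reduce((sum, d) => sum + d, 0);
console.log(delays.join(", "), "total:", totalMs);
// waits: 2000, 4000, 8000, 16000; total: 30000
```

So with those settings the loop gives up after about 30 seconds of waiting in total, and the 30-second cap is never actually reached. If the 429s last longer than that (as several posts in this thread describe), raising `maxAttempts` is what extends the retry window.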