429 Too Many Requests on Vertex AI API generateContent (Gemini 2.5 Pro)

I’m using the Vertex AI API to send several generateContent requests to the default Gemini 2.5 Pro model in the europe-west1 region (I cannot switch to the global region because of my country’s laws covering the data I’m handling). Occasionally, at seemingly random times during the day, I receive a 429 Too Many Requests error. The specific message is:

{"error":{"code":429,"message":"Resource exhausted. Please try again later. Please refer to https://cloud.google.com/vertex-ai/generative-ai/docs/error-code-429 for more details.","status":"RESOURCE_EXHAUSTED"}}

I don’t believe I’m hitting any quota limits, as the issue occurs randomly, without any significant load or concurrent requests. Sometimes the 429 error appears several times in a row—four or five, or even more—and then suddenly everything goes back to normal.
I’m not sure whether this is a region-specific issue or something related to my account temporarily exhausting resources.

I have also implemented a retry mechanism with exponential backoff between attempts, up to four retries, but all of them still resulted in 429 errors.
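For reference, the retry scheme described above can be sketched roughly as follows. This is a minimal illustration, not the poster's actual code: the helper names (backoffDelayMs, looksLike429, retryOn429), the 2 s base delay, the 60 s cap, and the use of random jitter are all assumptions. Note that if the shortage lasts longer than the sum of the delays, every attempt still fails, which matches the behaviour reported here.

```typescript
// Hypothetical helper names; nothing here comes from the Vertex AI SDK.

/** Delay before retry `attempt` (0-based): a random value below min(capMs, baseMs * 2^attempt). */
function backoffDelayMs(
  attempt: number,
  baseMs = 2000,
  capMs = 60000,
  random: () => number = Math.random,
): number {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.floor(random() * ceiling);
}

/** Assumption: a 429 can be recognised from the error text, as in the message quoted above. */
function looksLike429(err: unknown): boolean {
  const text = err instanceof Error ? err.message : String(err);
  return text.includes("429") || text.includes("RESOURCE_EXHAUSTED");
}

/** Runs `call`, retrying up to `maxRetries` times when the failure looks like a 429. */
async function retryOn429<T>(
  call: () => Promise<T>,
  maxRetries = 4,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await call();
    } catch (err) {
      // Give up on non-429 errors, or once the retry budget is spent.
      if (!looksLike429(err) || attempt >= maxRetries) throw err;
      await sleep(backoffDelayMs(attempt));
    }
  }
}
```

The jitter spreads retries from several workers over time, so they do not all hit the endpoint again at the same instant.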

I have had the same error, and in my case it turned out to be caused by a malformed prompt payload sent to the API.

(Your mileage may vary).

I don’t have much to contribute to this discussion, but it’s an old problem that Google has no intention of solving. Anyone who has a Gemini-based pay-as-you-go solution has the same problem. Google says switch to provisioning, it will be better, but I don’t know if that’s true. What I do is switch requests between all European centres to minimise the risk of 429. In case of an error, I repeat the query on another centre.

Thank you! I tried implementing this solution, but it doesn’t seem to solve the issue. Unfortunately, it just returns 429 in every region. Also, why is provisioning so expensive? It seems to start at $1,200/month.

It should not. Check the list of locations. When you get 429 in one location then switch the request to another. It works for me.
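The region-switching approach suggested here could look roughly like the sketch below. It is only an illustration under assumptions: generateWithFailover, is429, and the callRegion callback are hypothetical names, and the caller is expected to supply the list of regions that actually serve the model.

```typescript
// Assumption: `callRegion` is your own function that sends one request to one region.
type RegionCall<T> = (region: string) => Promise<T>;

/** Assumption: a 429 can be recognised from the error text. */
function is429(err: unknown): boolean {
  const text = err instanceof Error ? err.message : String(err);
  return text.includes("429") || text.includes("RESOURCE_EXHAUSTED");
}

/** Tries each region in order and returns the first successful result. */
async function generateWithFailover<T>(
  regions: string[],
  callRegion: RegionCall<T>,
): Promise<{ region: string; value: T }> {
  let lastError: unknown = new Error("no regions configured");
  for (const region of regions) {
    try {
      return { region, value: await callRegion(region) };
    } catch (err) {
      // Only 429-style errors trigger a switch; anything else is rethrown as-is.
      if (!is429(err)) throw err;
      lastError = err;
    }
  }
  // Every region answered with a 429: surface the last error to the caller.
  throw lastError;
}
```

With the @google-cloud/vertexai package, callRegion would typically create a client with new VertexAI({ project, location: region }) and call generateContent on the model returned by getGenerativeModel. That wiring is left out so the sketch runs without credentials.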

I switch regions, and these are my logs:

2025-11-25 08:39:15 info: [34vtgis] Attempt 1/7 using region: europe-west1
2025-11-25 08:39:30 error: [34vtgis] Vertex AI error caught:
2025-11-25 08:39:30 error: [34vtgis]   - Error name: ClientError
2025-11-25 08:39:30 error: [34vtgis]   - Error message: [VertexAI.ClientError]: got status: 429 Too Many Requests. {"error":{"code":429,"message":"Resource exhausted. Please try again later. Please refer to https://cloud.google.com/vertex-ai/generative-ai/docs/error-code-429 for more details.","status":"RESOURCE_EXHAUSTED"}}
2025-11-25 08:39:30 error: [34vtgis]   - Error code: undefined
2025-11-25 08:39:30 error: [34vtgis]   - Error status: undefined
2025-11-25 08:39:30 error: [34vtgis]   - Stack trace:
ClientError: [VertexAI.ClientError]: got status: 429 Too Many Requests. {"error":{"code":429,"message":"Resource exhausted. Please try again later. Please refer to https://cloud.google.com/vertex-ai/generative-ai/docs/error-code-429 for more details.","status":"RESOURCE_EXHAUSTED"}}
    at throwErrorIfNotOK (/app/node_modules/@google-cloud/vertexai/build/src/functions/post_fetch_processing.js:32:27)
    at process.processTicksAndRejections (node:internal/process/task_queues:105:5)
    at async generateContent (/app/node_modules/@google-cloud/vertexai/build/src/functions/generate_content.js:59:5)
    at async GeminiService.sendPromptWithFilesVertexAI (/app/dist/services/gemini.service.js:505:34)
    at async GeminiService.sendPromptWithFiles (/app/dist/services/gemini.service.js:267:20)
    at async ReportController.processDocuments (/app/dist/controllers/report.controller.js:884:40)
    at async /app/dist/controllers/report.controller.js:56:38
2025-11-25 08:39:30 error: [34vtgis]   - Error cause: {"code":429,"status":"RESOURCE_EXHAUSTED"}
2025-11-25 08:39:30 error: [34vtgis]   - Full error object: {
  "stack": "ClientError: [VertexAI.ClientError]: got status: 429 Too Many Requests. {\"error\":{\"code\":429,\"message\":\"Resource exhausted. Please try again later. Please refer to https://cloud.google.com/vertex-ai/generative-ai/docs/error-code-429 for more details.\",\"status\":\"RESOURCE_EXHAUSTED\"}}\n    at throwErrorIfNotOK (/app/node_modules/@google-cloud/vertexai/build/src/functions/post_fetch_processing.js:32:27)\n    at process.processTicksAndRejections (node:internal/process/task_queues:105:5)\n    at async generateContent (/app/node_modules/@google-cloud/vertexai/build/src/functions/generate_content.js:59:5)\n    at async GeminiService.sendPromptWithFilesVertexAI (/app/dist/services/gemini.service.js:505:34)\n    at async GeminiService.sendPromptWithFiles (/app/dist/services/gemini.service.js:267:20)\n    at async ReportController.processDocuments (/app/dist/controllers/report.controller.js:884:40)\n    at async /app/dist/controllers/report.controller.js:56:38",
  "message": "[VertexAI.ClientError]: got status: 429 Too Many Requests. {\"error\":{\"code\":429,\"message\":\"Resource exhausted. Please try again later. Please refer to https://cloud.google.com/vertex-ai/generative-ai/docs/error-code-429 for more details.\",\"status\":\"RESOURCE_EXHAUSTED\"}}",
  "cause": {
    "stack": "Error: Resource exhausted. Please try again later. Please refer to https://cloud.google.com/vertex-ai/generative-ai/docs/error-code-429 for more details.\n    at throwErrorIfNotOK (/app/node_modules/@google-cloud/vertexai/build/src/functions/post_fetch_processing.js:32:66)\n    at process.processTicksAndRejections (node:internal/process/task_queues:105:5)\n    at async generateContent (/app/node_modules/@google-cloud/vertexai/build/src/functions/generate_content.js:59:5)\n    at async GeminiService.sendPromptWithFilesVertexAI (/app/dist/services/gemini.service.js:505:34)\n    at async GeminiService.sendPromptWithFiles (/app/dist/services/gemini.service.js:267:20)\n    at async ReportController.processDocuments (/app/dist/controllers/report.controller.js:884:40)\n    at async /app/dist/controllers/report.controller.js:56:38",
    "message": "Resource exhausted. Please try again later. Please refer to https://cloud.google.com/vertex-ai/generative-ai/docs/error-code-429 for more details.",
    "name": "Error"
  },
  "name": "ClientError",
  "stackTrace": {
    "stack": "Error: Resource exhausted. Please try again later. Please refer to https://cloud.google.com/vertex-ai/generative-ai/docs/error-code-429 for more details.\n    at throwErrorIfNotOK (/app/node_modules/@google-cloud/vertexai/build/src/functions/post_fetch_processing.js:32:66)\n    at process.processTicksAndRejections (node:internal/process/task_queues:105:5)\n    at async generateContent (/app/node_modules/@google-cloud/vertexai/build/src/functions/generate_content.js:59:5)\n    at async GeminiService.sendPromptWithFilesVertexAI (/app/dist/services/gemini.service.js:505:34)\n    at async GeminiService.sendPromptWithFiles (/app/dist/services/gemini.service.js:267:20)\n    at async ReportController.processDocuments (/app/dist/controllers/report.controller.js:884:40)\n    at async /app/dist/controllers/report.controller.js:56:38",
    "message": "Resource exhausted. Please try again later. Please refer to https://cloud.google.com/vertex-ai/generative-ai/docs/error-code-429 for more details.",
    "name": "Error"
  }
}

2025-11-25 08:39:30 warn: [34vtgis] Vertex AI overloaded in europe-west1 (attempt 1/7). Switching to europe-west4 in 20s...
2025-11-25 08:39:50 info: [34vtgis] Attempt 2/7 using region: europe-west4
2025-11-25 08:40:04 error: [34vtgis] Vertex AI error caught:
2025-11-25 08:40:04 error: [34vtgis]   - Error name: ClientError
2025-11-25 08:40:04 error: [34vtgis]   - Error message: [VertexAI.ClientError]: got status: 429 Too Many Requests. {"error":{"code":429,"message":"Resource exhausted. Please try again later. Please refer to https://cloud.google.com/vertex-ai/generative-ai/docs/error-code-429 for more details.","status":"RESOURCE_EXHAUSTED"}}

And so on for every region that serves Gemini 2.5 Pro. I don’t know whether the implicit cache is being hit, so that the error comes back immediately.
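One detail in the logs above: the top-level error reports "Error code: undefined" and "Error status: undefined", while the 429 only shows up in the message text and in the logged cause. A retry layer that checks err.code directly would therefore never match. A minimal sketch of a more tolerant check follows; statusCodeOf and the ErrorLike shape are hypothetical names, and the "got status: NNN" pattern is taken only from the log lines in this thread, so it may not hold for other SDK versions.

```typescript
// Minimal shape of the fields inspected below; this is not a type from the SDK.
interface ErrorLike {
  message?: unknown;
  code?: unknown;
  cause?: unknown;
}

function asErrorLike(value: unknown): ErrorLike | undefined {
  return typeof value === "object" && value !== null ? (value as ErrorLike) : undefined;
}

/** Best-effort HTTP status lookup: err.code, then err.cause.code, then the message text. */
function statusCodeOf(err: unknown): number | undefined {
  const e = asErrorLike(err);
  if (!e) return undefined;
  if (typeof e.code === "number") return e.code;

  const cause = asErrorLike(e.cause);
  if (cause && typeof cause.code === "number") return cause.code;

  if (typeof e.message === "string") {
    // Pattern assumed from the logged message: "got status: 429 Too Many Requests".
    const match = /got status: (\d{3})/.exec(e.message);
    if (match) return Number(match[1]);
  }
  return undefined;
}
```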

Can you check if you get the same with Gemini 2.5 Flash? My solution uses this model.

Hi guys! I’ve been having similar problems. This is the post. I don’t know whether you are sending too many requests from the same IP (maybe more than one PC on the same LAN or VPN), but I have a feeling that it might be some kind of temporary IP blocking caused by abuse countermeasures.

Hello @frmusso, @paulvancotthem, @Piotr_Gajda, @Cristobal,

I understand your frustration with this issue. Could you please provide the following details to help us investigate?

  • Complete Error Message: The full JSON response.

  • Billing Tier: (e.g., free, paid, etc.).

  • Model Name & Region: (e.g., gemini-2.5-pro in europe-west1).

  • Platform: (AI Studio, Vertex AI SDK, or REST).

  • Task Description: Briefly, what are you using the API for? (e.g., "summarizing large PDFs").

Also, please let me know whether you are still facing this issue.

Thanks!

Jan 7, 2026. I’m also getting the same 429 error:

{ "error": { "code": 429, "message": "You exceeded your current quota, please check your plan and billing details. For more information on this error, head to: https://ai.google.dev/gemini-api/docs/rate-limits. To monitor your current usage, head to: https://ai.dev/usage?tab=rate-limit. ", "status": "RESOURCE_EXHAUSTED" } }

I have the Google AI Pro monthly membership. The n8n workflow is no longer working due to this error. It’s used to generate prompts, images, and videos.

Hello @Never_Finished_Never,

Please check your usage and billing on the AI Studio dashboard. If you have not exhausted your quota, please share the details below:

  • Complete Error Message: The full JSON response.

  • Billing Tier: (e.g., free, paid, etc.).

  • Model Name & Region: (e.g., gemini-2.5-pro in europe-west1).

  • Platform: (AI Studio, Vertex AI SDK, or REST).

  • Task Description: Briefly, what are you using the API for? (e.g., "summarizing large PDFs").

  • Project Number: Please send the project number (not the project ID) via direct message.

In my original post, you can find all the answers to these questions. I sort of worked around the issue by implementing a retry mechanism across different regions, so I assume the error occurs when a specific region is temporarily exhausted. However, the error is still quite annoying, and I continue to experience it.

I’m using the Vertex AI SDK to process a few PDFs in a specific format. Paid tier.

Hello @frmusso,

Could you please DM me your project number (not the project ID)?

@Mahesh_Sutar, I am having exactly the same issue. Here is the information you requested from the other users:

Complete Error Message:
{"error":{"code":429,"message":"Resource exhausted. Please try again later. Please refer to https://cloud.google.com/vertex-ai/generative-ai/docs/error-code-429 for more details.","status":"RESOURCE_EXHAUSTED"}}

Billing Tier: Paid (billing account in good standing)

Model Name & Region: gemini-2.5-flash in us-central1. Also tested us-east1 - same 429 error. Switching to global endpoint resolved the issue.

Platform: Vertex AI SDK (@google/genai npm package v1.x with vertexai: true)

Task Description: Audio transcription application - transcribing audio dictations (20-60 second WebM audio files) into structured outputs.

Additional Context:

  • Was working fine earlier in the day
  • The 429 errors persisted for 15+ minutes despite exponential backoff retry logic (5 attempts, 2s base delay)
  • GCP Console showed only 0.02% quota utilization on “Generate content requests with audio input per minute” (656/3,456,000)
  • Simple text-only requests (no audio) also returned 429 in us-central1
  • No incidents reported during the timeframe
  • Switching from us-central1 to global endpoint immediately resolved the issue
  • Occurred on January 20, 2026 between approximately 13:30-14:30 PST

This appears to be a regional capacity issue rather than a quota issue, as the quota dashboard showed minimal usage while requests were consistently rejected.
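To make this kind of diagnosis repeatable, a small sketch is shown below. It tells apart the two 429 bodies quoted in this thread purely by their message text and recomputes the quota utilization figure from the list above. The function names and category labels are my own, not an official taxonomy, and matching on message text is an assumption that could break if the wording changes.

```typescript
// Labels are hypothetical; they only mirror the two messages quoted in this thread.
type Kind429 = "resource-exhausted" | "quota-exceeded" | "other";

/** Classifies a 429 by its message text. */
function classify429(message: string): Kind429 {
  if (message.includes("You exceeded your current quota")) return "quota-exceeded";
  if (message.includes("Resource exhausted")) return "resource-exhausted";
  return "other";
}

/** Quota utilization as a percentage of the per-minute limit. */
function utilizationPercent(used: number, limit: number): number {
  return (used / limit) * 100;
}

// Figures from the quota dashboard mentioned above: 656 of 3,456,000.
const pct = utilizationPercent(656, 3456000);
console.log(pct.toFixed(2) + "%"); // prints "0.02%"
```

A "resource-exhausted" message combined with utilization this low is what points away from a per-project quota and towards shared capacity, as concluded above.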

Yesterday was a disaster for a while: all regions were exhausted. The global endpoint also returned the same error to me during retries on several occasions.

@Mahesh_Sutar I DM’d you with the requested information.

I’m also experiencing a sudden spike in 429 errors recently with simple generateContentStream requests. Could you please investigate this?