I’m using the Vertex AI API to send several generateContent requests to the default Gemini 2.5 Pro model in the europe-west1 region (I cannot switch to the global endpoint because of national laws covering the data I’m handling). Occasionally, at seemingly random times during the day, I receive a 429 Too Many Requests error. The specific message is:
I don’t believe I’m hitting any quota limits, as the issue occurs randomly, without any significant load or concurrent requests. Sometimes the 429 error appears several times in a row—four or five, or even more—and then suddenly everything goes back to normal.
I’m not sure whether this is a region-specific issue or something related to my account temporarily exhausting resources.
I have also implemented a retry mechanism with exponential backoff between attempts, up to four retries, but all of them still resulted in 429 errors.
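For readers comparing notes, here is a minimal sketch of the kind of retry loop described above: exponential backoff with full jitter and a cap. The names `backoffDelayMs` and `retryWithBackoff`, the 1 s base, and the 30 s cap are all assumptions for illustration; none of them come from the Vertex AI SDK or from the original poster's code.

```javascript
// Hypothetical helpers, not part of any Google SDK.
const defaultSleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Full jitter: a random delay in [0, min(capMs, baseMs * 2^attempt)).
function backoffDelayMs(attempt, baseMs = 1000, capMs = 30000, random = Math.random) {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.floor(random() * ceiling);
}

// Calls fn up to (retries + 1) times, sleeping between failed attempts.
async function retryWithBackoff(fn, options = {}) {
  const {
    retries = 4,               // four retries, as in the post above
    baseMs = 1000,             // assumed base delay
    capMs = 30000,             // assumed upper bound per wait
    sleep = defaultSleep,      // injectable so the loop can be tested without waiting
    isRetryable = () => true,  // caller decides which errors are worth retrying
  } = options;

  let lastError;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await fn(attempt);
    } catch (err) {
      lastError = err;
      if (attempt === retries || !isRetryable(err)) break;
      await sleep(backoffDelayMs(attempt, baseMs, capMs));
    }
  }
  throw lastError;
}

// Usage (sendRequest is a placeholder for your own generateContent call):
// const response = await retryWithBackoff(() => sendRequest(prompt));
```

Jitter only helps when many clients retry in lockstep. If a region stays exhausted for longer than the whole retry window, every attempt will still fail, which matches what is reported here.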
I don’t have much to contribute to this discussion, but it’s an old problem that Google has no intention of solving. Anyone who has a Gemini-based pay-as-you-go solution has the same problem. Google says switch to provisioning, it will be better, but I don’t know if that’s true. What I do is switch requests between all European centres to minimise the risk of 429. In case of an error, I repeat the query on another centre.
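The region-switching idea can be sketched roughly as follows. This is an illustration under assumptions, not the poster's actual code: `callWithRegionFailover`, `callRegion`, and `isCapacityError` are hypothetical names, and `callRegion` stands in for whatever function issues the request against a single region.

```javascript
// Hypothetical sketch: try each region in order, moving on only when the
// failure looks like a capacity problem (for example a 429).
async function callWithRegionFailover(regions, callRegion, isCapacityError) {
  const failures = [];
  for (const region of regions) {
    try {
      const result = await callRegion(region);
      return { region, result };
    } catch (err) {
      // Rethrow anything that is not a capacity error so real bugs are not masked.
      if (!isCapacityError(err)) throw err;
      failures.push({ region, message: err.message });
    }
  }
  const error = new Error(`All ${regions.length} regions returned capacity errors`);
  error.failures = failures; // keep per-region details for logging
  throw error;
}

// Usage (the region list is whatever your data-residency rules allow):
// const { region, result } = await callWithRegionFailover(
//   ['europe-west1', 'europe-west4'],
//   (r) => sendRequestToRegion(r, prompt),   // placeholder for your own call
//   (err) => String(err.message).includes('429')
// );
```

Keeping the region list explicit makes it easy to restrict failover to locations that satisfy legal constraints on the data.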
Thank you! I tried implementing this solution, but it doesn’t seem to solve the issue. Unfortunately, it just returns 429 in every region. Also, why is provisioning so expensive? It starts at $1,200/month?
2025-11-25 08:39:15 info: [34vtgis] Attempt 1/7 using region: europe-west1
2025-11-25 08:39:30 error: [34vtgis] Vertex AI error caught:
2025-11-25 08:39:30 error: [34vtgis] - Error name: ClientError
2025-11-25 08:39:30 error: [34vtgis] - Error message: [VertexAI.ClientError]: got status: 429 Too Many Requests. {"error":{"code":429,"message":"Resource exhausted. Please try again later. Please refer to
for more details.","status":"RESOURCE_EXHAUSTED"}}
2025-11-25 08:39:30 error: [34vtgis] - Error code: undefined
2025-11-25 08:39:30 error: [34vtgis] - Error status: undefined
2025-11-25 08:39:30 error: [34vtgis] - Stack trace:
ClientError: [VertexAI.ClientError]: got status: 429 Too Many Requests. {"error":{"code":429,"message":"Resource exhausted. Please try again later. Please refer to
for more details.","status":"RESOURCE_EXHAUSTED"}}
at throwErrorIfNotOK (/app/node_modules/@google-cloud/vertexai/build/src/functions/post_fetch_processing.js:32:27)
at process.processTicksAndRejections (node:internal/process/task_queues:105:5)
at async generateContent (/app/node_modules/@google-cloud/vertexai/build/src/functions/generate_content.js:59:5)
at async GeminiService.sendPromptWithFilesVertexAI (/app/dist/services/gemini.service.js:505:34)
at async GeminiService.sendPromptWithFiles (/app/dist/services/gemini.service.js:267:20)
at async ReportController.processDocuments (/app/dist/controllers/report.controller.js:884:40)
at async /app/dist/controllers/report.controller.js:56:38
2025-11-25 08:39:30 error: [34vtgis] - Error cause: {"code":429,"status":"RESOURCE_EXHAUSTED"}
2025-11-25 08:39:30 error: [34vtgis] - Full error object: {
"stack": "ClientError: [VertexAI.ClientError]: got status: 429 Too Many Requests. {\"error\":{\"code\":429,\"message\":\"Resource exhausted. Please try again later. Please refer to
for more details.\",\"status\":\"RESOURCE_EXHAUSTED\"}}\n at throwErrorIfNotOK (/app/node_modules/@google-cloud/vertexai/build/src/functions/post_fetch_processing.js:32:27)\n at process.processTicksAndRejections (node:internal/process/task_queues:105:5)\n at async generateContent (/app/node_modules/@google-cloud/vertexai/build/src/functions/generate_content.js:59:5)\n at async GeminiService.sendPromptWithFilesVertexAI (/app/dist/services/gemini.service.js:505:34)\n at async GeminiService.sendPromptWithFiles (/app/dist/services/gemini.service.js:267:20)\n at async ReportController.processDocuments (/app/dist/controllers/report.controller.js:884:40)\n at async /app/dist/controllers/report.controller.js:56:38",
"message": "[VertexAI.ClientError]: got status: 429 Too Many Requests. {\"error\":{\"code\":429,\"message\":\"Resource exhausted. Please try again later. Please refer to
for more details.\",\"status\":\"RESOURCE_EXHAUSTED\"}}",
"cause": {
"stack": "Error: Resource exhausted. Please try again later. Please refer to
for more details.\n at throwErrorIfNotOK (/app/node_modules/@google-cloud/vertexai/build/src/functions/post_fetch_processing.js:32:66)\n at process.processTicksAndRejections (node:internal/process/task_queues:105:5)\n at async generateContent (/app/node_modules/@google-cloud/vertexai/build/src/functions/generate_content.js:59:5)\n at async GeminiService.sendPromptWithFilesVertexAI (/app/dist/services/gemini.service.js:505:34)\n at async GeminiService.sendPromptWithFiles (/app/dist/services/gemini.service.js:267:20)\n at async ReportController.processDocuments (/app/dist/controllers/report.controller.js:884:40)\n at async /app/dist/controllers/report.controller.js:56:38",
"message": "Resource exhausted. Please try again later. Please refer to
for more details.",
"name": "Error"
},
"name": "ClientError",
"stackTrace": {
"stack": "Error: Resource exhausted. Please try again later. Please refer to
for more details.\n at throwErrorIfNotOK (/app/node_modules/@google-cloud/vertexai/build/src/functions/post_fetch_processing.js:32:66)\n at process.processTicksAndRejections (node:internal/process/task_queues:105:5)\n at async generateContent (/app/node_modules/@google-cloud/vertexai/build/src/functions/generate_content.js:59:5)\n at async GeminiService.sendPromptWithFilesVertexAI (/app/dist/services/gemini.service.js:505:34)\n at async GeminiService.sendPromptWithFiles (/app/dist/services/gemini.service.js:267:20)\n at async ReportController.processDocuments (/app/dist/controllers/report.controller.js:884:40)\n at async /app/dist/controllers/report.controller.js:56:38",
"message": "Resource exhausted. Please try again later. Please refer to
for more details.",
"name": "Error"
}
}
2025-11-25 08:39:30 warn: [34vtgis] Vertex AI overloaded in europe-west1 (attempt 1/7). Switching to europe-west4 in 20s...
2025-11-25 08:39:50 info: [34vtgis] Attempt 2/7 using region: europe-west4
2025-11-25 08:40:04 error: [34vtgis] Vertex AI error caught:
2025-11-25 08:40:04 error: [34vtgis] - Error name: ClientError
2025-11-25 08:40:04 error: [34vtgis] - Error message: [VertexAI.ClientError]: got status: 429 Too Many Requests. {"error":{"code":429,"message":"Resource exhausted. Please try again later. Please refer to
for more details.","status":"RESOURCE_EXHAUSTED"}}
And so on for every region that serves Gemini 2.5 Pro. I wonder whether implicit caching is involved, so that the error is returned immediately.
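One detail worth noting in the log above: `Error code` and `Error status` are both `undefined` on the thrown `ClientError`, while the 429 and `RESOURCE_EXHAUSTED` values show up under `cause` and inside the message text. A retry wrapper that only checks `err.code` would therefore never recognise this error. The helper below is a hedged sketch based solely on the shape visible in this log; `isResourceExhausted` is a hypothetical name, and other SDK versions may attach these fields differently.

```javascript
// Hypothetical classifier based on the error shape shown in the log above.
function isResourceExhausted(err) {
  if (!err) return false;
  const cause = err.cause ?? {};

  // Structured fields, if present on the error or on its cause.
  if (err.code === 429 || cause.code === 429) return true;
  if (err.status === 'RESOURCE_EXHAUSTED' || cause.status === 'RESOURCE_EXHAUSTED') return true;

  // Fallback: the log shows the HTTP status embedded in the message string.
  const message = String(err.message ?? '');
  return /\b429\b/.test(message) || message.includes('RESOURCE_EXHAUSTED');
}

// Example with an object shaped like the logged error:
const logged = {
  name: 'ClientError',
  message: '[VertexAI.ClientError]: got status: 429 Too Many Requests.',
  cause: { code: 429, status: 'RESOURCE_EXHAUSTED' },
};
console.log(isResourceExhausted(logged));            // true
console.log(isResourceExhausted(new Error('boom'))); // false
```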
Hi guys! I’ve been having similar problems. This is the post. I don’t know whether you are sending too many requests from the same IP (maybe more than one PC on the same LAN or VPN), but I have a feeling that it might be some kind of temporary IP blocking triggered by anti-abuse countermeasures.
I have the Google AI Pro monthly membership. The n8n workflow is no longer working due to this error. It’s used to generate prompts, images, and videos.
In my original post, you can find all the answers to these questions. I sort of worked around the issue by implementing a retry mechanism across different regions, so I assume the error occurs when a specific region is temporarily exhausted. However, the error is still quite annoying, and I continue to experience it.
I’m using the Vertex AI SDK to process a few PDFs in a specific format. Paid tier.
Billing Tier: Paid (billing account in good standing)
Model Name & Region: gemini-2.5-flash in us-central1. Also tested us-east1 - same 429 error. Switching to global endpoint resolved the issue.
Platform: Vertex AI SDK (@google/genai npm package v1.x with vertexai: true)
Task Description: Audio transcription application - transcribing audio dictations (20-60 second WebM audio files) into structured outputs.
Additional Context:
Was working fine earlier in the day
The 429 errors persisted for 15+ minutes despite exponential backoff retry logic (5 attempts, 2s base delay)
GCP Console showed only 0.02% quota utilization on “Generate content requests with audio input per minute” (656/3,456,000)
Simple text-only requests (no audio) also returned 429 in us-central1
No incidents reported during the timeframe
Switching from us-central1 to global endpoint immediately resolved the issue
Occurred on January 20, 2026 between approximately 13:30-14:30 PST
This appears to be a regional capacity issue rather than a quota issue, as the quota dashboard showed minimal usage while requests were consistently rejected.
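The figures in this report can be checked with a little arithmetic. The sketch below assumes the backoff doubles on each attempt and waits only between attempts (the report states only "5 attempts, 2s base delay", so the doubling factor is an assumption); both function names are hypothetical.

```javascript
// Share of the per-minute quota in use, as a percentage.
function quotaUtilizationPercent(used, limit) {
  return (used / limit) * 100;
}

// Total time spent waiting across all retries, assuming the delay is
// multiplied by `factor` each time and there is no wait after the last attempt.
function totalBackoffSeconds(attempts, baseSeconds, factor = 2) {
  let total = 0;
  for (let i = 0; i < attempts - 1; i++) {
    total += baseSeconds * factor ** i;
  }
  return total;
}

console.log(quotaUtilizationPercent(656, 3456000).toFixed(2)); // "0.02"
console.log(totalBackoffSeconds(5, 2));                        // 30 (2 + 4 + 8 + 16)
```

Under these assumptions the whole retry sequence spans about 30 seconds of waiting, far shorter than the 15+ minutes the errors lasted, so backoff alone could not have ridden out the incident. The 0.02% utilization figure also supports the reading that quota was not the limiting factor.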
Yesterday was a disaster for a while: all regions were exhausted. The global endpoint also returned the same error to me on several occasions during retries.
@Mahesh_Sutar I DM’d you with the requested information.