429 Too Many Requests on Vertex AI API generateContent (Gemini 2.5 Pro)

frmusso · October 31, 2025, 2:17pm

I’m using the Vertex AI API to send several generateContent requests to the Gemini 2.5 Pro default model in the europe-west1 region (cannot switch to global region as per country laws on the data I’m handling). Occasionally, at seemingly random times during the day, I receive a 429 Too Many Requests error. The specific message is:

{"error":{"code":429,"message":"Resource exhausted. Please try again later. Please refer to https://cloud.google.com/vertex-ai/generative-ai/docs/error-code-429 for more details.","status":"RESOURCE_EXHAUSTED"}}

I don’t believe I’m hitting any quota limits, as the issue occurs randomly, without any significant load or concurrent requests. Sometimes the 429 error appears several times in a row—four or five, or even more—and then suddenly everything goes back to normal.
I’m not sure whether this is a region-specific issue or something related to my account temporarily exhausting resources.

I have also implemented a retry mechanism with exponential backoff between attempts, up to four retries, but all of them still resulted in 429 errors.
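
For reference, a retry loop of the kind described above might look like the following. This is a minimal sketch, not the poster's code: `backoffDelayMs`, `withBackoff`, the `shouldRetry` callback, and the default delays are all names and values assumed for illustration. It adds random jitter to each delay so that concurrent clients do not retry in lockstep.

```typescript
// Delay before retry number `retry` (0-based), using "full jitter":
// a random value in [0, min(capMs, baseMs * 2^retry)).
// `random` is injectable so the function can be tested deterministically.
function backoffDelayMs(
  retry: number,
  baseMs: number,
  capMs: number,
  random: () => number = Math.random,
): number {
  const ceiling = Math.min(capMs, baseMs * 2 ** retry);
  return Math.floor(random() * ceiling);
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Runs `operation`, retrying up to `maxRetries` times while `shouldRetry(err)`
// returns true. The defaults are illustrative assumptions, not recommendations.
async function withBackoff<T>(
  operation: () => Promise<T>,
  shouldRetry: (err: unknown) => boolean,
  maxRetries: number = 4,
  baseMs: number = 1000,
  capMs: number = 60000,
): Promise<T> {
  for (let retry = 0; ; retry++) {
    try {
      return await operation();
    } catch (err) {
      // Give up on non-retryable errors or once the retry budget is spent.
      if (retry >= maxRetries || !shouldRetry(err)) throw err;
      await sleep(backoffDelayMs(retry, baseMs, capMs));
    }
  }
}
```

As the post notes, backoff alone does not help when the 429s outlast the whole retry window.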

paulvancotthem · October 31, 2025, 2:42pm

I have had the same error, and in my case it turned out to be caused by a malformed prompt payload sent to the API.

(Your mileage may vary).

Piotr_Gajda · November 6, 2025, 8:22am

I don’t have much to contribute to this discussion, but it’s an old problem that Google has no intention of solving. Anyone who has a Gemini-based pay-as-you-go solution has the same problem. Google says switch to provisioning, it will be better, but I don’t know if that’s true. What I do is switch requests between all European centres to minimise the risk of 429. In case of an error, I repeat the query on another centre.
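
The "repeat the query on another centre" approach could be sketched roughly as follows. This is an illustration under assumptions, not the poster's implementation: `regionForAttempt` and `callWithRegionFailover` are hypothetical names, and `call` stands for whatever wrapper you write around a client configured for a given location.

```typescript
// Picks the region for a 0-based attempt number, cycling through the list.
function regionForAttempt(regions: readonly string[], attempt: number): string {
  const region = regions[attempt % regions.length];
  if (region === undefined) throw new Error("at least one region is required");
  return region;
}

// Tries `call(region)` up to `maxAttempts` times, moving to the next region
// whenever `isRetryable(err)` says the failure is worth retrying elsewhere.
async function callWithRegionFailover<T>(
  regions: readonly string[],
  call: (region: string) => Promise<T>,
  isRetryable: (err: unknown) => boolean,
  maxAttempts: number = regions.length,
): Promise<T> {
  let lastError: unknown = new Error("no attempts were made");
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await call(regionForAttempt(regions, attempt));
    } catch (err) {
      if (!isRetryable(err)) throw err; // e.g. a 400 will fail everywhere
      lastError = err;
    }
  }
  // Every attempt failed with a retryable error; surface the last one.
  throw lastError;
}
```

A region list for this would contain only locations that both serve the model and satisfy your data-residency constraints, for example `["europe-west1", "europe-west4"]`.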

frmusso · November 24, 2025, 3:54pm

Thank you! I tried implementing this solution, but it doesn’t seem to solve the issue. It just returns 429 in every region, unfortunately. Also, why is provisioning so expensive? Starting from $1200/month?

Piotr_Gajda · November 25, 2025, 7:02am

It should not. Check the list of locations. When you get a 429 in one location, switch the request to another. It works for me.

frmusso · November 25, 2025, 8:48am

I switch regions, and these are my logs:

2025-11-25 08:39:15 info: [34vtgis] Attempt 1/7 using region: europe-west1

    at async GeminiService.sendPromptWithFilesVertexAI (/app/dist/services/gemini.service.js:505:34)

2025-11-25 08:39:30 error: [34vtgis] Vertex AI error caught:

at async GeminiService.sendPromptWithFiles (/app/dist/services/gemini.service.js:267:20)

2025-11-25 08:39:30 error: [34vtgis]   - Error name: ClientError

2025-11-25 08:39:30 error: [34vtgis]   - Error message: [VertexAI.ClientError]: got status: 429 Too Many Requests. {"error":{"code":429,"message":"Resource exhausted. Please try again later. Please refer to https://cloud.google.com/vertex-ai/generative-ai/docs/error-code-429 for more details.","status":"RESOURCE_EXHAUSTED"}}

2025-11-25 08:39:30 error: [34vtgis]   - Error code: undefined

2025-11-25 08:39:30 error: [34vtgis]   - Error status: undefined

2025-11-25 08:39:30 error: [34vtgis]   - Stack trace:

ClientError: [VertexAI.ClientError]: got status: 429 Too Many Requests. {"error":{"code":429,"message":"Resource exhausted. Please try again later. Please refer to https://cloud.google.com/vertex-ai/generative-ai/docs/error-code-429 for more details.","status":"RESOURCE_EXHAUSTED"}}

at throwErrorIfNotOK (/app/node_modules/@google-cloud/vertexai/build/src/functions/post_fetch_processing.js:32:27)

at process.processTicksAndRejections (node:internal/process/task_queues:105:5)

at async generateContent (/app/node_modules/@google-cloud/vertexai/build/src/functions/generate_content.js:59:5)

"stack": "Error: Resource exhausted. Please try again later. Please refer to https://cloud.google.com/vertex-ai/generative-ai/docs/error-code-429 for more details.\n    at throwErrorIfNotOK (/app/node_modules/@google-cloud/vertexai/build/src/functions/post_fetch_processing.js:32:66)\n    at process.processTicksAndRejections (node:internal/process/task_queues:105:5)\n    at async generateContent (/app/node_modules/@google-cloud/vertexai/build/src/functions/generate_content.js:59:5)\n    at async GeminiService.sendPromptWithFilesVertexAI (/app/dist/services/gemini.service.js:505:34)\n    at async GeminiService.sendPromptWithFiles (/app/dist/services/gemini.service.js:267:20)\n    at async ReportController.processDocuments (/app/dist/controllers/report.controller.js:884:40)\n    at async /app/dist/controllers/report.controller.js:56:38",

at async ReportController.processDocuments (/app/dist/controllers/report.controller.js:884:40)

at async /app/dist/controllers/report.controller.js:56:38

2025-11-25 08:39:30 error: [34vtgis]   - Error cause: {"code":429,"status":"RESOURCE_EXHAUSTED"}

2025-11-25 08:39:30 error: [34vtgis]   - Full error object: {

"stack": "ClientError: [VertexAI.ClientError]: got status: 429 Too Many Requests. {\"error\":{\"code\":429,\"message\":\"Resource exhausted. Please try again later. Please refer to https://cloud.google.com/vertex-ai/generative-ai/docs/error-code-429 for more details.\",\"status\":\"RESOURCE_EXHAUSTED\"}}\n    at throwErrorIfNotOK (/app/node_modules/@google-cloud/vertexai/build/src/functions/post_fetch_processing.js:32:27)\n    at process.processTicksAndRejections (node:internal/process/task_queues:105:5)\n    at async generateContent (/app/node_modules/@google-cloud/vertexai/build/src/functions/generate_content.js:59:5)\n    at async GeminiService.sendPromptWithFilesVertexAI (/app/dist/services/gemini.service.js:505:34)\n    at async GeminiService.sendPromptWithFiles (/app/dist/services/gemini.service.js:267:20)\n    at async ReportController.processDocuments (/app/dist/controllers/report.controller.js:884:40)\n    at async /app/dist/controllers/report.controller.js:56:38",

"message": "[VertexAI.ClientError]: got status: 429 Too Many Requests. {\"error\":{\"code\":429,\"message\":\"Resource exhausted. Please try again later. Please refer to https://cloud.google.com/vertex-ai/generative-ai/docs/error-code-429 for more details.\",\"status\":\"RESOURCE_EXHAUSTED\"}}",

"cause": {

"message": "Resource exhausted. Please try again later. Please refer to https://cloud.google.com/vertex-ai/generative-ai/docs/error-code-429 for more details.",

"name": "Error"

  },

"name": "ClientError",

"stackTrace": {

"stack": "Error: Resource exhausted. Please try again later. Please refer to https://cloud.google.com/vertex-ai/generative-ai/docs/error-code-429 for more details.\n    at throwErrorIfNotOK (/app/node_modules/@google-cloud/vertexai/build/src/functions/post_fetch_processing.js:32:66)\n    at process.processTicksAndRejections (node:internal/process/task_queues:105:5)\n    at async generateContent (/app/node_modules/@google-cloud/vertexai/build/src/functions/generate_content.js:59:5)\n    at async GeminiService.sendPromptWithFilesVertexAI (/app/dist/services/gemini.service.js:505:34)\n    at async GeminiService.sendPromptWithFiles (/app/dist/services/gemini.service.js:267:20)\n    at async ReportController.processDocuments (/app/dist/controllers/report.controller.js:884:40)\n    at async /app/dist/controllers/report.controller.js:56:38",

"message": "Resource exhausted. Please try again later. Please refer to https://cloud.google.com/vertex-ai/generative-ai/docs/error-code-429 for more details.",

"name": "Error"

  }

}

2025-11-25 08:39:30 warn: [34vtgis] Vertex AI overloaded in europe-west1 (attempt 1/7). Switching to europe-west4 in 20s...

2025-11-25 08:39:50 info: [34vtgis] Attempt 2/7 using region: europe-west4

2025-11-25 08:40:04 error: [34vtgis] Vertex AI error caught:

2025-11-25 08:40:04 error: [34vtgis]   - Error name: ClientError

2025-11-25 08:40:04 error: [34vtgis]   - Error message: [VertexAI.ClientError]: got status: 429 Too Many Requests. {"error":{"code":429,"message":"Resource exhausted. Please try again later. Please refer to https://cloud.google.com/vertex-ai/generative-ai/docs/error-code-429 for more details.","status":"RESOURCE_EXHAUSTED"}}

And so on for all the regions that serve Gemini 2.5 Pro. I don’t know whether the implicit cache is being hit, so that the error is returned immediately.
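
One detail in the log above is that `Error code` and `Error status` are both `undefined` on the thrown `ClientError`, while the 429 shows up in the message text and in the logged `Error cause`. A retry layer that checks only `err.code` would therefore never match. The helper below is a hedged sketch based purely on the field layout visible in this log, not on any documented SDK contract; `isResourceExhausted` is a hypothetical name.

```typescript
// Returns true when `err` looks like the 429 / RESOURCE_EXHAUSTED failure
// logged above. The field names are inferred from the log output only.
function isResourceExhausted(err: unknown): boolean {
  if (typeof err !== "object" || err === null) return false;
  const e = err as { message?: unknown; cause?: unknown };

  // 1. Structured fields on `cause`, as in the logged "Error cause" line.
  if (typeof e.cause === "object" && e.cause !== null) {
    const cause = e.cause as { code?: unknown; status?: unknown };
    if (cause.code === 429 || cause.status === "RESOURCE_EXHAUSTED") return true;
  }

  // 2. Fallback: the status embedded in the message text.
  return (
    typeof e.message === "string" &&
    /got status: 429\b|RESOURCE_EXHAUSTED/.test(e.message)
  );
}
```

A predicate like this can serve as the "is this worth retrying?" check in a backoff or region-switching loop.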

Piotr_Gajda · November 25, 2025, 10:35am

Can you check if you get the same with Gemini 2.5 Flash? My solution uses this model.

Cristobal · November 25, 2025, 12:23pm

Hi guys! I’ve been having similar problems. This is the post. I don’t know if you are sending too many requests from the same IP (maybe more than one PC on the same LAN or VPN), but I have a feeling that it might be some kind of temporary IP blocking due to abuse countermeasures.

Mahesh_Sutar · December 29, 2025, 8:22am

Hello @frmusso, @paulvancotthem, @Piotr_Gajda, @Cristobal

I understand your frustration with this issue. Could you please provide the following details to help us investigate?

Complete Error Message: The full JSON response.
Billing Tier: (e.g., Free, Paid, etc.).
Model Name & Region: (e.g., gemini-2.5-pro in europe-west1, etc.).
Platform: (AI Studio, Vertex AI SDK, or REST).
Task Description: Briefly, what are you using the API for? (e.g., "summarizing large PDFs", etc.).

Also, please let me know whether you are still facing this issue.

Thanks!

Never_Finished_Never · January 8, 2026, 1:16am

January 7, 2026: I’m also getting the same 429 error.

{ "error": { "code": 429, "message": "You exceeded your current quota, please check your plan and billing details. For more information on this error, head to: https://ai.google.dev/gemini-api/docs/rate-limits. To monitor your current usage, head to: https://ai.dev/usage?tab=rate-limit. ", "status": "RESOURCE_EXHAUSTED" } }

I have the Google AI Pro monthly membership. The n8n workflow is no longer working due to this error. It’s used to generate prompts, images, and videos.

Mahesh_Sutar · January 8, 2026, 9:57am

Hello @Never_Finished_Never,

Please check your usage and billing on the AI Studio dashboard. If you have not exhausted your quota, please share the details listed below:

Complete Error Message: The full JSON response.
Billing Tier: (e.g., Free, Paid, etc.).
Model Name & Region: (e.g., gemini-2.5-pro in europe-west1, etc.).
Platform: (AI Studio, Vertex AI SDK, or REST).
Task Description: Briefly, what are you using the API for? (e.g., "summarizing large PDFs", etc.).
Project Number: Please send the project number (not the project ID) via direct message.

frmusso · January 8, 2026, 10:14am

In my original post, you can find all the answers to these questions. I sort of worked around the issue by implementing a retry mechanism across different regions, so I assume the error occurs when a specific region is temporarily exhausted. However, the error is still quite annoying, and I continue to experience it.

I’m using the Vertex AI SDK to process a few PDFs in a specific format. Paid tier.

Mahesh_Sutar · January 8, 2026, 2:25pm

Hello @frmusso,

Could you please DM me your project number (not the project ID)?

Brian_Pridgen · January 20, 2026, 10:38pm

@Mahesh_Sutar, I am having the exact same issue. Here is the information you requested from another user:

Complete Error Message:
{"error":{"code":429,"message":"Resource exhausted. Please try again later. Please refer to https://cloud.google.com/vertex-ai/generative-ai/docs/error-code-429 for more details.","status":"RESOURCE_EXHAUSTED"}}

Billing Tier: Paid (billing account in good standing)

Model Name & Region: gemini-2.5-flash in us-central1. Also tested us-east1 - same 429 error. Switching to global endpoint resolved the issue.

Platform: Vertex AI SDK (@google/genai npm package v1.x with vertexai: true)

Task Description: Audio transcription application - transcribing audio dictations (20-60 second WebM audio files) into structured outputs.

Additional Context:

Was working fine earlier in the day
The 429 errors persisted for 15+ minutes despite exponential backoff retry logic (5 attempts, 2s base delay)
GCP Console showed only 0.02% quota utilization on “Generate content requests with audio input per minute” (656/3,456,000)
Simple text-only requests (no audio) also returned 429 in us-central1
No incidents reported during the timeframe
Switching from us-central1 to global endpoint immediately resolved the issue
Occurred on January 20, 2026 between approximately 13:30-14:30 PST

This appears to be a regional capacity issue rather than a quota issue, as the quota dashboard showed minimal usage while requests were consistently rejected.
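
To put two of the figures above in perspective, here is a small worked calculation. It assumes the delays simply double from the 2 s base with no jitter or cap, and that a wait happens only between attempts; the post does not spell out those details, and both function names are hypothetical.

```typescript
// Total time spent waiting when delays double from `baseMs`, with one wait
// between consecutive attempts (so `attempts - 1` waits) and no jitter or cap.
function totalBackoffWaitMs(attempts: number, baseMs: number): number {
  let total = 0;
  for (let i = 0; i < attempts - 1; i++) total += baseMs * 2 ** i;
  return total;
}

// Quota utilization as a percentage of the limit.
function utilizationPercent(used: number, limit: number): number {
  return (used / limit) * 100;
}

// 5 attempts with a 2 s base: 2 + 4 + 8 + 16 seconds of waiting.
console.log(totalBackoffWaitMs(5, 2000) / 1000); // 30

// 656 requests against a limit of 3,456,000 per minute.
console.log(utilizationPercent(656, 3456000).toFixed(2)); // 0.02
```

Under those assumptions the whole retry sequence waits about 30 seconds in total, far shorter than the 15+ minutes of rejections reported, so exhausting the retries says little about quota; and the utilization figure matches the roughly 0.02% shown in the console.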

frmusso · January 21, 2026, 3:48pm

Yesterday was a disaster for a while; all regions were exhausted. The global endpoint also returned the same error to me during retries on several occasions.

@Mahesh_Sutar I DM’d you with the requested information.

yj_s · January 22, 2026, 5:27am

I’m also experiencing a sudden spike in 429 errors recently with simple generateContentStream requests. Could you please investigate this?

frmusso · January 23, 2026, 3:23pm

We’ve been experiencing critical issues with the Vertex AI API throughout the entire day today, and this is having serious consequences for our production environment.

Multiple user complaints received
One client has already terminated their contract with us due to service unavailability
Ongoing trust issues with remaining customers

Google markets this as production-ready software, yet we’re unable to build reliable production systems on top of it. It’s been months. I’ve also contacted Google technical support, who were unable to offer any practical solution other than buying Provisioned Throughput, which is way out of our reach.

frmusso · January 27, 2026, 2:49pm

Regarding the recent Known Issue that appeared on the Google Cloud Support page: I noticed the alert has disappeared, which I guess implies the incident is considered resolved.

However, we are still seeing a high error rate. It is slightly better than before (failing around 80% of the time instead of 100%), but the issue is definitely not resolved on our end. Can you please investigate further?
