I’m using the Vertex AI API to send several generateContent requests to the default Gemini 2.5 Pro model in the europe-west1 region (I cannot switch to the global region because of my country’s laws on the data I’m handling). Occasionally, at seemingly random times during the day, I receive a 429 Too Many Requests error. The specific message is:
`{"error":{"code":429,"message":"Resource exhausted. Please try again later. Please refer to https://cloud.google.com/vertex-ai/generative-ai/docs/error-code-429 for more details.","status":"RESOURCE_EXHAUSTED"}}`
I don’t believe I’m hitting any quota limits, as the issue occurs randomly, without any significant load or concurrent requests. Sometimes the 429 error appears several times in a row (four, five, or even more), and then suddenly everything goes back to normal.
I’m not sure whether this is a region-specific issue or something related to my account temporarily exhausting resources.
I have also implemented a retry mechanism with exponential backoff between attempts, up to four retries, but all of them still resulted in 429 errors.
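
The retry behaviour described above can be sketched roughly as follows. This is a simplified illustration, not the exact code in use: `call_with_backoff`, `ResourceExhaustedError`, `send_request`, and the delay values are all placeholder assumptions, and the real Vertex AI client raises its own exception type for 429 responses.

```python
import time


class ResourceExhaustedError(Exception):
    """Hypothetical stand-in for the client's 429 RESOURCE_EXHAUSTED error."""


def call_with_backoff(send_request, max_retries=4, base_delay=1.0,
                      max_delay=30.0, sleep=time.sleep):
    """Call send_request(), retrying on 429 with exponential backoff.

    send_request is any zero-argument callable that performs one
    generateContent call (assumed name, for illustration only).
    """
    for attempt in range(max_retries + 1):
        try:
            return send_request()
        except ResourceExhaustedError:
            # Give up once the initial attempt plus max_retries have failed.
            if attempt == max_retries:
                raise
            # Delay doubles each time: base, 2*base, 4*base, ... capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay)
```

With these assumed defaults, the four retries wait 1, 2, 4, and 8 seconds, so the whole sequence spans about 15 seconds. If the 429 burst lasts longer than that window, every attempt fails, which matches what I am seeing.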