429 Too Many Requests on Vertex AI API generateContent (Gemini 2.5 Pro)

I’m using the Vertex AI API to send several generateContent requests to the default Gemini 2.5 Pro model in the europe-west1 region (I cannot switch to the global region because of national laws covering the data I’m handling). Occasionally, at seemingly random times during the day, I receive a 429 Too Many Requests error. The specific message is:

{"error":{"code":429,"message":"Resource exhausted. Please try again later. Please refer to https://cloud.google.com/vertex-ai/generative-ai/docs/error-code-429 for more details.","status":"RESOURCE_EXHAUSTED"}}

I don’t believe I’m hitting any quota limits, as the issue occurs randomly, without any significant load or concurrent requests. Sometimes the 429 error appears several times in a row (four or five, or even more), and then suddenly everything goes back to normal.
I’m not sure whether this is a region-specific issue or something related to my account temporarily exhausting resources.

I have also implemented a retry mechanism with exponential backoff between attempts, up to four retries, but all of them still resulted in 429 errors.
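
For reference, the retry logic described above looks roughly like the sketch below. It is a simplified illustration, not my production code: `ResourceExhausted` is a hypothetical stand-in for whatever exception your client raises on a 429, `send` is any zero-argument callable that performs the generateContent request, and the delay values are assumptions. It uses full jitter so that retries do not all land at the same moment.

```python
import random
import time


class ResourceExhausted(Exception):
    """Hypothetical stand-in for the error your client raises on HTTP 429."""


def call_with_backoff(send, max_retries=4, base_delay=1.0, max_delay=60.0,
                      sleep=time.sleep, rng=random.random):
    """Call send(); on ResourceExhausted, wait and retry up to max_retries times.

    send  -- zero-argument callable that performs the request (assumed)
    sleep -- injectable so the function can be tested without real waiting
    rng   -- returns a float in [0, 1); used for jitter
    """
    for attempt in range(max_retries + 1):
        try:
            return send()
        except ResourceExhausted:
            if attempt == max_retries:
                # Out of retries: surface the last 429 to the caller.
                raise
            # Exponential cap: base, 2*base, 4*base, ... bounded by max_delay.
            cap = min(max_delay, base_delay * (2 ** attempt))
            # Full jitter: wait a random fraction of the cap.
            sleep(cap * rng())
```

With the defaults this waits at most about 1 + 2 + 4 + 8 = 15 seconds in total, so a burst of 429s lasting longer than that will exhaust all four retries, which may be what I am seeing.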

I have had the same error, and in my case I found it was caused by a malformed prompt payload sent to the API.

(Your mileage may vary).
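
If you want to rule that out quickly, a rough pre-flight check along these lines might help. This is only a sketch under assumptions: it assumes the request body is a dict with a `contents` list whose items carry `parts` (and optionally a `role` of `user` or `model`), and the helper name `find_payload_problems` is made up for illustration. It does not cover every field the API accepts, and it is not necessarily the exact problem I had.

```python
def find_payload_problems(body):
    """Return a list of human-readable problems found in a request body.

    Checks only the basic shape: contents -> parts -> text.
    An empty list means nothing obviously wrong was found.
    """
    contents = body.get("contents")
    if not isinstance(contents, list) or not contents:
        return ["'contents' must be a non-empty list"]

    problems = []
    for i, item in enumerate(contents):
        if not isinstance(item, dict):
            problems.append(f"contents[{i}] must be an object")
            continue
        # Only flag a role that is present but not one of the two known values.
        role = item.get("role")
        if role is not None and role not in ("user", "model"):
            problems.append(f"contents[{i}].role is {role!r}, expected 'user' or 'model'")
        parts = item.get("parts")
        if not isinstance(parts, list) or not parts:
            problems.append(f"contents[{i}].parts must be a non-empty list")
            continue
        for j, part in enumerate(parts):
            if not isinstance(part, dict) or not part:
                problems.append(f"contents[{i}].parts[{j}] must be a non-empty object")
            elif "text" in part and (not isinstance(part["text"], str)
                                     or not part["text"].strip()):
                problems.append(f"contents[{i}].parts[{j}].text is empty or not a string")
    return problems
```

Running it on the body just before sending, and logging anything it returns alongside the 429, should at least show whether the failures line up with odd payloads.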