429 errors despite waiting after `retryDelay`

KobaDev · August 6, 2025, 5:48am

Hello,

I am trying to handle retries for requests to the Gemini API using the tenacity library. However, I’m running into an issue with 429 (rate limit) errors in particular.

My current retry logic is intended to:

1. Parse the retryDelay value from the APIError details.
2. Add 5 seconds to this retryDelay.
3. Combine this with an exponential backoff strategy (wait_exponential).

The waiting logic appears to be functioning correctly, but the 429 errors persist despite multiple retries.

Here is a snippet of my code:

# ... (imports and other setup)

def _is_retriable(e: BaseException) -> bool:
    return isinstance(e, APIError) and e.code in [503, 429]

def _calc_retry_delay(exception: BaseException | None) -> float:
    # ... (logic to parse retryDelay and add 5 seconds)
    if isinstance(exception, APIError) and exception.code == 429:
        try:
            retry_delay = parse(
                [
                    rd
                    for d in exception.details["error"]["details"]
                    if (rd := d.get("retryDelay"))
                ][0]
            )
            if retry_delay:
                return retry_delay + 5
        except (IndexError, KeyError):
            return 60
    return 60

class wait_from_exception(wait_base):
    def __call__(self, retry_state: "RetryCallState") -> float:
        if retry_state.outcome is None:
            return 0
        exception = retry_state.outcome.exception()
        return _calc_retry_delay(exception)

@retry(
    wait=wait_combine(wait_from_exception(), wait_exponential(multiplier=2, min=10, max=300)),
    retry=retry_if_exception(_is_retriable),
    stop=stop_after_attempt(10),
)
async def make_summary(issue_id: str, db: Session):
    ...  # (API call)

What could be the reason for this? Am I misunderstanding how to correctly handle the retryDelay from the API response? Any insights or suggestions would be greatly appreciated.

Thank you.

Pannaga_J · August 6, 2025, 6:05am

Hi @KobaDev
A 429 error means you are sending too many requests per minute on the free tier of the Gemini API.
Verify that you’re within the model’s rate limit (see Rate limits | Gemini API | Google AI for Developers) and, if needed, request a quota increase.

You can also check your quota limit like this:
Go to the GCP console and click “APIs & Services”. Under Metric, search for and select “Generative Language API”. Under the “Quotas & System Limits” tab, check the “Current Usage percentage”.

If it reaches 100%, then you have reached your quota limits and hence the 429 Error.

KobaDev · August 6, 2025, 7:54am

Hi @Pannaga_J

I received a 429 response body like this:

{
   "error":{
      "code":429,
      "message":"You exceeded your current quota, please check your plan and billing details. For more information on this error, head to: https://ai.google.dev/gemini-api/docs/rate-limits.",
      "status":"RESOURCE_EXHAUSTED",
      "details":[
         {
            "@type":"type.googleapis.com/google.rpc.QuotaFailure",
            "violations":[
               {
                  "quotaMetric":"generativelanguage.googleapis.com/generate_content_free_tier_requests",
                  "quotaId":"GenerateRequestsPerDayPerProjectPerModel-FreeTier",
                  "quotaDimensions":{
                     "model":"gemini-2.5-flash",
                     "location":"global"
                  },
                  "quotaValue":"250"
               }
            ]
         },
         {
            "@type":"type.googleapis.com/google.rpc.Help",
            "links":[
               {
                  "description":"Learn more about Gemini API quotas",
                  "url":"https://ai.google.dev/gemini-api/docs/rate-limits"
               }
            ]
         },
         {
            "@type":"type.googleapis.com/google.rpc.RetryInfo",
            "retryDelay":"1s"
         }
      ]
   }
}

Do you mean this retryDelay is not useful for free-tier users?

Pannaga_J · August 6, 2025, 9:14am

The error message you provided indicates a quota exhaustion error, not a temporary rate-limiting error, although both result in a 429 HTTP status code. It is telling you that you have exceeded a specific daily quota, GenerateRequestsPerDayPerProjectPerModel-FreeTier, which has a limit of 250 requests. See Rate limits | Gemini API | Google AI for Developers.
That’s why retryDelay is not working for your use case.
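
To put rough numbers on this, here is a small sketch of how long the retry policy from the first post would wait in total. It assumes that tenacity's wait_exponential computes multiplier * 2**(attempt - 1) clamped to [min, max] and that wait_combine adds its parts together; please verify both against your installed version. The helper names are hypothetical and are not part of tenacity.

```python
def exponential_wait(attempt: int, multiplier: float = 2,
                     min_wait: float = 10, max_wait: float = 300) -> float:
    """Assumed formula: multiplier * 2**(attempt - 1), clamped to [min_wait, max_wait]."""
    return max(min_wait, min(multiplier * 2 ** (attempt - 1), max_wait))


def total_wait(retry_delay: float, attempts: int = 10, padding: float = 5) -> float:
    """Total seconds slept across `attempts` tries (there are attempts - 1 waits)."""
    per_retry = retry_delay + padding  # retryDelay from the response plus 5 seconds
    return sum(per_retry + exponential_wait(n) for n in range(1, attempts))


if __name__ == "__main__":
    seconds = total_wait(retry_delay=1)  # "retryDelay": "1s" in the response body
    print(f"{seconds:.0f} s total, about {seconds / 60:.1f} min")
```

Under these assumptions, ten attempts with a retryDelay of 1s span about 15 minutes in total, so a per-day quota is unlikely to recover before the retries run out.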

KobaDev · August 6, 2025, 9:54am

Thank you for the clarification.

I have two questions/suggestions regarding the API design for 429 errors:

1. The presence of RetryInfo with a non-applicable retryDelay value is misleading. Could you consider either removing RetryInfo for quota-related errors or providing a meaningful retryDelay value (e.g., the time until the quota resets)?
2. Is there a way for developers to programmatically distinguish between a temporary rate-limiting error and a quota exhaustion error in the current 429 response body?

Thanks,

Pannaga_J · August 6, 2025, 10:49am

Thank you for your suggestions.

Regarding your first point, it’s a good one, and we will discuss it with our internal team. We’ll update you as soon as possible.

For your second point, please check the JSON response you received. The error should contain error.status set to RESOURCE_EXHAUSTED and an error.details entry of type QuotaFailure whose violations carry quotaId and quotaValue fields, which will help you distinguish between the two cases.
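
A minimal sketch of that check, working on the parsed JSON body shown earlier in the thread. The function names are hypothetical, and treating any quotaId that contains "PerDay" as a daily quota is an assumption based only on the single quotaId seen in this thread, not a documented contract.

```python
from typing import Any

QUOTA_FAILURE = "type.googleapis.com/google.rpc.QuotaFailure"


def quota_ids(body: dict[str, Any]) -> list[str]:
    """Collect quotaId values from every QuotaFailure detail in the error body."""
    ids: list[str] = []
    for detail in body.get("error", {}).get("details", []):
        if detail.get("@type") == QUOTA_FAILURE:
            ids.extend(v["quotaId"] for v in detail.get("violations", [])
                       if "quotaId" in v)
    return ids


def is_daily_quota_exhausted(body: dict[str, Any]) -> bool:
    """Heuristic (assumption): a quotaId containing 'PerDay' marks a daily quota."""
    return any("PerDay" in quota_id for quota_id in quota_ids(body))


# Trimmed copy of the 429 body posted above.
SAMPLE_BODY = {
    "error": {
        "code": 429,
        "status": "RESOURCE_EXHAUSTED",
        "details": [
            {
                "@type": QUOTA_FAILURE,
                "violations": [
                    {"quotaId": "GenerateRequestsPerDayPerProjectPerModel-FreeTier",
                     "quotaValue": "250"}
                ],
            },
            {"@type": "type.googleapis.com/google.rpc.RetryInfo", "retryDelay": "1s"},
        ],
    }
}

if __name__ == "__main__":
    print(quota_ids(SAMPLE_BODY))
    print(is_daily_quota_exhausted(SAMPLE_BODY))
```

A retry predicate such as _is_retriable could call is_daily_quota_exhausted on the error details and give up early instead of retrying against a daily limit.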
