Hello,
I am trying to handle retries for requests to the Gemini API using the tenacity library. However, I’m encountering an issue, particularly with 429 (rate limit) errors.
My current retry logic is intended to:
- Parse the retryDelay value from the APIError details.
- Add 5 seconds to this retryDelay.
- Combine this with an exponential backoff strategy (wait_exponential).
The waiting logic appears to be functioning correctly, but the 429 errors persist despite multiple retries.
Here is a snippet of my code:
# ... (imports and other setup)

def _is_retriable(e: BaseException) -> bool:
    return isinstance(e, APIError) and e.code in [503, 429]


def _calc_retry_delay(exception: BaseException | None) -> float:
    # ... (logic to parse retryDelay and add 5 seconds)
    if isinstance(exception, APIError) and exception.code == 429:
        try:
            retry_delay = parse(
                [
                    rd
                    for d in exception.details["error"]["details"]
                    if (rd := d.get("retryDelay"))
                ][0]
            )
            if retry_delay:
                return retry_delay + 5
        except (IndexError, KeyError):
            return 60
    return 60


class wait_from_exception(wait_base):
    def __call__(self, retry_state: "RetryCallState") -> float:
        if retry_state.outcome is None:
            return 0
        exception = retry_state.outcome.exception()
        return _calc_retry_delay(exception)


@retry(
    wait=wait_combine(wait_from_exception(), wait_exponential(multiplier=2, min=10, max=300)),
    retry=retry_if_exception(_is_retriable),
    stop=stop_after_attempt(10),
)
async def make_summary(issue_id: str, db: Session):
    # ... (API call)
What could be the reason for this? Am I misunderstanding how to correctly handle the retryDelay from the API response? Any insights or suggestions would be greatly appreciated.
Thank you.
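For readers following along, here is a rough back-of-the-envelope model of how long the decorator above waits in total. It is only a sketch under two assumptions that are not confirmed in this thread: that wait_combine adds its component waits together, and that wait_exponential computes multiplier * 2**(attempt - 1) clamped to the [min, max] range. The function names below are hypothetical and do not come from tenacity.

```python
def exponential_wait(attempt: int, multiplier: float = 2,
                     lo: float = 10, hi: float = 300) -> float:
    """Assumed model of wait_exponential(multiplier=2, min=10, max=300)."""
    return max(lo, min(multiplier * 2 ** (attempt - 1), hi))


def combined_wait(attempt: int, retry_delay: float) -> float:
    """Assumed model of wait_combine: server hint + 5 s, plus the backoff."""
    return (retry_delay + 5) + exponential_wait(attempt)


def total_wait(max_attempts: int, retry_delay: float) -> float:
    """Sum of all waits; there is one wait after each attempt except the last."""
    return sum(combined_wait(a, retry_delay) for a in range(1, max_attempts))


if __name__ == "__main__":
    # 10 attempts with a retryDelay of 1 second
    print(total_wait(10, 1.0))  # 880.0 seconds, a little under 15 minutes
```

Under these assumptions the whole retry budget is spent in under 15 minutes, so the retries can only succeed if the limit that was hit resets within that window.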
Hi @KobaDev
A 429 error means you are sending too many requests per minute for the free tier of the Gemini API.
Verify that you are within the model’s rate limit (see Rate limits | Gemini API | Google AI for Developers) and, if needed, request a quota increase.
You can also check your quota usage as follows:
Go to the GCP console and click “APIs & Services”. Under Metric, search for and select “Generative Language API”. Under the “Quotas & System Limits” tab, check “Current Usage percentage”.
If it has reached 100%, you have hit your quota limit, hence the 429 error.
Hi @Pannaga_J
I received a 429 response body like this:
{
  "error": {
    "code": 429,
    "message": "You exceeded your current quota, please check your plan and billing details. For more information on this error, head to: https://ai.google.dev/gemini-api/docs/rate-limits.",
    "status": "RESOURCE_EXHAUSTED",
    "details": [
      {
        "@type": "type.googleapis.com/google.rpc.QuotaFailure",
        "violations": [
          {
            "quotaMetric": "generativelanguage.googleapis.com/generate_content_free_tier_requests",
            "quotaId": "GenerateRequestsPerDayPerProjectPerModel-FreeTier",
            "quotaDimensions": {
              "model": "gemini-2.5-flash",
              "location": "global"
            },
            "quotaValue": "250"
          }
        ]
      },
      {
        "@type": "type.googleapis.com/google.rpc.Help",
        "links": [
          {
            "description": "Learn more about Gemini API quotas",
            "url": "https://ai.google.dev/gemini-api/docs/rate-limits"
          }
        ]
      },
      {
        "@type": "type.googleapis.com/google.rpc.RetryInfo",
        "retryDelay": "1s"
      }
    ]
  }
}
Do you mean that this retryDelay is not useful for free-tier users?
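For reference, the retryDelay in a body shaped like the one above can be pulled out with the standard library alone. This is a minimal sketch, assuming the value is always decimal seconds followed by "s" (as in "1s") and that the body has already been decoded into a dict. The function names are hypothetical and are not part of any SDK.

```python
from typing import Optional


def parse_duration_seconds(value: str) -> float:
    """Convert a duration string such as "1s" or "2.5s" to seconds."""
    if not value.endswith("s"):
        raise ValueError(f"unexpected duration format: {value!r}")
    return float(value[:-1])


def extract_retry_delay(body: dict) -> Optional[float]:
    """Return the RetryInfo delay in seconds, or None if the body has none."""
    for detail in body.get("error", {}).get("details", []):
        # Match on the suffix so the "type.googleapis.com/" prefix is ignored.
        if detail.get("@type", "").endswith("google.rpc.RetryInfo"):
            delay = detail.get("retryDelay")
            if delay is not None:
                return parse_duration_seconds(delay)
    return None
```

Returning None instead of a default keeps the decision about a fallback wait with the caller.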
Based on the error message you provided, this is a quota exhaustion error, not a temporary rate-limiting error, although both can result in a 429 HTTP status code. The error is telling you that you have exceeded a specific daily quota, GenerateRequestsPerDayPerProjectPerModel-FreeTier, which has a limit of 250 requests. See Rate limits | Gemini API | Google AI for Developers.
That is why waiting for retryDelay does not work for your use case.
Thank you for the clarification.
I have two questions/suggestions regarding the API design for 429 errors:
- The presence of RetryInfo with a non-applicable retryDelay value is misleading. Could you consider either removing RetryInfo for quota-related errors or providing a meaningful retryDelay value (e.g., the time until the quota resets)?
- Is there a way for developers to programmatically distinguish between a temporary rate-limiting error and a quota exhaustion error in the current 429 response body?
Thanks,
Thank you for your suggestions.
Regarding your first point, it’s a good one, and we will discuss it with our internal team. We’ll update you as soon as possible.
For your second point, please check the JSON response you received. It contains error.status set to RESOURCE_EXHAUSTED, and error.details includes a QuotaFailure entry whose violations carry the quotaId and quotaValue fields, which will help you distinguish the two cases.
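One possible way to act on this in client code is sketched below. It relies on an assumption drawn only from the single response in this thread: that a daily quota can be recognized by "PerDay" appearing in the quotaId (as in GenerateRequestsPerDayPerProjectPerModel-FreeTier). The naming of other quota IDs is not shown here, so treat the substring check as a heuristic. The function names are hypothetical.

```python
from typing import List


def quota_ids(body: dict) -> List[str]:
    """Collect every quotaId listed under QuotaFailure violations."""
    ids: List[str] = []
    for detail in body.get("error", {}).get("details", []):
        if detail.get("@type", "").endswith("google.rpc.QuotaFailure"):
            for violation in detail.get("violations", []):
                if "quotaId" in violation:
                    ids.append(violation["quotaId"])
    return ids


def is_daily_quota_exhausted(body: dict) -> bool:
    """Heuristic (assumption): a quotaId containing "PerDay" is a daily quota."""
    return any("PerDay" in quota_id for quota_id in quota_ids(body))


def should_retry(status_code: int, body: dict) -> bool:
    """Retry 503 always; retry 429 only when no daily quota was exceeded."""
    if status_code == 503:
        return True
    if status_code == 429:
        return not is_daily_quota_exhausted(body)
    return False
```

With a predicate like this, a retry loop can give up immediately on a daily-quota 429 instead of spending its attempts on waits that cannot succeed.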