Gemini Embedding API: Encountering "Model operations request limit per minute for a region" 429 Error - RPM Limit Confusion

Hello Developers,

I’m currently encountering 429 errors when using the Gemini-Embedding-001 model. The specific error message is as follows:

{'error': {'code': 429, 'message': "Quota exceeded for quota metric 'Read API requests' and limit 'Model operations request limit per minute for a region' of service 'generativelanguage.googleapis.com' for consumer 'project_number:327036061011'.", 'status': 'RESOURCE_EXHAUSTED', 'details': [{'@type': 'type.googleapis.com/google.rpc.ErrorInfo', 'reason': 'RATE_LIMIT_EXCEEDED', 'domain': 'googleapis.com', 'metadata': {'quota_location': 'us-south1', 'quota_limit_value': '200', 'quota_unit': '1/min/{project}/{region}', 'service': 'generativelanguage.googleapis.com', 'consumer': 'projects/327036061011', 'quota_metric': 'generativelanguage.googleapis.com/model_requests', 'quota_limit': 'ModelRequestsPerMinutePerProjectPerRegion'}}, {'@type': 'type.googleapis.com/google.rpc.Help', 'links': [{'description': 'Request a higher quota limit.', 'url': 'https://cloud.google.com/docs/quotas/help/request_increase'}]

I’ve noticed that while my Gemini API account is a Tier 1 paid account, and the documentation suggests an RPM (Requests Per Minute) of 3000 for Embedding models, I’m actually hitting a limit of “Model operations request limit per minute for a region” with a value of 200 RPM.
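For anyone comparing numbers, the limit that actually fired can be read straight from the `ErrorInfo` entry in the 429 payload. Below is a minimal sketch, assuming the error body has already been decoded into a Python dict shaped like the one above; `describe_quota_error` is a hypothetical helper name, not part of any SDK.

```python
from typing import Optional


def describe_quota_error(payload: dict) -> Optional[dict]:
    """Return the quota details from a 429 error payload, or None.

    Hypothetical helper: assumes `payload` is the decoded error body
    with the same structure as the message quoted in this post.
    """
    error = payload.get("error", {})
    if error.get("code") != 429:
        return None
    for detail in error.get("details", []):
        # The quota metadata lives in the google.rpc.ErrorInfo detail entry.
        if detail.get("@type", "").endswith("google.rpc.ErrorInfo"):
            meta = detail.get("metadata", {})
            value = meta.get("quota_limit_value")
            return {
                "limit_name": meta.get("quota_limit"),
                "limit_value": int(value) if value is not None else None,
                "unit": meta.get("quota_unit"),
                "location": meta.get("quota_location"),
            }
    return None
```

Run against the error above, this should report `ModelRequestsPerMinutePerProjectPerRegion` with a value of 200 in `us-south1`, which is the per-region limit rather than the documented per-model RPM.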

This is quite confusing. If this regional RPM limit of 200 is in place, how can we achieve the true 3000 RPM rate for Gemini Embedding models to support large-scale applications?

I’ve also tried upgrading to a Tier 2 account, but I’m still encountering the same 429 error.

Has anyone else experienced similar issues? Or is there an official explanation on how to overcome this regional RPM limit when using Embedding models at scale, to fully leverage the higher limits of paid accounts?
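As a stopgap until there is an official answer, a client-side throttle can at least keep requests under the 200 per minute reported in the error. This is only a sketch: `SlidingWindowLimiter` is a hypothetical name, and it assumes the quota behaves like a rolling 60-second window, which the error message does not confirm. It does not raise the limit; it only avoids tripping it.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Block until a call is allowed under `max_calls` per `period` seconds.

    Hypothetical helper. Assumes a rolling window; the real quota may be
    measured differently on the server side.
    """

    def __init__(self, max_calls, period=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self._max_calls = max_calls
        self._period = period
        self._clock = clock
        self._sleep = sleep
        self._stamps = deque()  # start times of calls inside the window

    def acquire(self):
        while True:
            now = self._clock()
            # Drop calls that have aged out of the window.
            while self._stamps and now - self._stamps[0] >= self._period:
                self._stamps.popleft()
            if len(self._stamps) < self._max_calls:
                self._stamps.append(now)
                return
            # Window is full: wait until the oldest call expires.
            self._sleep(self._period - (now - self._stamps[0]))


# 200 is the quota_limit_value from the error; leave headroom if needed.
limiter = SlidingWindowLimiter(max_calls=200, period=60.0)
```

Calling `limiter.acquire()` right before each embedding request would pace the client to at most 200 calls in any 60-second span, under the assumptions above.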

Any advice or guidance would be greatly appreciated!


Hi @yz_xiaolu,

Welcome to the Forum,

Thank you for bringing this to our attention. We appreciate you flagging this issue and will report it to the internal team.

Thank you!

Hey @chunduriv, do you have any updates on this issue? I keep hitting the 429 error on the Tier 1 plan, and I’ve sent only 4 batch requests today.

Can you at least publish the rate limits so we can know and plan accordingly?