Hitting rate limit on Gemini Batch API for gemini-embedding-001

Hello,

I’m blocked by an issue with the batch embedding API (gemini-embedding-001).

On Tier 1, I was limited to 2 million tokens per job. Upgrading to Tier 2 removed this limit, but now I’m facing two new problems: a hard limit of roughly 15 million total tokens across all concurrent jobs, and jobs getting stuck in the Running state.

The limits I’m actually hitting don’t appear in any GCP or AI Studio monitoring dashboard, which makes this very hard to debug.

I receive a 429 RESOURCE_EXHAUSTED error when I try to queue more jobs.

Is this ~15M concurrent-token limit expected? Is there a known issue with batch jobs getting stuck, and where can I see my usage?

Thanks for your help.

Hello,

Welcome to the Forum!

A small correction regarding the rate limits: for Tier 1, the limit is 3,000 requests per minute (RPM) and 1 million tokens per minute (TPM), with no limit on requests per day (RPD).

For Tier 2, the limits increase to 5,000 RPM and 5 million TPM, while RPD remains unlimited.

Could you please check your usage in the GCP console and share your complete error message with us?

Thank you for clarifying the limits!

In the GCP console I only see usage showing up under:

  • File storage quota per project — 21,474,836,480 (3.36 % used)

  • Request limit per minute for a region — europe-west1: 7,000 (0.14 % used, ~10 requests/min)

The other quotas (including the EmbedContent batch token quotas) show no usage at all.

However, when I try to create a batch embedding job, it fails with:

{
  "error": {
    "code": 429,
    "message": "You exceeded your current quota, please check your plan and billing details. For more information on this error, head to: https://ai.google.dev/gemini-api/docs/rate-limits.",
    "status": "RESOURCE_EXHAUSTED"
  }
}
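For context, a 429 RESOURCE_EXHAUSTED is normally handled client-side with exponential backoff before giving up. A minimal, SDK-agnostic sketch of what I'm doing (the `submit_job` callable here is a hypothetical stand-in for whatever submits the batch job):

```python
import random
import time

def submit_with_backoff(submit_job, max_attempts=5, base_delay=1.0):
    """Retry a job-submission callable on quota errors.

    `submit_job` is a hypothetical zero-argument callable that either
    returns a job handle or raises an exception whose message contains
    "RESOURCE_EXHAUSTED" (as in the 429 above).
    """
    for attempt in range(max_attempts):
        try:
            return submit_job()
        except Exception as exc:
            if "RESOURCE_EXHAUSTED" not in str(exc):
                raise  # not a quota error; surface it immediately
            if attempt == max_attempts - 1:
                raise  # out of attempts; give up
            # Exponential backoff with jitter: 1s, 2s, 4s, ... plus noise.
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.5)
            time.sleep(delay)
```

Even with backoff like this, the submissions keep failing until earlier jobs finish, which is why it looks like a hard concurrent-token ceiling rather than a per-minute rate limit.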

Additionally, the jobs stay in the Running state but never seem to make progress; overall throughput is very slow.
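To tell a genuinely stuck job apart from a merely slow one, I poll the job state with an explicit deadline. A minimal sketch, where `get_state` is a hypothetical zero-argument callable returning the job's current state string (the state names are assumptions, not confirmed API values):

```python
import time

def wait_for_job(get_state, timeout_s=3600, poll_interval_s=30):
    """Poll a batch job until it leaves the running states or a deadline passes.

    `get_state` is a hypothetical callable returning the job's state,
    e.g. "PENDING", "RUNNING", "SUCCEEDED", or "FAILED".
    """
    deadline = time.monotonic() + timeout_s
    state = get_state()
    while state in ("PENDING", "RUNNING"):
        if time.monotonic() > deadline:
            raise TimeoutError(f"job still {state} after {timeout_s}s")
        time.sleep(poll_interval_s)
        state = get_state()
    return state
```

With an hour-long deadline, several of my jobs still time out in RUNNING, which is what makes me suspect they are stuck rather than slow.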

Thanks again for your support!