Incorrect 429 Error being returned for the Gemini asyncBatchEmbedContent endpoint

I’m on Gemini Tier 1 and I’ve been trying to use the gemini-embedding-001 model in a project that requires embedding large amounts of text. I keep getting rate limited and can’t tell what’s causing it. Nowhere in the docs can I find any limits for the gemini-embedding-001 model on the Gemini Batch API.

I checked both the Google Cloud Console and AI Studio - neither shows me as being limited. AI Studio reports 2/3K RPM and 132/1M TPM for gemini-embedding-001, but I’m guessing those quotas are separate from the async batch endpoint I’m trying to use (asyncBatchEmbedContent), which is supposed to have much higher limits.

What I’ve observed:

  1. First request of the day gets 429: After not making any requests for 14+ hours, with 0 pending batch jobs, the very first request to asyncBatchEmbedContent returned 429 Too Many Requests.

  2. Inconsistent token limits: A request with ~245,000 tokens (500 chunks) got 429. After reducing to ~131,000 tokens (300 chunks), it passed. But subsequent requests with ~113,000 tokens still got 429. (I estimated tokens by dividing the character count by 4, which isn’t exact but close enough.)

  3. Appears to be both request-count based and token-based: After 1-2 successful batch job creations, subsequent requests get 429 regardless of token count. The limit seems to reset after ~15-20 minutes.

  4. No retry guidance: the Batch API response gives no indication of which limit was exceeded, and there’s no Retry-After header or anything similar.
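Because the 429 responses carry no Retry-After header, my client has to invent its own wait times. This is roughly the kind of exponential-backoff-with-jitter helper producing the delays visible in the logs below (the base, multiplier, cap, and jitter range are my own tuning choices, not anything documented by Google):

```python
import random

def backoff_delay(attempt: int, base: float = 8.0, factor: float = 2.0,
                  cap: float = 300.0, jitter: float = 0.25) -> float:
    """Exponential backoff with jitter for retrying after an opaque 429.

    attempt is 1-based. Since the API provides no Retry-After hint,
    all parameters here are arbitrary guesses.
    """
    delay = min(base * factor ** (attempt - 1), cap)
    # Randomize +/- the jitter fraction so parallel clients don't
    # all retry in lockstep.
    return delay * random.uniform(1 - jitter, 1 + jitter)
```

With these defaults, successive attempts wait roughly 8s, 16s, 32s, 64s, … (plus jitter), capped at 5 minutes, which matches the shape of the retry intervals in my logs.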

2026-01-13 15:56:06,672 - INFO - Starting the pipeline…
2026-01-13 15:56:08,480 - INFO - Found 500 chunks to embed
2026-01-13 15:56:08,480 - INFO - Splitting into 1 batch(es) of up to 500 chunks each
2026-01-13 15:56:08,610 - INFO - Batch stats: 500 chunks, 980,121 chars, ~245,030 tokens (estimated)
2026-01-13 15:56:09,226 - INFO - HTTP Request: POST https://generativelanguage.googleapis.com/upload/v1beta/files "HTTP/1.1 200 OK"
2026-01-13 15:56:11,017 - INFO - File uploaded: https://generativelanguage.googleapis.com/v1beta/files/[FILE_ID_REDACTED]
2026-01-13 15:56:11,017 - INFO - Single batch prepared: 500 chunks, ~245,030 tokens
2026-01-13 15:56:11,017 - INFO - Creating batch embedding job…
2026-01-13 15:56:11,017 - INFO - Using resource name: files/[FILE_ID_REDACTED]

ExperimentalWarning: batches.create_embeddings() is experimental and may change without notice.
job = self.client.batches.create_embeddings(…)

2026-01-13 15:56:12,095 - INFO - HTTP Request: POST https://generativelanguage.googleapis.com/v1beta/models/gemini-embedding-001:asyncBatchEmbedContent "HTTP/1.1 429 Too Many Requests"
2026-01-13 15:56:12,098 - WARNING - Rate limited on create embedding batch (attempt 1/11). Using backoff: 16.6s
2026-01-13 15:56:29,914 - INFO - HTTP Request: POST https://generativelanguage.googleapis.com/v1beta/models/gemini-embedding-001:asyncBatchEmbedContent "HTTP/1.1 429 Too Many Requests"
2026-01-13 15:56:29,917 - WARNING - Rate limited on create embedding batch (attempt 2/11). Using backoff: 59.5s
2026-01-13 15:57:30,636 - INFO - HTTP Request: POST https://generativelanguage.googleapis.com/v1beta/models/gemini-embedding-001:asyncBatchEmbedContent "HTTP/1.1 429 Too Many Requests"
2026-01-13 15:57:30,637 - WARNING - Rate limited on create embedding batch (attempt 3/11). Using backoff: 113.3s
2026-01-13 15:59:25,132 - INFO - HTTP Request: POST https://generativelanguage.googleapis.com/v1beta/models/gemini-embedding-001:asyncBatchEmbedContent "HTTP/1.1 429 Too Many Requests"
2026-01-13 15:59:25,199 - WARNING - Rate limited on create embedding batch (attempt 4/11). Using backoff: 139.7s

**Second logs:**

2026-01-13 16:06:54,881 - INFO - Found 500 chunks to embed
2026-01-13 16:06:54,881 - INFO - Splitting into 2 batch(es) of up to 300 chunks each
2026-01-13 16:06:54,977 - INFO - Batch submission plan: 2 batches, 30s delay between submissions
2026-01-13 16:06:55,087 - INFO - Batch stats: 300 chunks, 527,051 chars, ~131,762 tokens (estimated)
2026-01-13 16:06:59,148 - INFO - File uploaded: https://generativelanguage.googleapis.com/v1beta/files/[FILE_ID_1]
2026-01-13 16:06:59,149 - INFO - Creating batch embedding job…
2026-01-13 16:07:01,003 - INFO - HTTP Request: POST https://generativelanguage.googleapis.com/v1beta/models/gemini-embedding-001:asyncBatchEmbedContent "HTTP/1.1 200 OK"
2026-01-13 16:07:01,004 - INFO - Batch embedding job created: batches/[BATCH_ID_1]
2026-01-13 16:07:01,142 - INFO - Child batch 1/2 submitted (~131,762 tokens)
2026-01-13 16:07:01,142 - INFO - Waiting 30s before submitting batch 2/2…

2026-01-13 16:07:31,296 - INFO - Preparing embedding batch for 200 chunks
2026-01-13 16:07:31,308 - INFO - Batch stats: 200 chunks, 453,070 chars, ~113,267 tokens (estimated)
2026-01-13 16:07:34,019 - INFO - File uploaded: https://generativelanguage.googleapis.com/v1beta/files/[FILE_ID_2]
2026-01-13 16:07:34,019 - INFO - Creating batch embedding job…
2026-01-13 16:07:34,859 - INFO - HTTP Request: POST https://generativelanguage.googleapis.com/v1beta/models/gemini-embedding-001:asyncBatchEmbedContent "HTTP/1.1 429 Too Many Requests"
2026-01-13 16:07:34,860 - WARNING - Rate limited on create embedding batch (attempt 1/11). Using backoff: 20.8s
2026-01-13 16:07:56,811 - INFO - HTTP Request: POST https://generativelanguage.googleapis.com/v1beta/models/gemini-embedding-001:asyncBatchEmbedContent "HTTP/1.1 429 Too Many Requests"
2026-01-13 16:07:56,813 - WARNING - Rate limited on create embedding batch (attempt 2/11). Using backoff: 46.8s
2026-01-13 16:08:45,488 - INFO - HTTP Request: POST https://generativelanguage.googleapis.com/v1beta/models/gemini-embedding-001:asyncBatchEmbedContent "HTTP/1.1 200 OK"
2026-01-13 16:08:45,491 - INFO - Batch embedding job created: batches/[BATCH_ID_2]
2026-01-13 16:08:45,745 - INFO - Child batch 2/2 submitted (~113,267 tokens)
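For context, this is the kind of estimation and splitting logic behind the "Batch stats" lines in the logs above. The chars/4 heuristic and the 300-chunk cap are my own guesses (300 is just the size that happened to get a 200 OK), not official guidance:

```python
def estimate_tokens(text: str) -> int:
    # Rough heuristic: ~4 characters per token. Not exact, but close
    # enough to compare request sizes against the observed 429s.
    return len(text) // 4

def split_into_batches(chunks: list[str], max_chunks: int = 300) -> list[list[str]]:
    # Cap each child batch at max_chunks chunks; each child batch is
    # then submitted separately with a delay between submissions.
    return [chunks[i:i + max_chunks] for i in range(0, len(chunks), max_chunks)]
```

For the first run above, 980,121 chars // 4 gives the ~245,030-token estimate in the log; the second run splits the same 500 chunks into child batches of 300 and 200.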

I’ve been searching for a solution for days at this point and couldn’t find anything. I also requested a rate limit increase a few weeks ago and got no response. Can anyone please help me figure out what’s causing this and how to fix it? The API is completely unusable for me at this point.

Hi @Ebrin, welcome to the forum!

Thanks for reaching out, and apologies for the inconvenience. Could you please share the project number (not the project ID) via direct message, along with your tier details?

Thanks

I sent you a DM 6 days ago and still have no response. Is there anyone else who can help me out?