RESOURCE_EXHAUSTED when using gemini-1.5-pro-002

Dima · September 30, 2024, 11:26am

When I call gemini-1.5-pro-002 through the API (via curl or Python; the error is the same in both cases) with a prompt of more than 20 thousand tokens, it returns the error RESOURCE_EXHAUSTED, even though the limits have not been reached.
But when I use gemini-1.5-pro-exp-0827, everything works.
gemini-1.5-pro-latest and gemini-1.5-pro do not work either.

klinok64 · September 30, 2024, 11:45am

In some cases this happens with 0827 too. Once, when I uploaded a 1M-token file, AI Studio returned the error “You are reaching your limit”. I have no idea why this happens. Maybe too many tokens is bad, idk…

Also wth are you writing in your prompt? The book? XD

afirstenberg · September 30, 2024, 1:04pm

Welcome to the forums!

Can you elaborate on what, exactly, you mean by this, and how you’re determining if you’ve reached “the limits” or not?

It isn’t unusual to encounter error 429 on occasion, and you should implement an incremental backoff for such cases. If you’re routinely hitting this on every request, there may be other issues we should look into.
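
For reference, a backoff wrapper might look roughly like the sketch below. It is only an illustration: `RateLimitError` and `call_with_backoff` are hypothetical names, not part of any Google SDK, and the delays (1 s doubling up to 60 s, with a little jitter) are assumed values you would tune yourself.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical exception standing in for an HTTP 429 / RESOURCE_EXHAUSTED response."""


def call_with_backoff(send_request, max_attempts=5, base_delay=1.0,
                      max_delay=60.0, sleep=time.sleep):
    """Call send_request(), retrying with exponential backoff on RateLimitError."""
    for attempt in range(max_attempts):
        try:
            return send_request()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the 429 to the caller
            # The delay doubles on each attempt (1s, 2s, 4s, ...), capped at max_delay.
            delay = min(max_delay, base_delay * (2 ** attempt))
            # Add up to 10% random jitter so parallel clients do not retry in lockstep.
            sleep(delay + random.uniform(0, delay * 0.1))
```

Here `send_request` is whatever function performs the API call and raises `RateLimitError` when the response status is 429.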

Dima · September 30, 2024, 1:15pm

Thank you. I mean that I hadn’t used this API key for more than 24 hours, and when I tried again, I waited more than 1 minute.

Dima · September 30, 2024, 1:23pm

Yeah, the request itself and its past responses, so that it remembers how to respond. With the experimental models I have used up to 1.5 million tokens and everything works.

Dima · September 30, 2024, 3:36pm

Here is an example: I send a request via curl to the gemini-1.5-pro-002 model and almost immediately get the error 429 RESOURCE_EXHAUSTED.

Url: https://generativelanguage.googleapis.com/v1beta/models/gemini-1.5-pro-002:generateContent?key=ThereWasApi
Status: 429

vary: X-Origin
vary: Referer
vary: Origin,Accept-Encoding
content-type: application/json; charset=UTF-8
date: Mon, 30 Sep 2024 15:28:04 GMT
server: scaffolding on HTTPServer2
cache-control: private
x-xss-protection: 0
x-frame-options: SAMEORIGIN
x-content-type-options: nosniff
server-timing: gfet4t7; dur=981
alt-svc: h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
accept-ranges: none

{
  "error": {
    "code": 429,
    "message": "Resource has been exhausted (e.g. check quota).",
    "status": "RESOURCE_EXHAUSTED"
  }
}

Then, after a couple of seconds, I send exactly the same request, but for the gemini-1.5-pro-exp-0827 model, and after a while I get the answer I need.

Url: https://generativelanguage.googleapis.com/v1beta/models/gemini-1.5-pro-exp-0827:generateContent?key=ThereWasApi
Status: 200

content-type: application/json; charset=UTF-8
vary: X-Origin
vary: Referer
vary: Origin,Accept-Encoding
date: Mon, 30 Sep 2024 15:29:46 GMT
server: scaffolding on HTTPServer2
cache-control: private
x-xss-protection: 0
x-frame-options: SAMEORIGIN
x-content-type-options: nosniff
server-timing: gfet4t7; dur=62146
alt-svc: h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
accept-ranges: none

{
  "candidates": [
    {
      "content": {
        "parts": [
          {
            "text": "Response"
          }
        ],
        "role": "model"
      },
      "finishReason": "STOP",
      "index": 0,
      "safetyRatings": [
        {
          "category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
          "probability": "NEGLIGIBLE"
        },
        {
          "category": "HARM_CATEGORY_HATE_SPEECH",
          "probability": "NEGLIGIBLE"
        },
        {
          "category": "HARM_CATEGORY_HARASSMENT",
          "probability": "NEGLIGIBLE"
        },
        {
          "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
          "probability": "NEGLIGIBLE"
        }
      ]
    }
  ],
  "usageMetadata": {
    "promptTokenCount": 37669,
    "candidatesTokenCount": 1767,
    "totalTokenCount": 39436
  }
}

Here are two of these queries
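
For anyone who wants to repeat this comparison from Python rather than curl, a minimal sketch using only the standard library could look like this. The endpoint is the one shown in the dumps above; the helper names `build_request` and `status_for` and the `GEMINI_API_KEY` environment variable are assumptions made for this example.

```python
import json
import urllib.error
import urllib.request

BASE_URL = "https://generativelanguage.googleapis.com/v1beta/models"


def build_request(model: str, prompt: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a generateContent POST request for the given model."""
    body = {"contents": [{"parts": [{"text": prompt}]}]}
    return urllib.request.Request(
        f"{BASE_URL}/{model}:generateContent?key={api_key}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def status_for(model: str, prompt: str, api_key: str) -> int:
    """Send the request and return the HTTP status code (for example 200 or 429)."""
    try:
        with urllib.request.urlopen(build_request(model, prompt, api_key)) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code


# Example usage (performs real network calls, so it is left commented out):
# import os
# key = os.environ["GEMINI_API_KEY"]  # assumed variable name
# for model in ("gemini-1.5-pro-002", "gemini-1.5-pro-exp-0827"):
#     print(model, status_for(model, "same prompt for both models", key))
```

Sending the identical prompt to both models in one run makes it easier to rule out differences in the request itself.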

afirstenberg · September 30, 2024, 4:13pm

Can you take a look at the Quotas & System Limits page for your project and see if it reports what your quota usage has been?

Also seeing the “Traffic by Response Code” graph on the Metrics page might give some insight as well.

yan-hic · October 1, 2024, 5:22pm

Check the quotas indeed - I found out that e.g. gemini-1.5-flash-8b-exp... has a limit of 15 rpm for a project. Because it’s experimental, G didn’t put much power to it I guess.
The 1.5-pro-exp has a limit of 2 rpm (!)

The 1.5-flash has 2000 rpm, but that is misleading because another quota, rpm per region, limits it to 1500.

I have always found these quotas difficult to follow, and there is no tool to list all the relevant quotas for a given API call.
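
To make the 2000 vs. 1500 point concrete: when several quotas apply to the same call, the smallest one is the one that actually limits you. A tiny sketch with the numbers quoted in this post; the quota labels are made up for illustration and are not official quota names.

```python
def effective_limit(limits: dict) -> tuple:
    """Return (name, value) of the tightest quota among those that apply to a call."""
    name = min(limits, key=limits.get)
    return name, limits[name]


# Labels are illustrative; the values are the ones quoted in this thread.
flash_limits = {"rpm per project": 2000, "rpm per project per region": 1500}
print(effective_limit(flash_limits))  # ('rpm per project per region', 1500)
```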

Dima · October 2, 2024, 2:53am

I followed these links and noticed that, for some reason, all of my limits are set to 0 from the start.

I don’t understand why

Url: https://generativelanguage.googleapis.com/v1beta/models/gemini-1.5-pro-002:generateContent?key=MyApi
Status: 429

vary: X-Origin
vary: Referer
vary: Origin,Accept-Encoding
content-type: application/json; charset=UTF-8
date: Wed, 02 Oct 2024 02:40:54 GMT
server: scaffolding on HTTPServer2
cache-control: private
x-xss-protection: 0
x-frame-options: SAMEORIGIN
x-content-type-options: nosniff
server-timing: gfet4t7; dur=1254
alt-svc: h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
accept-ranges: none

{
  "error": {
    "code": 429,
    "message": "Quota exceeded for quota metric 'Generate Content API requests per minute' and limit 'GenerateContent request limit per minute for a region' of service 'generativelanguage.googleapis.com' for consumer 'project_number:1062870507181'.",
    "status": "RESOURCE_EXHAUSTED",
    "details": [
      {
        "@type": "type.googleapis.com/google.rpc.ErrorInfo",
        "reason": "RATE_LIMIT_EXCEEDED",
        "domain": "googleapis.com",
        "metadata": {
          "service": "generativelanguage.googleapis.com",
          "consumer": "projects/1062870507181",
          "quota_limit": "GenerateContentRequestsPerMinutePerProjectPerRegion",
          "quota_metric": "generativelanguage.googleapis.com/generate_content_requests",
          "quota_location": "us-east2",
          "quota_limit_value": "0"
        }
      },
      {
        "@type": "type.googleapis.com/google.rpc.Help",
        "links": [
          {
            "description": "Request a higher quota limit.",
            "url": "https://cloud.google.com/docs/quotas/help/request_increase"
          }
        ]
      }
    ]
  }
}
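
The useful fields in a response like this are buried in `details`, so a small helper to pull them out may save some squinting. This is a sketch only: `summarize_quota_error` is a hypothetical name, and it assumes the error body has the same shape as the one above.

```python
import json


def summarize_quota_error(body: str) -> dict:
    """Extract status, reason and quota metadata from a Google API error body."""
    error = json.loads(body).get("error", {})
    summary = {"code": error.get("code"), "status": error.get("status")}
    for detail in error.get("details", []):
        # Quota information lives in the google.rpc.ErrorInfo detail entry.
        if detail.get("@type", "").endswith("google.rpc.ErrorInfo"):
            meta = detail.get("metadata", {})
            summary["reason"] = detail.get("reason")
            summary["quota_limit"] = meta.get("quota_limit")
            summary["quota_location"] = meta.get("quota_location")
            summary["quota_limit_value"] = meta.get("quota_limit_value")
    return summary


# Trimmed copy of the response above, used as sample input.
sample_body = json.dumps({
    "error": {
        "code": 429,
        "status": "RESOURCE_EXHAUSTED",
        "details": [{
            "@type": "type.googleapis.com/google.rpc.ErrorInfo",
            "reason": "RATE_LIMIT_EXCEEDED",
            "metadata": {
                "quota_limit": "GenerateContentRequestsPerMinutePerProjectPerRegion",
                "quota_location": "us-east2",
                "quota_limit_value": "0",
            },
        }],
    }
})
summary = summarize_quota_error(sample_body)
print(summary["quota_location"], summary["quota_limit_value"])
```

A `quota_limit_value` of "0" suggests that the per-region limit itself is zero, in which case waiting or retrying would not be expected to help.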
