Grounding with Google Search requests on gemini-2.5-flash are counted under the "Gemini 3" rate limit category

Hi,

I’m experiencing an unexpected rate limit categorization issue with Grounding with Google Search on the Gemini API.

Environment

Issue

When I make generateContent requests with the google_search tool enabled using model gemini-2.5-flash, the grounding usage is counted under the “Gemini 3” category in the AI Studio Rate Limits dashboard, not under “Gemini 2.5”.

What I see in the Rate Limits dashboard:

┌────────────┬──────────────────┬──────────────────────────┐
│ Category │ Type │ Usage │
├────────────┼──────────────────┼──────────────────────────┤
│ Gemini 2.5 │ Search Grounding │ 793 / 5K │
├────────────┼──────────────────┼──────────────────────────┤
│ Gemini 3 │ Search Grounding │ 1,640 / 1,500 (exceeded) │
└────────────┴──────────────────┴──────────────────────────┘

What I expect:

All grounding requests should be counted under Gemini 2.5, since my code explicitly specifies gemini-2.5-flash as the model. I do not use any Gemini 3 model in my codebase.

Code snippet

const url = `https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent?key=${apiKey}`;

// Request body with the Google Search grounding tool enabled.
const body = {
  contents: [{ role: "user", parts: [{ text: prompt }] }],
  tools: [{ google_search: {} }],
};

const response = await fetch(url, {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify(body),
});
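
To check which model actually serves a request, the parsed response body can be inspected. The following is a minimal sketch, assuming the generateContent response carries a top-level `modelVersion` string and that each candidate may carry `groundingMetadata.webSearchQueries`. The helper name `summarizeGrounding` is hypothetical and not part of any SDK.

```javascript
// Hypothetical helper: report which model answered and how many search
// queries the grounding tool issued, given a parsed response body.
function summarizeGrounding(data) {
  const candidates = Array.isArray(data?.candidates) ? data.candidates : [];
  let searchQueryCount = 0;
  for (const candidate of candidates) {
    // Assumption: grounded candidates list their queries in webSearchQueries.
    const queries = candidate?.groundingMetadata?.webSearchQueries;
    if (Array.isArray(queries)) searchQueryCount += queries.length;
  }
  return {
    // Assumption: the field is named modelVersion; null when it is absent.
    modelVersion: data?.modelVersion ?? null,
    searchQueryCount,
  };
}
```

Calling `summarizeGrounding(await response.json())` after the fetch above and logging the result per request would show whether the reported model ever differs from gemini-2.5-flash, and roughly how many search queries each prompt triggers.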

Impact

  • The Gemini 3 grounding quota (1,500) has been exceeded, causing 429 RESOURCE_EXHAUSTED errors.
  • On Tier 3, the Gemini 2.5 grounding free tier is 1,500 RPD. If all requests were correctly categorized under Gemini 2.5, the combined total (~2,400) would still
    exceed the free tier but would not hit a hard rate limit at 1,500.
  • Additionally, this misclassification may result in incorrect billing, as Gemini 3 uses per-query billing while Gemini 2.5 uses per-prompt billing.
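
The figures behind these points can be checked with a few lines of arithmetic. This is only a sketch using the numbers from the dashboard table above; the variable names are illustrative.

```javascript
// Usage figures copied from the Rate Limits dashboard table.
const usage = { "Gemini 2.5": 793, "Gemini 3": 1640 };
const gemini3Limit = 1500;

// Total grounding requests if everything were counted under Gemini 2.5.
const combined = usage["Gemini 2.5"] + usage["Gemini 3"];
// How far the Gemini 3 bucket is over its limit.
const gemini3Overage = usage["Gemini 3"] - gemini3Limit;

console.log(combined);       // 2433
console.log(gemini3Overage); // 140
```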

Questions

  1. Is this a known issue with how grounding requests are categorized in the rate limit dashboard?
  2. Are gemini-2.5-flash grounding requests being internally routed through a Gemini 3 model?
  3. If this is a bug, is there a workaround to ensure requests are counted under the correct model category?

Any clarification would be greatly appreciated. Thank you.

There is no known issue around this, but I will have the team look into it. Could you share your project number so we can debug this further? Please feel free to send a direct message for privacy reasons.

@Lucia_Loher we are experiencing the exact same issue. What is the best way to remove this restriction? We are using grounding as the Gemini tool; is there any other way to avoid this rate limit?

I still wasn’t able to reproduce or debug this. @Taha_M, if you can share more details or your project number, I can forward them to the team to investigate more closely. We couldn’t find any issue with the system so far.