Batch API Gemini 2.5 Flash charges are abnormally excessive in AI Studio

Payment Account Number: 01B180-C4A819-FE4D1A

[Before, Aug~Oct]

Type: Batch API

Model: Gemini 2.5 Flash (Text)

Requests: ~200,000

SKU: BatchGenerate content output token count = 635,099,452

Billing: ~$758

TIER: 1

[Current, Nov]

Type: Batch API (same as above)

Model: Gemini 2.5 Flash (Text) (same as above)

Requests: only 2,000 (same input prompt as above)

SKU: BatchGenerate content output token count = 597,735,124

Billing: ~$750

TIER: 3

The only changes are the TIER and the number of requests. Everything else is completely identical.
Compared to the previous period, we made only 1% as many Batch API requests, yet the bill came out almost as high.
This seems to be because the BatchGenerate content output token count is abnormally high.
Since we used the same prompt, there is no reason to expect longer outputs, and no one else is using the API key.
I have registered a billing support case. Please explain why this result occurred.
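
For reference, the figures above can be turned into a per-request average and an effective price per million output tokens. This is only a rough sketch: the billing amounts are approximate ("~"), the request counts are rounded, the whole bill is attributed to output tokens, and the function name is made up for illustration.

```python
def per_request_and_rate(output_tokens, requests, billed_usd):
    """Return (average output tokens per request, USD per 1M output tokens)."""
    tokens_per_request = output_tokens / requests
    usd_per_million = billed_usd / output_tokens * 1_000_000
    return tokens_per_request, usd_per_million


# Figures copied from the post; billing and request counts are approximate.
before = per_request_and_rate(635_099_452, 200_000, 758)
current = per_request_and_rate(597_735_124, 2_000, 750)

print(f"Before:  {before[0]:,.0f} tokens/request, ${before[1]:.2f} per 1M tokens")
print(f"Current: {current[0]:,.0f} tokens/request, ${current[1]:.2f} per 1M tokens")
print(f"Tokens per request grew by a factor of about {current[0] / before[0]:.0f}")
```

Under these assumptions the effective rate per token is nearly the same in both periods, while the average output per request goes from roughly 3,175 tokens to roughly 298,868 tokens. So the question is about the output token count per request, not the unit price.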

Hi @user3391

Thank you for reaching out to us.
Could you please contact the Google Cloud Billing Support team?

I am working on it, but it’s hard to say what went wrong. Do I need to send you the billing support case number?

Hi @user3391, did you find a solution? If so, what was it?

This is typical of GCP billing practices. Make sure you set thinking_level to low or medium, because it defaults to high.

Hello, I haven’t found a solution yet, but the output appears to contain an extremely large number of tokens. The prompt I gave Gemini was slightly different, and I think that as a result the output listed every URL from the search. It was not the excessive “thinking” problem that another user mentioned.
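
One way to check whether the tokens come from visible output or from thinking is to total the usage metadata in the batch results file. The sketch below assumes each line of the results JSONL holds a `response` object with a `usageMetadata` block containing `candidatesTokenCount` and `thoughtsTokenCount`, and an optional `key` per request. Those names follow my understanding of the Gemini API response format and should be checked against your actual file. The function name and the sample lines are made up for illustration.

```python
import json


def summarize_usage(jsonl_lines, top_n=5):
    """Total visible-output and thinking tokens, and list the largest requests."""
    totals = {"requests": 0, "candidates": 0, "thoughts": 0}
    sizes = []
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        # Assumed layout: {"key": ..., "response": {"usageMetadata": {...}}}
        usage = record.get("response", {}).get("usageMetadata", {})
        candidates = usage.get("candidatesTokenCount", 0)
        thoughts = usage.get("thoughtsTokenCount", 0)
        totals["requests"] += 1
        totals["candidates"] += candidates
        totals["thoughts"] += thoughts
        sizes.append((candidates + thoughts, record.get("key")))
    # Sort on the token count only, so missing keys never get compared.
    sizes.sort(key=lambda item: item[0], reverse=True)
    return totals, sizes[:top_n]


# Made-up sample lines, not real billing data.
sample = [
    '{"key": "req-1", "response": {"usageMetadata": '
    '{"candidatesTokenCount": 3000, "thoughtsTokenCount": 200}}}',
    '{"key": "req-2", "response": {"usageMetadata": '
    '{"candidatesTokenCount": 290000, "thoughtsTokenCount": 500}}}',
]
totals, largest = summarize_usage(sample)
print(totals)
print(largest)
```

With a real results file you would pass the open file object instead of `sample`. If `candidates` dominates, the long visible output (for example a list of URLs) is the likely cause. If `thoughts` dominates, the thinking setting is worth a look.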