Thinking Tokens Counted, but Billed as Non-Thinking

papyrus · April 23, 2025, 10:55am

Hi there.

According to the following documentation, the Gemini 2.5 series automatically distinguishes between Thinking and Non-Thinking based on the input:

Gemini thinking | Gemini API | Google AI for Developers

Models with thinking capabilities are available in Google AI Studio and through the Gemini API. Thinking is on by default in both the API and AI Studio because the 2.5 series models have the ability to automatically decide when and how much to think based on the prompt.

Also, when I use Gemini 2.5 Flash (gemini-2.5-flash-preview-04-17) and check usage_metadata.thoughts_token_count in the response, I see values ranging from around 2,000 to 5,000. (When I set thinking_budget to 0, I confirmed that the value drops to 0 and the model becomes noticeably dumber.)
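For reference, here is a minimal sketch of how the thinking and answer token counts could be pulled out of a response for comparison against the billing report. The helper name summarize_output_tokens and the sample numbers are made up for illustration, and it assumes that candidates_token_count does not already include the thinking tokens, so treat the sum as an estimate rather than a billing figure.

```python
from types import SimpleNamespace


def summarize_output_tokens(usage):
    """Split output usage into thinking and answer tokens.

    `usage` is any object shaped like response.usage_metadata. Fields may be
    None (for example when no thinking happened), so None is treated as 0.
    """
    thinking = getattr(usage, "thoughts_token_count", None) or 0
    answer = getattr(usage, "candidates_token_count", None) or 0
    return {
        "thinking_tokens": thinking,
        "answer_tokens": answer,
        # Assumption: candidates_token_count excludes the thinking tokens.
        "output_tokens_incl_thinking": thinking + answer,
    }


# Stand-in for response.usage_metadata, with made-up numbers.
example = SimpleNamespace(thoughts_token_count=2500, candidates_token_count=400)
print(summarize_output_tokens(example))
```

With a real response, the same helper would be called as summarize_output_tokens(response.usage_metadata).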

However, in the Google Cloud billing report, all outputs are listed as “Generate content output token count gemini 2.5 flash short output text non-thinking”.

Why is this happening? Has it just not been billed as “thinking” yet, even though it’s been three days?

sot

from google import genai
from google.genai import types


class Test:
    def __init__(self, gemini_api_key: str):
        self.API_KEY = gemini_api_key
        self.client = genai.Client(api_key=gemini_api_key)

    def talk(self, message):
        # Request a response with an explicit thinking budget of 10,000 tokens.
        r = self.client.models.generate_content(
            model="gemini-2.5-flash-preview-04-17",
            contents=message,
            config=types.GenerateContentConfig(
                thinking_config=types.ThinkingConfig(thinking_budget=10000)
            ),
        )
        # Return the response so the caller can inspect r.text and
        # r.usage_metadata.thoughts_token_count.
        return r
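
To check that thinking is actually switched on and off by the budget, the same prompt could be run with several thinking_budget values and the reported thoughts_token_count compared. This is only a sketch: thoughts_by_budget is a hypothetical helper, the client is expected to be a genai.Client like the one above, and the config is passed as a plain dict instead of types.GenerateContentConfig to keep the snippet free of imports.

```python
def thoughts_by_budget(client, model, prompt, budgets=(0, 10000)):
    """Return {thinking_budget: thoughts_token_count} for one prompt."""
    results = {}
    for budget in budgets:
        response = client.models.generate_content(
            model=model,
            contents=prompt,
            # Plain dict used here in place of types.GenerateContentConfig.
            config={"thinking_config": {"thinking_budget": budget}},
        )
        usage = response.usage_metadata
        # thoughts_token_count may be None when the model did not think.
        results[budget] = getattr(usage, "thoughts_token_count", None) or 0
    return results
```

Called as thoughts_by_budget(client, "gemini-2.5-flash-preview-04-17", "some prompt"), a budget of 0 should report 0 thinking tokens if the setting is honored.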

papyrus · April 24, 2025, 8:25am

Solved: the “Generate content output token count gemini 2.5 flash short input text” line item was eventually added to the billing report, just quite late.
