How are “short input”, “long input”, and “cached input” token costs calculated for Gemini 2.5 Flash?

I’m using the Gemini 2.5 Flash model via the Gemini API (Google Cloud billing) and I’m trying to understand how my usage is being broken down and billed.

In the Billing → Reports view I see multiple SKUs for the same day, for example:

  • Generate content input token count gemini 2.5 flash short input text

  • Generate content output token count gemini 2.5 flash short input text

  • Generate content input token count gemini 2.5 flash long input text

  • Generate content cached input token count gemini 2.5 flash short input text

  • Generate content cached input token count gemini 2.5 flash long input text

I have two questions:

  1. What is the exact threshold that decides whether a request is billed as “short input text” vs “long input text”?

    • Is it based on total input tokens per request?

    • If yes, what is the cutoff number of tokens for Gemini 2.5 Flash?

  2. How are the “cached input token count” SKUs calculated?

    • Under what conditions are tokens counted as cached input?

    • Are cached tokens billed at a different rate, and how can I estimate that cost from my side when calling the API?

My goal is to reproduce these costs on my end (given input/output token counts per request) and to understand when my prompts will fall into “short”, “long”, and “cached” buckets.

If there’s an official doc or example that explains this mapping in detail, a link to that would be very helpful.

Thanks in advance!

Hello,

Here is a clarification regarding input types and caching:

  • Short vs. Long Input: The distinction between “short input” and “long input” for Gemini models is determined by the total input token count of the request: prompts of 200,000 tokens or fewer are billed under the “short input” SKUs, and larger prompts under the “long input” SKUs. Depending on the model, the two buckets may be billed at different rates; for gemini-2.5-flash, however, the price is the same for both short and long inputs.
  • Cached Input: This refers to tokens that the model does not need to re-process because they were stored from a previous call (either automatically or manually).
    • Implicit Caching: This automatically reduces costs when your requests share a long common prefix with earlier requests. There are no storage fees associated with this.
    • Explicit Caching: This involves manually creating a cache. You typically pay a one-time “initialization” fee (at the standard input rate) and a per-hour storage fee; on subsequent calls, the cached tokens are then charged at a discounted rate.
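To reproduce the per-request cost on your side, here is a minimal sketch. The per-million-token rates below are placeholders I am assuming for illustration — substitute the current values from the official pricing page. The cached token count corresponds to the `cached_content_token_count` field the API returns in each response’s usage metadata; since gemini-2.5-flash charges short and long input at the same rate, no 200,000-token branch is needed here:

```python
# Placeholder USD rates per 1M tokens -- ASSUMED values, check the
# official Gemini API pricing page before relying on them.
PRICE_PER_M_INPUT = 0.30    # non-cached input tokens
PRICE_PER_M_CACHED = 0.075  # cached input tokens (discounted rate)
PRICE_PER_M_OUTPUT = 2.50   # output tokens

def estimate_request_cost(prompt_tokens: int,
                          output_tokens: int,
                          cached_tokens: int = 0) -> float:
    """Estimate the cost of one generateContent call in USD.

    cached_tokens is the portion of prompt_tokens served from the
    context cache (reported as cached_content_token_count in the
    response's usage metadata). Those tokens are billed at the cached
    rate; the remainder at the standard input rate. Storage fees for
    explicit caches are billed separately per hour and not included.
    """
    fresh_input = prompt_tokens - cached_tokens
    return (fresh_input * PRICE_PER_M_INPUT
            + cached_tokens * PRICE_PER_M_CACHED
            + output_tokens * PRICE_PER_M_OUTPUT) / 1_000_000

# Example: a 10,000-token prompt of which 8,000 tokens hit the cache,
# producing 500 output tokens.
cost = estimate_request_cost(10_000, 500, cached_tokens=8_000)
```

With the assumed rates, that example works out to (2,000 × 0.30 + 8,000 × 0.075 + 500 × 2.50) / 1,000,000 ≈ $0.00245, versus $0.00425 with no cache hit — which is how the discounted cached rate shows up as a separate SKU in your billing report.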

