Gemini 2.5 Pro Context Cache Pricing: Per-request vs Cumulative Token Counting?

Hi everyone,

I’m looking for clarification on the Context Cache pricing structure for Gemini 2.5 Pro as outlined in the official documentation.

According to the pricing page, the Context Cache fees are:

  • $0.31 per 1M tokens for prompts ≤ 200k tokens
  • $0.625 per 1M tokens for prompts > 200k tokens
  • $4.50 per 1M tokens per hour (storage price)

My question is about how the 200k token threshold is calculated:

Is the token count calculated per individual API request, or is it cumulative across all requests under the same API key?

For example, if I make multiple requests:

  • Request 1: 150k tokens
  • Request 2: 180k tokens
  • Request 3: 250k tokens

Would Request 3 be charged at the higher rate ($0.625) because it individually exceeds 200k tokens, or would Request 2 also be charged at the higher rate because the cumulative total (150k + 180k = 330k) exceeds 200k?
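To make the cost difference concrete, here's a rough sketch of how I'm modelling the two interpretations (the rates are the ones quoted above; the function names and the modelling itself are just mine, not anything from the official SDK or billing system):

```python
# Cache-write rates from the Gemini 2.5 Pro pricing page, in USD per 1M tokens.
RATE_SMALL = 0.31     # prompts <= 200k tokens
RATE_LARGE = 0.625    # prompts > 200k tokens
THRESHOLD = 200_000

requests = [150_000, 180_000, 250_000]  # the three example requests above

def cost_per_request(tokens: int) -> float:
    """Interpretation A: the 200k threshold is checked per individual request."""
    rate = RATE_SMALL if tokens <= THRESHOLD else RATE_LARGE
    return tokens / 1_000_000 * rate

def cost_cumulative(token_counts: list[int]) -> float:
    """Interpretation B: the threshold is checked against the running total
    of tokens sent under the same API key."""
    total, running = 0.0, 0
    for tokens in token_counts:
        running += tokens
        rate = RATE_SMALL if running <= THRESHOLD else RATE_LARGE
        total += tokens / 1_000_000 * rate
    return total

print(sum(cost_per_request(t) for t in requests))  # A: ~$0.2586 (only Request 3 at the higher rate)
print(cost_cumulative(requests))                   # B: ~$0.3153 (Requests 2 and 3 at the higher rate)
```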

I want to make sure I understand the billing logic correctly for planning our usage and costs.

Any clarification from the community or Google team would be greatly appreciated!

Thanks in advance!

It’s calculated per individual API request (each message sent as part of a chat conversation flow counts as its own request).
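So, applying that to the example above: only Request 3 would be billed at the $0.625 rate, since it alone exceeds 200k tokens; Requests 1 and 2 each stay at $0.31 regardless of the running total under the API key.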
