Gemini 2.5 Pro Context Cache Pricing: Per-request vs Cumulative Token Counting?

Hi everyone,

I’m looking for clarification on the Context Cache pricing structure for Gemini 2.5 Pro as outlined in the official documentation.

According to the pricing page, the Context Cache fees are:

  • $0.31 per 1M tokens for prompts ≤ 200k tokens
  • $0.625 per 1M tokens for prompts > 200k tokens
  • $4.50 per 1M tokens per hour (storage price)

My question is about how the 200k token threshold is calculated:

Is the token count calculated per individual API request, or is it cumulative across all requests under the same API key?

For example, if I make multiple requests:

  • Request 1: 150k tokens
  • Request 2: 180k tokens
  • Request 3: 250k tokens

Would Request 3 be charged at the higher rate ($0.625) because it individually exceeds 200k tokens, or would Request 2 also be charged at the higher rate because the cumulative total (150k + 180k = 330k) exceeds 200k?
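To make the cost difference concrete, here's a rough sketch of how I'm modelling the two interpretations (the rates are the ones quoted above; the function names and the modelling itself are just mine, not anything from the official SDK or billing system):

```python
# Cache-write rates from the Gemini 2.5 Pro pricing page, in USD per 1M tokens.
RATE_SMALL = 0.31     # prompts <= 200k tokens
RATE_LARGE = 0.625    # prompts > 200k tokens
THRESHOLD = 200_000

requests = [150_000, 180_000, 250_000]  # the three example requests above

def cost_per_request(tokens: int) -> float:
    """Interpretation A: the 200k threshold is checked per individual request."""
    rate = RATE_SMALL if tokens <= THRESHOLD else RATE_LARGE
    return tokens / 1_000_000 * rate

def cost_cumulative(token_counts: list[int]) -> float:
    """Interpretation B: the threshold is checked against the running total
    of tokens sent under the same API key."""
    total, running = 0.0, 0
    for tokens in token_counts:
        running += tokens
        rate = RATE_SMALL if running <= THRESHOLD else RATE_LARGE
        total += tokens / 1_000_000 * rate
    return total

print(sum(cost_per_request(t) for t in requests))  # A: ~$0.2586 (only Request 3 at the higher rate)
print(cost_cumulative(requests))                   # B: ~$0.3153 (Requests 2 and 3 at the higher rate)
```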

I want to make sure I understand the billing logic correctly for planning our usage and costs.

Any clarification from the community or Google team would be greatly appreciated!

Thanks in advance!

It’s calculated per individual API request (each message sent as part of a chat conversation flow counts as its own request).
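So, applying that to the example above: only Request 3 would be billed at the $0.625 rate, since it alone exceeds 200k tokens; Requests 1 and 2 each stay at $0.31 regardless of the running total under the API key.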
