Hi everyone,
I’m looking for clarification on the Context Cache pricing structure for Gemini 2.5 Pro as outlined in the official documentation.
According to the pricing page, the Context Cache fees are:
- $0.31 per 1M tokens for prompts ≤ 200k tokens
- $0.625 per 1M tokens for prompts > 200k tokens
- $4.50 per 1M tokens per hour (storage price)
My question is about how the 200k token threshold is calculated:
Is the token count calculated per individual API request, or is it cumulative across all requests under the same API key?
For example, if I make multiple requests:
- Request 1: 150k tokens
- Request 2: 180k tokens
- Request 3: 250k tokens
Would only Request 3 be charged at the higher rate ($0.625) because it individually exceeds 200k tokens? Or would Request 2 also be charged at the higher rate because the cumulative total at that point (150k + 180k = 330k) exceeds 200k?
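To make the two readings concrete, here is a small sketch of how I would compute the rate tier under each one. The function names are my own, the rates are copied from the list above, and neither function is confirmed to match Google's actual billing logic; that is exactly what I am asking about.

```python
# Rates quoted from the pricing page above (USD per 1M tokens).
LOW_RATE = 0.31       # prompts <= 200k tokens
HIGH_RATE = 0.625     # prompts > 200k tokens
THRESHOLD = 200_000

def tiers_per_request(requests):
    """Reading A (hypothetical): each request is compared to the threshold on its own."""
    return [HIGH_RATE if tokens > THRESHOLD else LOW_RATE for tokens in requests]

def tiers_cumulative(requests):
    """Reading B (hypothetical): the running total across requests is compared to the threshold."""
    rates, total = [], 0
    for tokens in requests:
        total += tokens
        rates.append(HIGH_RATE if total > THRESHOLD else LOW_RATE)
    return rates

requests = [150_000, 180_000, 250_000]
print(tiers_per_request(requests))  # [0.31, 0.31, 0.625]
print(tiers_cumulative(requests))   # [0.31, 0.625, 0.625]
```

Under reading A only Request 3 lands in the higher tier, while under reading B Requests 2 and 3 both do, so the answer changes our cost estimate noticeably.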
I want to make sure I understand the billing logic correctly for planning our usage and costs.
Any clarification from the community or Google team would be greatly appreciated!
Thanks in advance!