Gemini explicit context cache creation billing

m16khb · June 22, 2026, 5:21am

Hi,

I need to confirm how Vertex AI / Gemini bills caches.create for explicit context caching.

The documentation says:

For both implicit and explicit caching, you’re billed for the input tokens used to create the cache at the standard input token price. For explicit caching, there are also storage costs based on how long caches are stored.

Reference:

Does “also storage costs” mean explicit cache creation is charged as both standard input tokens and TTL storage, or is only storage charged at creation time?

Example:

Model: gemini-2.5-pro
Cached tokens: 100,000
TTL: 5 minutes
Standard input: $1.25 / 1M tokens
Cached input: $0.13 / 1M tokens
Storage: $4.50 / 1M tokens per hour

Interpretation A

caches.create is billed as:

standard input token cost for the cached tokens
plus storage cost for the TTL

So for 100,000 tokens and 5 minutes:

create input = 0.1M tokens * $1.25 per 1M = $0.125
storage      = 0.1M tokens * $4.50 per 1M per hour * (5 / 60) hours = $0.0375

total cache creation/storage cost = $0.1625 before any cachedContent read

Interpretation B

caches.create is billed only as storage:

storage = 0.1M tokens * $4.50 per 1M per hour * (5 / 60) hours = $0.0375

No separate standard input token charge is applied at cache creation time.
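
To make the two interpretations easy to compare, here is a minimal sketch of the arithmetic in Python. It uses only the prices quoted in my example above, not values fetched from any price list. The function name `creation_cost` and the `charge_input` flag are my own for illustration and are not part of any Gemini or Vertex AI API.

```python
# Prices quoted in this post (USD); assumed for illustration only.
STANDARD_INPUT_PER_M = 1.25   # per 1M input tokens
STORAGE_PER_M_HOUR = 4.50     # per 1M tokens per hour of storage


def creation_cost(tokens: int, ttl_minutes: float, charge_input: bool) -> float:
    """Estimate the cost incurred by caches.create before any cached read.

    charge_input=True  models Interpretation A (input tokens + storage).
    charge_input=False models Interpretation B (storage only).
    """
    millions = tokens / 1_000_000
    storage = millions * STORAGE_PER_M_HOUR * (ttl_minutes / 60)
    create_input = millions * STANDARD_INPUT_PER_M if charge_input else 0.0
    return create_input + storage


cost_a = creation_cost(100_000, 5, charge_input=True)
cost_b = creation_cost(100_000, 5, charge_input=False)
print(f"Interpretation A: ${cost_a:.4f}")
print(f"Interpretation B: ${cost_b:.4f}")
```

This assumes storage is billed for the full TTL; if storage is billed only for the time the cache actually exists, the storage term would shrink accordingly.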

My question is only about the cache creation step. I understand that later generateContent requests using cachedContent are billed at the cached input price for cachedContentTokenCount.
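
For completeness, this is how I am modelling the later read step, again as a sketch under the prices in my example. The helper name `cached_read_cost` is hypothetical; only `cachedContentTokenCount` comes from the API response.

```python
# Cached input price quoted in this post (USD per 1M tokens); assumed for illustration.
CACHED_INPUT_PER_M = 0.13


def cached_read_cost(cached_content_token_count: int) -> float:
    """Estimate the cached-input charge for one generateContent call.

    Covers only the cached portion of the prompt; any non-cached input
    tokens and all output tokens would be billed separately.
    """
    return cached_content_token_count / 1_000_000 * CACHED_INPUT_PER_M


read_cost = cached_read_cost(100_000)
print(f"Cached read: ${read_cost:.4f}")
```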

Does caches.create charge both standard input tokens and explicit cache storage, or storage only?

If Interpretation A is correct:

Is CachedContent.usageMetadata.totalTokenCount the token count used for both cache creation input billing and storage token-hour billing?
What SKU names should I expect in Cloud Billing export or Cost Table for the standard input charge and the explicit cache storage charge?

Thanks.
