Hello, thank you for the detailed and well-formulated question.
To answer your main question: The charge of $0.125 / 1M tokens applies each time you use the cached prompt in a request. It is not a one-time fee but the discounted input rate for the cachedContentTokenCount.
Think of the cache price not as an extra fee, but as a large discount for reusing your prompt.
Your second scenario, where the charge applies for every request, is the correct interpretation.
- Cached Input: The cachedContentTokenCount is billed at the discounted rate of $0.125 / 1M tokens.
- Standard Input: The new tokens (promptTokenCount - cachedContentTokenCount) are billed at the standard input rate of $1.25 / 1M tokens.
- Output: The total output cost is based on the sum of candidatesTokenCount and thoughtsTokenCount, billed at the standard output rate.
Therefore, your interpretation is correct. The total cost for a single API call is the sum of those three parts, while the separate $4.50 / 1M tokens/hour fee is charged for storing the cache between calls.
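To make the arithmetic concrete, here is a minimal sketch of the per-request and storage cost calculations using the rates quoted above. The helper names and the example token counts are hypothetical, and the output rate is a placeholder since only "the standard output rate" is mentioned; substitute your model's actual rate.

```python
# Rates in $ per 1M tokens, taken from the answer above.
STANDARD_INPUT_RATE = 1.25    # new (non-cached) input tokens
CACHED_INPUT_RATE = 0.125     # cachedContentTokenCount
STORAGE_RATE_PER_HOUR = 4.50  # cache storage, per 1M tokens per hour
OUTPUT_RATE = 10.00           # placeholder: use your model's actual output rate

def request_cost(prompt_tokens: int, cached_tokens: int,
                 candidates_tokens: int, thoughts_tokens: int,
                 output_rate: float = OUTPUT_RATE) -> float:
    """Cost of a single API call, excluding the hourly storage fee."""
    # Standard input = total prompt tokens minus the cached portion.
    new_input_cost = (prompt_tokens - cached_tokens) / 1_000_000 * STANDARD_INPUT_RATE
    cached_cost = cached_tokens / 1_000_000 * CACHED_INPUT_RATE
    # Output is billed on candidates + thoughts combined.
    output_cost = (candidates_tokens + thoughts_tokens) / 1_000_000 * output_rate
    return new_input_cost + cached_cost + output_cost

def storage_cost(cached_tokens: int, hours: float) -> float:
    """Separate fee for keeping the cache alive between calls."""
    return cached_tokens / 1_000_000 * STORAGE_RATE_PER_HOUR * hours

# Example: a 100k-token cached prompt, 2k new input tokens,
# 1k output tokens plus 500 thought tokens.
per_call = request_cost(102_000, 100_000, 1_000, 500)
```

With these (illustrative) numbers, the cached portion costs $0.0125, the new input $0.0025, and the output $0.015 at the placeholder rate, so the cached prompt dominates the token count but not the bill.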