Hello, I’m experiencing an issue where implicit caching is not applied when calling gemini-2.5-pro via the API. According to the documentation, implicit caching should activate once the input exceeds 2,048 tokens, but I’m not seeing any caching even with prompt token counts above 30k.
To verify whether caching is working, I’ve run multiple tests with identical prompts, data, and structure, but prompt caching never activates. Every request returns usage metadata like the following, with no cache hits:
```
cache_tokens_details=None
cached_content_token_count=None
candidates_token_count=12841
candidates_tokens_details=None
prompt_token_count=30151
prompt_tokens_details=[
  ModalityTokenCount(modality=<MediaModality.TEXT: 'TEXT'>, token_count=2545),
  ModalityTokenCount(modality=<MediaModality.IMAGE: 'IMAGE'>, token_count=27606)
]
thoughts_token_count=9003
tool_use_prompt_token_count=None
tool_use_prompt_tokens_details=None
total_token_count=51995
traffic_type=None
```
I have several questions regarding this issue:
- Is this a known ongoing issue? There have been previous reports in this forum of implicit caching problems with Gemini 2.5 Pro. Has that issue been resolved, or is it still active?
- Modality requirements clarification: My input includes a significant number of Base64-encoded images. When the documentation says caching activates above 2,048 tokens, does that threshold count TEXT-modality tokens only, or tokens of all modalities? In my case I have 2,545 text tokens and 27,606 image tokens.
- Model-specific behavior: For reference, I’ve confirmed that gemini-2.5-flash does trigger implicit caching under the same settings and configuration.
This issue has significant cost implications for our production environment, since we’re not getting the expected caching discount on large prompts. Any confirmation that this is a known issue, or guidance on correct implementation, would be greatly appreciated.