Implicit context caching stops working when a thinking budget is set – usage_metadata.cached_content_token_count becomes None

Hi

I’m seeing unexpected behavior when combining Gemini’s implicit context caching with the thinking_budget parameter, and I’d like to confirm whether this is intentional or a bug.

response = self.client.models.generate_content_stream(
    model=self.model,
    contents=conv,
    config=types.GenerateContentConfig(
        # cached_content=cache_info["cache_id"] if cache_info else None,
        temperature=request.temperature or self.temperature,
        max_output_tokens=request.max_tokens if request.max_tokens else self.max_tokens,
        thinking_config=types.ThinkingConfig(
            include_thoughts=True,
            thinking_budget=thinking_budget,
        ) if thinking_budget else None,
        **self.default_params,
    ),
)
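For context, this is roughly how I read the cache-hit count from the streamed usage metadata (`cached_content_token_count` is the field name on `usage_metadata` in the google-genai SDK; the helper and the stub chunks below are just an illustration of the two shapes I'm seeing, not real SDK objects):

```python
from types import SimpleNamespace

def cached_tokens(usage_metadata) -> int:
    """Return the implicit-cache hit count, treating a missing or None field as 0."""
    if usage_metadata is None:
        return 0
    return getattr(usage_metadata, "cached_content_token_count", None) or 0

# Stubs mimicking what I observe: without a thinking budget the field carries
# a count; with thinking_budget set it comes back as None on every chunk.
without_budget = SimpleNamespace(cached_content_token_count=1024)
with_budget = SimpleNamespace(cached_content_token_count=None)

print(cached_tokens(without_budget))  # 1024
print(cached_tokens(with_budget))     # 0
```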

Hi @user2302 ,

Welcome to the Forum.
Could you please let us know which Gemini model you are using, so we can try to reproduce this?