Hey,
I am seeking clarity on the interaction between explicit and implicit caching in the Gemini API, specifically regarding the cachedContentTokenCount field in usage metadata.
In my tests:
When explicit caching (context cache) is enabled, cachedContentTokenCount reports only the explicitly cached tokens.
When explicit caching is disabled, the field reflects tokens hit via implicit caching, as expected.
However, this means I can’t see any implicit cache benefit when explicit caching is turned on, even for repeated prompt prefixes. Is this the intended behavior, or is it a bug or limitation of the API? Are implicit and explicit caching meant to work together on the same request (e.g., a request that combines an explicitly cached prefix with new user content)?
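To make the comparison above easier to reproduce, here is a minimal sketch of how the two counts could be compared per request. It assumes the usage metadata is available as a plain dict using the REST field names promptTokenCount and cachedContentTokenCount, and that promptTokenCount includes the cached tokens (my understanding of the docs). The helper name cache_summary and the numbers are purely illustrative, not real API output.

```python
def cache_summary(usage: dict) -> dict:
    """Summarize cached vs. uncached prompt tokens from a usageMetadata dict."""
    prompt = usage.get("promptTokenCount", 0)
    # Assumption: the field is simply absent when no tokens were served from cache.
    cached = usage.get("cachedContentTokenCount", 0)
    return {
        "prompt": prompt,
        "cached": cached,
        "uncached": prompt - cached,
        "cached_ratio": cached / prompt if prompt else 0.0,
    }


# Made-up numbers for illustration only.
with_cache_hit = cache_summary(
    {"promptTokenCount": 5000, "cachedContentTokenCount": 4000}
)
without_cache_hit = cache_summary({"promptTokenCount": 5000})

print(with_cache_hit)
print(without_cache_hit)
```

Logging this summary for the same prompt prefix with and without an explicit cache attached should show whether any tokens beyond the explicit cache are ever counted as cached.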
I’d appreciate clarification or a pointer to the relevant documentation. Has anyone else seen this, or is there official guidance on the expected API behavior in this scenario?