Hi
I have a question regarding explicit caching. Does explicit caching do KV caching, or something else? If it does KV caching, can we not just rely on implicit caching? What is the advantage of explicit caching if implicit caching already does this?
Hi @Rajib_Deb, welcome to the forum!
Yes, both mechanisms rely on KV caching. With Explicit Caching, you pass content to the model once, cache the input tokens, and then refer to that cached context for subsequent requests.
This gives you control over persistence. It becomes cost-effective once the same content is reused across enough requests, because reading cached tokens is cheaper than passing the same full corpus repeatedly.
In contrast, Implicit Caching is enabled by default (often with a minimum token limit, depending on the model) but offers no guarantees that the cache will persist between requests.
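To make the cost trade-off concrete, here is a rough sketch of a break-even calculation in Python. Everything in it is an assumption for illustration: the function names are hypothetical, the prices are made-up placeholders rather than real rates, and the billing model (cache creation billed at the normal input price, cached reads at a discount, plus a storage fee per token-hour) may not match your model's actual pricing. It also counts only the shared corpus tokens, not the per-request prompt or the output.

```python
# Hypothetical cost model; all prices are placeholders, not real rates.
PER_MILLION = 1_000_000


def cost_without_cache(corpus_tokens, requests, input_price):
    """Cost of resending the full corpus on every request."""
    return corpus_tokens * requests * input_price / PER_MILLION


def cost_with_explicit_cache(corpus_tokens, requests, input_price,
                             cached_price, storage_price, hours):
    """Cost of caching the corpus once and reading it on every request.

    Assumes: one full-price pass to create the cache, discounted cached
    reads, and a storage fee per million tokens per hour.
    """
    create = corpus_tokens * input_price / PER_MILLION
    reads = corpus_tokens * requests * cached_price / PER_MILLION
    storage = corpus_tokens * storage_price * hours / PER_MILLION
    return create + reads + storage


def break_even_requests(corpus_tokens, input_price, cached_price,
                        storage_price, hours, max_requests=10_000):
    """Smallest request count at which explicit caching is cheaper, or None."""
    for n in range(1, max_requests + 1):
        cached = cost_with_explicit_cache(corpus_tokens, n, input_price,
                                          cached_price, storage_price, hours)
        if cached < cost_without_cache(corpus_tokens, n, input_price):
            return n
    return None


# Placeholder scenario: 100k-token corpus kept cached for one hour.
n = break_even_requests(corpus_tokens=100_000, input_price=1.00,
                        cached_price=0.25, storage_price=1.00, hours=1)
print(f"Explicit caching is cheaper from {n} requests onward")
```

With these placeholder numbers, caching pays off from the third request within the hour. With real prices the threshold will differ, and a longer storage window pushes it higher.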
Please refer here for complete details.
Thank you!