As you all know, cached input tokens are way cheaper than non-cached ones.
I’m building an app where uploading a video or two for context greatly improves the accuracy of my outputs, but I want to cache that context to save on the video-understanding input tokens. The Files API has a TTL of 48 hours. Other options are a GCS bucket (seems hard to integrate) or a public URL (like YouTube). I’m unsure whether content that comes in via a public URL will actually have its tokens cached. How does that “auto” logic work?
While implicit caching (i.e., automatic caching) is best-effort and not guaranteed, with explicit caching you can create a cached object, set its TTL, and reuse it multiple times.
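For reference, here is a minimal sketch of that explicit-caching flow, assuming the `google-genai` Python SDK and an API key in the environment. The helper names (`ttl_string`, `create_video_cache`, `ask`) are made up for illustration, and the model name is an assumption, so swap in whichever version-pinned model you actually use. It only covers a video uploaded through the Files API; it does not show whether content from a public URL can be cached.

```python
import time


def ttl_string(hours: float) -> str:
    """Format a lifetime in hours as a seconds string such as "3600s"."""
    return f"{int(hours * 3600)}s"


def create_video_cache(client, video_path: str, model: str, ttl_hours: float = 1.0) -> str:
    """Upload a video, wait for processing, and create an explicit cache.

    Returns the cache name, which can be reused across many requests.
    """
    # Imported lazily so ttl_string() works without the SDK installed.
    from google.genai import types

    video = client.files.upload(file=video_path)
    # Uploaded videos are processed asynchronously; poll until done.
    while video.state.name == "PROCESSING":
        time.sleep(2)
        video = client.files.get(name=video.name)

    cache = client.caches.create(
        model=model,
        config=types.CreateCachedContentConfig(
            contents=[video],
            ttl=ttl_string(ttl_hours),  # the cache's own TTL, set at creation
        ),
    )
    return cache.name


def ask(client, cache_name: str, model: str, question: str) -> str:
    """Send a question that reuses the cached video context."""
    from google.genai import types

    response = client.models.generate_content(
        model=model,
        contents=question,
        config=types.GenerateContentConfig(cached_content=cache_name),
    )
    # Reports how many input tokens were served from the cache.
    print("cached tokens:", response.usage_metadata.cached_content_token_count)
    return response.text


# Example usage (needs network access and an API key):
#   from google import genai
#   client = genai.Client()
#   model = "gemini-2.0-flash-001"  # assumption: pick your own model version
#   name = create_video_cache(client, "demo.mp4", model, ttl_hours=2)
#   print(ask(client, name, model, "Summarize the video."))
#   print(ask(client, name, model, "List the key timestamps."))
```

The cache is created once and its name is passed to every later request, which is what makes the reuse explicit rather than best-effort.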
That still doesn’t really answer my question, though I appreciate the response!
Unless I’m reading things wrong, the Files API still has a max TTL of 48 hours, so the question remains: does context that comes in via a public URL actually get cached, and can it be cached? None of that is specified in the link you attached.