Gemini 2.0 Cached content minimum size is too large

I am currently on 1.5-flash-002, and cached contents are working perfectly for my application. I tried upgrading to Gemini 2.0 Flash and am now getting an error notifying me that my cached content is too small. Why are we now being forced to only send cached contents of a certain size (min 4096)? I do my best to reduce the size of my inputs to allow for the best results possible (I find that larger inputs confuse the model at times) and to minimize costs. I would prefer not to have to send my system instructions with every request.

Hey @Brian_T, thanks for your feedback. Unfortunately, the 2.0-flash model requires a minimum input size of 4096 tokens for explicit context caching.
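If it helps, here is a rough sketch of a pre-flight check you could run before trying to create a cache, so that you fall back to sending the system instructions inline instead of hitting the error. The helper name, the lookup table, and the model ID string `gemini-2.0-flash` are my own assumptions for illustration. Only the 4096 figure comes from this thread, and the token count is passed in as a plain integer that you would get from whatever token-counting call you already use.

```python
from typing import Optional

# Minimum token counts for explicit context caching, keyed by model ID.
# Only the 2.0 Flash value (4096) is taken from this thread; the model ID
# string is an assumption, and other models should be filled in from the docs.
MIN_CACHE_TOKENS = {"gemini-2.0-flash": 4096}


def can_cache_explicitly(model: str, token_count: int) -> Optional[bool]:
    """Return True/False if the minimum for `model` is known, else None."""
    minimum = MIN_CACHE_TOKENS.get(model)
    if minimum is None:
        # Unknown model: no basis for a decision, so let the caller choose.
        return None
    return token_count >= minimum


if __name__ == "__main__":
    # Hypothetical 1200-token system prompt: too small to cache on 2.0 Flash,
    # so the caller would send the instructions with the request instead.
    if can_cache_explicitly("gemini-2.0-flash", 1200):
        print("create a cache")
    else:
        print("send system instructions inline")
```

Returning `None` for unknown models is a deliberate choice here, so that a missing table entry is not silently treated as "too small".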

Alternatively, you could consider using the 2.5 models, which support implicit context caching and require fewer tokens for caching.

Hi @GUNAND_MAYANGLAMBAM, thanks for your response. 2.5 flash seems to still be in preview? Is there a timeline for the official release?

It should arrive within the next few weeks.