Does the OpenAI format support context caching?

Hello,

The new OpenAI API format is very convenient – thank you for adding it! Does it support context caching?

I was unable to find the answer in (a) the docs, (b) a search of this forum, or (c) the internet, so I am asking here.

Thanks!

Yes, it’s documented here:

Ahh, sorry, that’s for the Gemini API directly; you’re referring to the OpenAI-compatible API.

No, in that case I haven’t seen that feature yet. Perhaps it’s on the roadmap for the compatibility layer. For now, I guess we have to be patient.


Hi @lmk

The OpenAI-compatible API does not provide explicit context caching. However, caching is automatically applied for prompts that are 1024 tokens or longer. If you need explicit context caching, consider using the Gemini API directly.
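For reference, here is a minimal sketch of explicit caching through the Gemini API using the google-genai Python SDK. The model name, TTL, and document content are placeholders, and the cached prefix must meet the documented minimum token count:

```python
from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_GEMINI_API_KEY")

# Create the cache once; subsequent calls that reference it reuse the
# cached prefix until the TTL expires.
cache = client.caches.create(
    model="gemini-1.5-flash-001",  # placeholder model name
    config=types.CreateCachedContentConfig(
        system_instruction="Answer questions about the attached document.",
        contents=["<large document text, above the minimum token count>"],
        ttl="300s",  # keep the cache for 5 minutes
    ),
)

# Later requests reference the cache instead of resending the prefix.
response = client.models.generate_content(
    model="gemini-1.5-flash-001",
    contents="Summarize the introduction.",
    config=types.GenerateContentConfig(cached_content=cache.name),
)
print(response.text)
```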

Thanks

Apologies, I’m a bit confused. It sounds like you’re saying that context caching is automatically applied for prompts of 1024 tokens or longer sent to the OpenAI-compatible API for Gemini models?

Is there any documentation about this, and what is the price reduction? And is this completely separate from explicit context caching?

As far as I can see from the docs, context caching is only supported for prompts longer than 32k tokens.
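In case it helps, this is how I’d try to verify it: a rough sketch, assuming the compatibility endpoint mirrors OpenAI’s usage fields (I haven’t confirmed that it does):

```python
from openai import OpenAI

# Gemini's documented OpenAI-compatible endpoint.
client = OpenAI(
    api_key="GEMINI_API_KEY",
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/",
)

completion = client.chat.completions.create(
    model="gemini-1.5-flash",  # placeholder model name
    messages=[{"role": "user", "content": "<a prompt of 1024+ tokens>"}],
)

# On OpenAI's own API, cached prompt tokens are reported here; whether
# the compatibility layer populates this field is exactly my question.
details = completion.usage.prompt_tokens_details
print(details.cached_tokens if details else "no prompt_tokens_details returned")
```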