Does the OpenAI format support context caching?

Hello,

The new OpenAI API format is very convenient – thank you for adding it! Does it support context caching?

I was unable to find the answer in (a) the docs, (b) a search of this forum, or (c) the internet, so I am asking here.

Thanks!

Yes, it’s documented here:

Ahh, sorry, that’s for the Gemini API directly; you’re referring to the OpenAI-compatible API.

No, in that case I haven’t seen that feature yet. Perhaps it’s on the roadmap for the compatibility layer. For now, I guess we have to be patient.


Hi @lmk

The OpenAI-compatible API does not provide explicit context caching. However, caching is automatically applied for prompts that are 1024 tokens or longer. If you need explicit context caching, consider using the Gemini API directly.
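For reference, here is a minimal sketch of explicit caching through the Gemini API using the google-genai Python SDK. The model name, TTL, and document content are placeholders, and the cached prefix must meet the documented minimum token count:

```python
from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_GEMINI_API_KEY")

# Create the cache once; subsequent calls that reference it reuse the
# cached prefix until the TTL expires.
cache = client.caches.create(
    model="gemini-1.5-flash-001",  # placeholder model name
    config=types.CreateCachedContentConfig(
        system_instruction="Answer questions about the attached document.",
        contents=["<large document text, above the minimum token count>"],
        ttl="300s",  # keep the cache for 5 minutes
    ),
)

# Later requests reference the cache instead of resending the prefix.
response = client.models.generate_content(
    model="gemini-1.5-flash-001",
    contents="Summarize the introduction.",
    config=types.GenerateContentConfig(cached_content=cache.name),
)
print(response.text)
```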

Thanks

Apologies, I’m a bit confused. It sounds like you’re saying that context caching is automatically applied for prompts of 1024 tokens or longer sent to the OpenAI-compatible API for Gemini models?

Is there any documentation about this, and what is the price reduction? And is this completely separate from explicit context caching?

As far as I can see from the docs, context caching is only supported for prompts longer than 32k tokens.
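In case it helps, this is how I’d try to verify it: a rough sketch, assuming the compatibility endpoint mirrors OpenAI’s usage fields (I haven’t confirmed that it does):

```python
from openai import OpenAI

# Gemini's documented OpenAI-compatible endpoint.
client = OpenAI(
    api_key="GEMINI_API_KEY",
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/",
)

completion = client.chat.completions.create(
    model="gemini-1.5-flash",  # placeholder model name
    messages=[{"role": "user", "content": "<a prompt of 1024+ tokens>"}],
)

# On OpenAI's own API, cached prompt tokens are reported here; whether
# the compatibility layer populates this field is exactly my question.
details = completion.usage.prompt_tokens_details
print(details.cached_tokens if details else "no prompt_tokens_details returned")
```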