Slow Context Cache Creation with Gemini 2.5 Pro: Looking for Optimization Methods

Hello Gemini API community,

I’m experiencing significant delays when creating context caches with Gemini 2.5 Pro. Each context cache creation consistently takes multiple seconds to complete, which is becoming a bottleneck in my application workflow.

I’ve tried several approaches to optimize this process:

  1. File Upload Method: Uploading files (text documents and images) and then creating a context cache from those files. Each file upload takes about 2-3 seconds and the context cache creation adds another 2-3 seconds, resulting in a total processing time of 10+ seconds.

  2. System Instructions Method: Placing the content directly in the system instructions and creating the context cache from that. This method also takes 10+ seconds to complete.

Both approaches seem to have similar performance characteristics, with context cache creation consistently taking multiple seconds regardless of the input method.

Questions:

  • Is this expected behavior for context cache creation with Gemini 2.5 Pro?
  • Are there recommended techniques to reduce the context cache creation time?
  • Are there any architectural patterns (like pre-generating caches during off-peak hours) that others have successfully implemented?
  • Has anyone found specific payload formats or sizes that optimize the cache creation process?

Any insights, optimization techniques, or experiences with making context cache creation more efficient would be greatly appreciated!

Thank you!

Hi @Ethan_G,
Yes, context cache creation with Gemini 2.5 Pro currently takes a few seconds. This is expected because of backend processing. To reduce the impact, some users pre-generate caches during off-peak hours or batch content strategically. No known payload format drastically speeds it up yet.
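To make the pre-generation idea concrete, here is a minimal sketch of a small registry that creates each cache once and reuses it until it expires. It is only an illustration under some assumptions: `CacheRegistry`, `prewarm`, and `create_cache` are hypothetical names, and `create_cache` stands in for whatever SDK call you use to create a context cache and return its name. The TTL you pass should match the TTL you set on the cache itself.

```python
import hashlib
import time
from typing import Callable, Dict, Iterable, Tuple


class CacheRegistry:
    """Reuse context caches keyed by a hash of their content (illustrative only)."""

    def __init__(
        self,
        create_cache: Callable[[str], str],  # hypothetical: content -> cache name
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._create_cache = create_cache
        self._ttl = ttl_seconds
        self._clock = clock
        # content hash -> (cache name, local expiry time)
        self._entries: Dict[str, Tuple[str, float]] = {}

    @staticmethod
    def _key(content: str) -> str:
        return hashlib.sha256(content.encode("utf-8")).hexdigest()

    def get_or_create(self, content: str) -> str:
        """Return a live cache name, paying the creation cost only on a miss."""
        key = self._key(content)
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[1] > now:
            return entry[0]
        name = self._create_cache(content)
        # Expiry is measured from before the call, so the local view
        # expires slightly earlier than the real cache does.
        self._entries[key] = (name, now + self._ttl)
        return name

    def prewarm(self, contents: Iterable[str]) -> None:
        """Create caches ahead of time, e.g. from an off-peak scheduled job."""
        for content in contents:
            self.get_or_create(content)
```

Calling `prewarm` from a scheduled job moves the multi-second creation out of the request path, so requests that call `get_or_create` with the same content get the existing cache name back immediately.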
Thanks!

Hi @Ethan_G, welcome to the forum.

I just tested this with gemini-2.5-pro and context cache creation takes around 3–4 seconds on my end. Definitely not hitting 10+ seconds. How large are your files? Are you uploading multiple at once or using big payloads? Might help to share more details so we can try to reproduce it.
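If it helps with sharing details, here is a rough sketch for timing each stage separately so we can see whether the uploads or the cache creation dominate. The names `timed` and `format_report` are just illustrative helpers, not part of any SDK, and the stage labels are whatever you choose.

```python
import time
from contextlib import contextmanager
from typing import Dict, Iterator


@contextmanager
def timed(stage: str, sink: Dict[str, float]) -> Iterator[None]:
    """Add the wall-clock duration of the enclosed block to sink[stage]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        # Accumulate, so several uploads under one label are summed.
        sink[stage] = sink.get(stage, 0.0) + (time.perf_counter() - start)


def format_report(sink: Dict[str, float]) -> str:
    """Render one line per stage plus a total, in seconds."""
    lines = [f"{stage}: {seconds:.2f}s" for stage, seconds in sink.items()]
    lines.append(f"total: {sum(sink.values()):.2f}s")
    return "\n".join(lines)


if __name__ == "__main__":
    timings: Dict[str, float] = {}
    # Stand-ins for the real calls: replace the sleeps with your
    # file upload and cache creation code.
    for _ in range(2):
        with timed("upload", timings):
            time.sleep(0.1)
    with timed("cache_create", timings):
        time.sleep(0.1)
    print(format_report(timings))
```

Posting the per-stage numbers along with file sizes and file counts would make it much easier to reproduce.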

Thanks