Hello Gemini API community,
I’m experiencing significant delays when creating context caches with Gemini 2.5 Pro. Each context cache creation consistently takes multiple seconds to complete, which is becoming a bottleneck in my application workflow.
I’ve tried several approaches to optimize this process:
- File Upload Method: Uploading files (text documents and images) and then creating a context cache from them. Each file upload takes about 2-3 seconds and the context cache creation adds another 2-3 seconds, so with several files the total processing time comes to 10+ seconds.
- System Instructions Method: Placing content directly in system instructions and creating a context cache from there. However, this method also requires 10+ seconds to complete.
Both approaches seem to have similar performance characteristics, with context cache creation consistently taking multiple seconds regardless of the input method.
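To make the numbers above easier to compare, here is a minimal sketch of how per-step timings like these could be collected. It is not tied to any SDK: `upload_file` and `create_cache` are hypothetical callables standing in for whatever upload and cache-creation calls are actually used, and the function and key names are my own.

```python
import time
from typing import Any, Callable, Dict, List, Tuple


def timed(label: str, fn: Callable[..., Any], *args: Any, **kwargs: Any) -> Tuple[Any, float]:
    """Run fn and return (result, elapsed_seconds), printing the timing."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    print(f"{label}: {elapsed:.2f}s")
    return result, elapsed


def measure_cache_workflow(
    paths: List[str],
    upload_file: Callable[[str], Any],         # hypothetical: uploads one file
    create_cache: Callable[[List[Any]], Any],  # hypothetical: creates the cache
) -> Dict[str, float]:
    """Time each upload and the final cache creation separately."""
    timings: Dict[str, float] = {}
    uploaded = []
    for path in paths:
        handle, elapsed = timed(f"upload {path}", upload_file, path)
        uploaded.append(handle)
        timings[f"upload:{path}"] = elapsed
    _, elapsed = timed("create cache", create_cache, uploaded)
    timings["create_cache"] = elapsed
    # Sum of all steps recorded so far (uploads plus cache creation).
    timings["total"] = sum(timings.values())
    return timings
```

Separating the steps this way shows whether the 10+ seconds is dominated by sequential uploads or by the cache creation call itself.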
Questions:
- Is this expected behavior for context cache creation with Gemini 2.5 Pro?
- Are there recommended techniques to reduce the context cache creation time?
- Are there any architectural patterns (like pre-generating caches during off-peak hours) that others have successfully implemented?
- Has anyone found specific payload formats or sizes that optimize the cache creation process?
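To make the pre-generation idea concrete, this is roughly the pattern I have in mind: a small registry that creates caches ahead of time and reuses them by content fingerprint, so the creation cost is paid outside the request path. It is only a sketch under assumptions: `CacheRegistry`, `prewarm`, and the `create_cache` callable are hypothetical names, and `ttl_seconds` is assumed to be no longer than the TTL the cache was actually created with.

```python
import hashlib
import time
from typing import Callable, Dict, Iterable, Optional, Tuple


class CacheRegistry:
    """Maps a content fingerprint to the name of an already-created cache."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        # fingerprint -> (cache name, local expiry timestamp)
        self._entries: Dict[str, Tuple[str, float]] = {}

    @staticmethod
    def fingerprint(content: str) -> str:
        return hashlib.sha256(content.encode("utf-8")).hexdigest()

    def lookup(self, content: str) -> Optional[str]:
        """Return the cache name if one exists and has not expired locally."""
        entry = self._entries.get(self.fingerprint(content))
        if entry is None:
            return None
        name, expires_at = entry
        if self._clock() >= expires_at:
            return None
        return name

    def get_or_create(self, content: str, create_cache: Callable[[str], str]) -> str:
        """Reuse a live cache, or call create_cache (hypothetical) once."""
        name = self.lookup(content)
        if name is None:
            name = create_cache(content)
            self._entries[self.fingerprint(content)] = (name, self._clock() + self._ttl)
        return name

    def prewarm(self, contents: Iterable[str], create_cache: Callable[[str], str]) -> None:
        """Create caches ahead of time, e.g. from an off-peak scheduled job."""
        for content in contents:
            self.get_or_create(content, create_cache)
```

A scheduled job would call `prewarm` with the known documents, and request handlers would call `get_or_create`, which only pays the creation cost on a miss or after expiry.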
Any insights, optimization techniques, or experiences with making context cache creation more efficient would be greatly appreciated!
Thank you!