I am unable to use Gemini’s cached context functionality. While I can successfully create caches, any attempt to use them results in a 500 Internal Server Error.
What Works
- File upload
- Context cache creation
- Non-cached model operations
What Doesn’t Work
model = genai.GenerativeModel.from_cached_content(cache)  # works
response = model.generate_content("Please analyze this content and identify the main topics discussed.")  # fails with a 500 error
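For anyone trying to reproduce this, here is a minimal sketch of the full flow. The file name and API-key setup are illustrative; everything up to the final call succeeds.

import google.generativeai as genai
from google.generativeai import caching
from datetime import timedelta

genai.configure(api_key="...")  # or rely on the GOOGLE_API_KEY environment variable

batch_file = genai.upload_file("batch_1.json")             # works
cache = caching.CachedContent.create(
    model="models/gemini-1.5-flash-001",
    system_instruction="You are a JSON analysis system...",
    contents=[batch_file],
    ttl=timedelta(minutes=5),
)                                                          # works
model = genai.GenerativeModel.from_cached_content(cache)   # works
response = model.generate_content("Summarize the file.")   # 500 happens here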
Timeline of Attempts
- Initial Implementation
  - Successfully uploaded files
  - Successfully created the cache
  - Failed with a 500 error when trying to use the cache
- First Round of Changes
  - Added cache cleanup before creation
  - Added verification steps for cache existence
  - Added delays before cache usage
  - Still got the 500 error
- Model Variations
- Tried
gemini-1.5-flash-001
(cached) - 500 error - Tried
gemini-1.5-flash-002
(cached) - 500 error - Tried
gemini-1.5-flash-001
(non-cached) - worked
- Cache Usage Attempts
# Attempt 1: Direct cache object
model = genai.GenerativeModel.from_cached_content(cached_content=cache)
# Attempt 2: Cache name
model = genai.GenerativeModel.from_cached_content(cached_content=cache.name)
Both resulted in 500 errors.
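One thing I haven't tried yet is bypassing the SDK's gRPC path and calling the REST endpoint directly; as I understand the v1beta API, generateContent accepts a cachedContent field carrying the cache's resource name. A sketch (cache name taken from the metadata below):

import os
import requests

url = (
    "https://generativelanguage.googleapis.com/v1beta/"
    "models/gemini-1.5-flash-001:generateContent"
    f"?key={os.environ['GOOGLE_API_KEY']}"
)
body = {
    "contents": [{"role": "user", "parts": [{"text": "Summarize the file."}]}],
    "cachedContent": "cachedContents/cfygud61l9cz",  # must match the cache's model
}
resp = requests.post(url, json=body, timeout=60)
print(resp.status_code, resp.text)  # does the 500 reproduce outside gRPC?

That should at least tell whether the failure sits in the SDK/gRPC layer or in the service itself.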
Key Code Snippets
- Cache Creation (Works)
from datetime import timedelta
from google.generativeai import caching

# batch_file is the previously uploaded File object; ttl_minutes comes from config
cache = caching.CachedContent.create(
    model='models/gemini-1.5-flash-001',
    display_name=batch_file.display_name,
    system_instruction='You are a JSON analysis system...',
    contents=[batch_file],
    ttl=timedelta(minutes=ttl_minutes),
)
- Cache Verification (Works)
async def verify_cache_state(cache_name: str) -> bool:
    """Return True if the named cache appears in the cache listing."""
    logger.info(f"Verifying cache state for: {cache_name}")
    all_caches = list_cached_contents()
    for existing_cache in all_caches:
        # Compare against the fully qualified resource name, e.g. "cachedContents/..."
        if hasattr(existing_cache, 'name') and existing_cache.name == cache_name:
            logger.info(f"Cache found and accessible: {cache_name}")
            return True
    return False
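Note that list_cached_contents() above is my own helper, not an SDK function; it is essentially a thin wrapper like this (assuming the SDK's caching.CachedContent.list()):

def list_cached_contents():
    # Materialize the SDK's cache listing so it can be iterated repeatedly
    return list(caching.CachedContent.list())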
Relevant Error Messages
ERROR - Simple prompt test failed with cache_object: 500 An internal error has occurred. Please retry or report in https://developers.generativeai.google/guide/troubleshooting
ERROR - Error details: {"message": "An internal error has occurred. Please retry or report in https://developers.generativeai.google/guide/troubleshooting", "_errors": ["<_InactiveRpcError of RPC that terminated with:\n\tstatus = StatusCode.INTERNAL\n\tdetails = \"An internal error has occurred. Please retry or report in https://developers.generativeai.google/guide/troubleshooting\"\n\tdebug_error_string = \"UNKNOWN:Error received from peer ipv6:%5B2607:f8b0:4023:1000::5f%5D:443 {created_time:\"2024-12-25T00:21:11.2239523+00:00\", grpc_status:13, grpc_message:\"An internal error has occurred. Please retry or report in https://developers.generativeai.google/guide/troubleshooting\"}\"\n>"], "_details": [], "_response": "<_InactiveRpcError of RPC that terminated with:\n\tstatus = StatusCode.INTERNAL\n\tdetails = \"An internal error has occurred. Please retry or report in https://developers.generativeai.google/guide/troubleshooting\"\n\tdebug_error_string = \"UNKNOWN:Error received from peer ipv6:%5B2607:f8b0:4023:1000::5f%5D:443 {created_time:\"2024-12-25T00:21:11.2239523+00:00\", grpc_status:13, grpc_message:\"An internal error has occurred. Please retry or report in https://developers.generativeai.google/guide/troubleshooting\"}\"\n>", "_error_info": null}
Cache Metadata Example
{
  "name": "cachedContents/cfygud61l9cz",
  "display_name": "batch_1.json",
  "model": "models/gemini-1.5-flash-001",
  "usage_metadata": {
    "prompt_token_count": 0,
    "cached_content_token_count": 0,
    "candidates_token_count": 0,
    "total_token_count": 56446
  },
  "create_time": "2024-12-24T23:48:48.788319+00:00",
  "expire_time": "2024-12-24T23:53:48.404108+00:00"
}
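Since expire_time above is only five minutes after create_time, expiry at request time seems worth ruling out too. A sketch, assuming the SDK's CachedContent.get(), expire_time, and update(ttl=...) members:

from datetime import datetime, timedelta, timezone

# Re-fetch the cache by resource name and check how long it has left to live
cache = caching.CachedContent.get(name="cachedContents/cfygud61l9cz")
remaining = cache.expire_time - datetime.now(timezone.utc)
print(f"Cache expires in {remaining}; usage: {cache.usage_metadata}")
if remaining < timedelta(minutes=2):
    cache.update(ttl=timedelta(minutes=30))  # extend before making the call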
Key Observations
- Cache creation is successful and verifiable
- Cache appears in list_cached_contents()
- Cache has correct token count (56446)
- Error occurs specifically when trying to use the cache with a model
- Adding delays (2-5 seconds) after cache creation didn't help (see the retry sketch after this list)
- Error persists across multiple cache instances
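For reference, the delay/retry attempts looked roughly like this sketch (the helper and its parameters are illustrative, not my exact code); the 500 survived every retry:

import asyncio
from google.api_core import exceptions as gexc

async def generate_with_retry(model, prompt, attempts=4, base_delay=2.0):
    for attempt in range(attempts):
        try:
            return await model.generate_content_async(prompt)
        except gexc.InternalServerError:  # the 500 surfaces as gRPC INTERNAL
            if attempt == attempts - 1:
                raise
            await asyncio.sleep(base_delay * 2 ** attempt)  # 2s, 4s, 8s, ...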
Environment Details
- OS: Windows 11
- Python Google AI SDK version: latest (import google.generativeai as genai; from google.generativeai import caching, types)
- Using async/await patterns throughout
- Full error logging enabled
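To pin "latest" down to a concrete version for this report:

import google.generativeai as genai
print(genai.__version__)  # exact SDK version instead of "Latest"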
Additional Notes
- The cached content is a JSON file.
I don’t know what else to provide. Please help!