Any attempt to use cached context results in a 500 Internal Server Error

I am unable to use Gemini’s cached context functionality. While I can successfully create caches, any attempt to use them results in a 500 Internal Server Error.

What Works

  1. File upload
  2. Context cache creation
  3. Non-cached model operations

What Doesn’t Work

model = genai.GenerativeModel.from_cached_content(cache)  # works
response = model.generate_content("Please analyze this content and identify the main topics discussed.")  # raises a 500 error

Timeline of Attempts

  1. Initial Implementation
  • Successfully uploaded files
  • Successfully created cache
  • Failed with 500 error when trying to use cache
  2. First Round of Changes
  • Added cache cleanup before creation
  • Added verification steps for cache existence
  • Added delays before cache usage
  • Still got 500 error
  3. Model Variations
  • Tried gemini-1.5-flash-001 (cached) - 500 error
  • Tried gemini-1.5-flash-002 (cached) - 500 error
  • Tried gemini-1.5-flash-001 (non-cached) - worked
  4. Cache Usage Attempts
# Attempt 1: Direct cache object
model = genai.GenerativeModel.from_cached_content(cached_content=cache)

# Attempt 2: Cache name
model = genai.GenerativeModel.from_cached_content(cached_content=cache.name)

Both resulted in 500 errors.
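The delays and retries mentioned above can be sketched like this. retry_on_internal_error is a hypothetical helper name; in the actual code the exception caught was the SDK's 500-style InternalServerError rather than a bare Exception, and the delay was 2-5 seconds:

```python
import time


def retry_on_internal_error(fn, attempts=3, delay_seconds=2.0):
    """Call fn, retrying with a fixed delay between attempts.

    Hypothetical sketch: the real code caught the SDK's InternalServerError
    instead of a bare Exception.
    """
    last_exc = None
    for attempt in range(attempts):
        try:
            return fn()
        except Exception as exc:
            last_exc = exc
            if attempt < attempts - 1:
                time.sleep(delay_seconds)
    raise last_exc
```

With the cache-backed model this never succeeded: every attempt returned the same 500, which is what ruled out a simple propagation delay.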

Key Code Snippets

  1. Cache Creation (Works)
cache = caching.CachedContent.create(
    model='models/gemini-1.5-flash-001',
    display_name=batch_file.display_name,
    system_instruction='You are a JSON analysis system...',
    contents=[batch_file],
    ttl=timedelta(minutes=ttl_minutes),
)
  2. Cache Verification (Works)
async def verify_cache_state(cache_name: str) -> bool:
    logger.info(f"Verifying cache state for: {cache_name}")
    all_caches = list_cached_contents()
    for existing_cache in all_caches:
        if hasattr(existing_cache, 'name') and existing_cache.name == cache_name:
            logger.info(f"Cache found and accessible: {cache_name}")
            return True
    return False

Relevant Error Messages

ERROR - Simple prompt test failed with cache_object: 500 An internal error has occurred. Please retry or report in https://developers.generativeai.google/guide/troubleshooting
ERROR - Error details: {"message": "An internal error has occurred. Please retry or report in https://developers.generativeai.google/guide/troubleshooting", "_errors": ["<_InactiveRpcError of RPC that terminated with:\n\tstatus = StatusCode.INTERNAL\n\tdetails = \"An internal error has occurred. Please retry or report in https://developers.generativeai.google/guide/troubleshooting\"\n\tdebug_error_string = \"UNKNOWN:Error received from peer ipv6:%5B2607:f8b0:4023:1000::5f%5D:443 {created_time:\"2024-12-25T00:21:11.2239523+00:00\", grpc_status:13, grpc_message:\"An internal error has occurred. Please retry or report in https://developers.generativeai.google/guide/troubleshooting\"}\"\n>"], "_details": [], "_response": "<_InactiveRpcError of RPC that terminated with:\n\tstatus = StatusCode.INTERNAL\n\tdetails = \"An internal error has occurred. Please retry or report in https://developers.generativeai.google/guide/troubleshooting\"\n\tdebug_error_string = \"UNKNOWN:Error received from peer ipv6:%5B2607:f8b0:4023:1000::5f%5D:443 {created_time:\"2024-12-25T00:21:11.2239523+00:00\", grpc_status:13, grpc_message:\"An internal error has occurred. Please retry or report in https://developers.generativeai.google/guide/troubleshooting\"}\"\n>", "_error_info": null}

Cache Metadata Example

{
  "name": "cachedContents/cfygud61l9cz",
  "display_name": "batch_1.json",
  "model": "models/gemini-1.5-flash-001",
  "usage_metadata": {
    "prompt_token_count": 0,
    "cached_content_token_count": 0,
    "candidates_token_count": 0,
    "total_token_count": 56446
  },
  "create_time": "2024-12-24T23:48:48.788319+00:00",
  "expire_time": "2024-12-24T23:53:48.404108+00:00"
}
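As a sanity check, the create_time/expire_time pair in the metadata is consistent with the ttl=timedelta(minutes=ttl_minutes) argument passed at creation (assuming ttl_minutes was 5 for this run), so the cache itself looks healthy:

```python
from datetime import datetime

# Timestamps copied from the cache metadata above.
create_time = datetime.fromisoformat("2024-12-24T23:48:48.788319+00:00")
expire_time = datetime.fromisoformat("2024-12-24T23:53:48.404108+00:00")

# The server measures the TTL from its own clock, so the window is a few
# hundred milliseconds short of exactly five minutes.
ttl_seconds = (expire_time - create_time).total_seconds()
print(round(ttl_seconds))  # 300
```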

Key Observations

  1. Cache creation is successful and verifiable
  2. Cache appears in list_cached_contents()
  3. Cache has correct token count (56446)
  4. Error occurs specifically when trying to use the cache with a model
  5. Added delays (2-5 seconds) after cache creation didn’t help
  6. Error persists across multiple cache instances

Environment Details

  • OS: Windows 11
  • Python Google AI SDK version: latest google-generativeai (import google.generativeai as genai; from google.generativeai import caching, types)
  • Using async/await patterns throughout
  • Full error logging enabled

Additional Notes

  1. The cached content is a JSON file.

I don’t know what else to provide. Please help.

Right after posting this, I had the idea to convert the JSON to TXT before caching. The issue is 100% the caching of JSON files: generating content from the cached TXT file works just fine.
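The workaround can be sketched like this (json_to_txt is a hypothetical helper name; the point is only that re-serializing the same data under a .txt extension, presumably so the upload is no longer tagged as application/json, makes the cached-content path work):

```python
import json
from pathlib import Path


def json_to_txt(json_path: str) -> str:
    """Re-serialize a JSON file under a .txt extension before uploading.

    Hypothetical sketch: same data, different extension, then the resulting
    file is passed to genai.upload_file and cached as before.
    """
    src = Path(json_path)
    dst = src.with_suffix(".txt")
    data = json.loads(src.read_text(encoding="utf-8"))
    dst.write_text(json.dumps(data, indent=2), encoding="utf-8")
    return str(dst)
```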