I am unable to use Gemini’s cached context functionality. While I can successfully create caches, any attempt to use them results in a 500 Internal Server Error.
What Works
- File upload
- Context cache creation
- Non-cached model operations
What Doesn’t Work
model = genai.GenerativeModel.from_cached_content(cache)  # works
response = model.generate_content("Please analyze this content and identify the main topics discussed.")  # fails with a 500 error
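For anyone trying to reproduce this, here is a minimal sketch of the full flow. The file name and API-key setup are illustrative; everything up to the final call succeeds.

import google.generativeai as genai
from google.generativeai import caching
from datetime import timedelta

genai.configure(api_key="...")  # or rely on the GOOGLE_API_KEY environment variable

batch_file = genai.upload_file("batch_1.json")             # works
cache = caching.CachedContent.create(
    model="models/gemini-1.5-flash-001",
    system_instruction="You are a JSON analysis system...",
    contents=[batch_file],
    ttl=timedelta(minutes=5),
)                                                          # works
model = genai.GenerativeModel.from_cached_content(cache)   # works
response = model.generate_content("Summarize the file.")   # 500 happens here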
Timeline of Attempts
- Initial Implementation
  - Successfully uploaded files
  - Successfully created the cache
  - Failed with a 500 error when trying to use the cache
- First Round of Changes
  - Added cache cleanup before creation
  - Added verification steps for cache existence
  - Added delays before cache usage
  - Still got the 500 error
- Model Variations
- Tried
gemini-1.5-flash-001
(cached) - 500 error - Tried
gemini-1.5-flash-002
(cached) - 500 error - Tried
gemini-1.5-flash-001
(non-cached) - worked
- Cache Usage Attempts
# Attempt 1: Direct cache object
model = genai.GenerativeModel.from_cached_content(cached_content=cache)
# Attempt 2: Cache name
model = genai.GenerativeModel.from_cached_content(cached_content=cache.name)
Both resulted in 500 errors.
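One thing I haven't tried yet is bypassing the SDK's gRPC path and calling the REST endpoint directly; as I understand the v1beta API, generateContent accepts a cachedContent field carrying the cache's resource name. A sketch (cache name taken from the metadata below):

import os
import requests

url = (
    "https://generativelanguage.googleapis.com/v1beta/"
    "models/gemini-1.5-flash-001:generateContent"
    f"?key={os.environ['GOOGLE_API_KEY']}"
)
body = {
    "contents": [{"role": "user", "parts": [{"text": "Summarize the file."}]}],
    "cachedContent": "cachedContents/cfygud61l9cz",  # must match the cache's model
}
resp = requests.post(url, json=body, timeout=60)
print(resp.status_code, resp.text)  # does the 500 reproduce outside gRPC?

That should at least tell whether the failure sits in the SDK/gRPC layer or in the service itself.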
Key Code Snippets
- Cache Creation (Works)
from datetime import timedelta
from google.generativeai import caching

# batch_file is the previously uploaded File object; ttl_minutes comes from config
cache = caching.CachedContent.create(
    model='models/gemini-1.5-flash-001',
    display_name=batch_file.display_name,
    system_instruction='You are a JSON analysis system...',
    contents=[batch_file],
    ttl=timedelta(minutes=ttl_minutes),
)
- Cache Verification (Works)
async def verify_cache_state(cache_name: str) -> bool:
    """Return True if the named cache appears in the cache listing."""
    logger.info(f"Verifying cache state for: {cache_name}")
    all_caches = list_cached_contents()
    for existing_cache in all_caches:
        # Compare against the fully qualified resource name, e.g. "cachedContents/..."
        if hasattr(existing_cache, 'name') and existing_cache.name == cache_name:
            logger.info(f"Cache found and accessible: {cache_name}")
            return True
    return False
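Note that list_cached_contents() above is my own helper, not an SDK function; it is essentially a thin wrapper like this (assuming the SDK's caching.CachedContent.list()):

def list_cached_contents():
    # Materialize the SDK's cache listing so it can be iterated repeatedly
    return list(caching.CachedContent.list())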
Relevant Error Messages
ERROR - Simple prompt test failed with cache_object: 500 An internal error has occurred. Please retry or report in https://developers.generativeai.google/guide/troubleshooting
ERROR - Error details: {"message": "An internal error has occurred. Please retry or report in https://developers.generativeai.google/guide/troubleshooting", "_errors": ["<_InactiveRpcError of RPC that terminated with:\n\tstatus = StatusCode.INTERNAL\n\tdetails = \"An internal error has occurred. Please retry or report in https://developers.generativeai.google/guide/troubleshooting\"\n\tdebug_error_string = \"UNKNOWN:Error received from peer ipv6:%5B2607:f8b0:4023:1000::5f%5D:443 {created_time:\"2024-12-25T00:21:11.2239523+00:00\", grpc_status:13, grpc_message:\"An internal error has occurred. Please retry or report in https://developers.generativeai.google/guide/troubleshooting\"}\"\n>"], "_details": [], "_response": "<_InactiveRpcError of RPC that terminated with:\n\tstatus = StatusCode.INTERNAL\n\tdetails = \"An internal error has occurred. Please retry or report in https://developers.generativeai.google/guide/troubleshooting\"\n\tdebug_error_string = \"UNKNOWN:Error received from peer ipv6:%5B2607:f8b0:4023:1000::5f%5D:443 {created_time:\"2024-12-25T00:21:11.2239523+00:00\", grpc_status:13, grpc_message:\"An internal error has occurred. Please retry or report in https://developers.generativeai.google/guide/troubleshooting\"}\"\n>", "_error_info": null}
Cache Metadata Example
{
  "name": "cachedContents/cfygud61l9cz",
  "display_name": "batch_1.json",
  "model": "models/gemini-1.5-flash-001",
  "usage_metadata": {
    "prompt_token_count": 0,
    "cached_content_token_count": 0,
    "candidates_token_count": 0,
    "total_token_count": 56446
  },
  "create_time": "2024-12-24T23:48:48.788319+00:00",
  "expire_time": "2024-12-24T23:53:48.404108+00:00"
}
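Since expire_time above is only five minutes after create_time, expiry at request time seems worth ruling out too. A sketch, assuming the SDK's CachedContent.get(), expire_time, and update(ttl=...) members:

from datetime import datetime, timedelta, timezone

# Re-fetch the cache by resource name and check how long it has left to live
cache = caching.CachedContent.get(name="cachedContents/cfygud61l9cz")
remaining = cache.expire_time - datetime.now(timezone.utc)
print(f"Cache expires in {remaining}; usage: {cache.usage_metadata}")
if remaining < timedelta(minutes=2):
    cache.update(ttl=timedelta(minutes=30))  # extend before making the call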
Key Observations
- Cache creation is successful and verifiable
- Cache appears in list_cached_contents()
- Cache has correct token count (56446)
- Error occurs specifically when trying to use the cache with a model
- Adding delays (2-5 seconds) after cache creation didn't help (see the retry sketch after this list)
- Error persists across multiple cache instances
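For reference, the delay/retry attempts looked roughly like this sketch (the helper and its parameters are illustrative, not my exact code); the 500 survived every retry:

import asyncio
from google.api_core import exceptions as gexc

async def generate_with_retry(model, prompt, attempts=4, base_delay=2.0):
    for attempt in range(attempts):
        try:
            return await model.generate_content_async(prompt)
        except gexc.InternalServerError:  # the 500 surfaces as gRPC INTERNAL
            if attempt == attempts - 1:
                raise
            await asyncio.sleep(base_delay * 2 ** attempt)  # 2s, 4s, 8s, ...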
Environment Details
- OS: Windows 11
- Python Google AI SDK version: latest (import google.generativeai as genai; from google.generativeai import caching, types)
- Using async/await patterns throughout
- Full error logging enabled
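To pin "latest" down to a concrete version for this report:

import google.generativeai as genai
print(genai.__version__)  # exact SDK version instead of "Latest"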
Additional Notes
- The cached content is a JSON file.
I don’t know what else to provide. Please help!