Context caching - batch api requests

Does context caching work with Batch API requests? I have it working perfectly in my online requests.
However, when I try to do the same using a JSONL file uploaded to GCS, Gemini does not use the context I provide. Am I doing something wrong, or is it just not enabled yet?
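
For reference, this is roughly how I create and use the cache in online mode; project, location, model name and the context text are placeholders, simplified from my real code:

    from google import genai
    from google.genai.types import (
        Content,
        CreateCachedContentConfig,
        GenerateContentConfig,
        Part,
    )

    # Placeholders: substitute a real project/location/model and a context
    # large enough to meet the minimum cacheable token count
    client = genai.Client(vertexai=True, project="my-project", location="europe-west4")
    LONG_CONTEXT = "..."  # the big shared context I want to cache

    # Create the cache once and keep its resource name
    cache = client.caches.create(
        model="gemini-2.0-flash-001",
        config=CreateCachedContentConfig(
            contents=[Content(role="user", parts=[Part(text=LONG_CONTEXT)])],
            ttl="3600s",
        ),
    )

    # Online request referencing the cache: this works as expected
    response = client.models.generate_content(
        model="gemini-2.0-flash-001",
        contents="My prompt here",
        config=GenerateContentConfig(cached_content=cache.name),
    )
    print(response.text)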

I add the cache to the config like this:

    jsonl_data = []
    for prompt, cache in zip(list_prompts, list_cache):
        config.gemini_config.cached_content = cache.name if cache else None
        jsonl_entry = {
            "key": str(uuid.uuid4()),
            "request": {
                "contents": [
                    {
                        "role": "user",
                        "parts": [{"text": prompt}],
                    }
                ],
                "generationConfig": config.gemini_config.to_json_dict(),
                # "system_instruction": {"parts": [{"text": config.system_instruction}]},
            },
        }
        jsonl_data.append(jsonl_entry)

The config is this kind of object:

        self.gemini_config = GenerateContentConfig(
            temperature=self.temperature,
            max_output_tokens=8000,
            thinking_config=ThinkingConfig(
                include_thoughts=False,
                thinking_budget=0
            ),
        )

Thanks a lot

Hi @Kailegh,

Correcting the JSONL Structure

For batch jobs, each line in your JSONL file represents a single request, and cachedContent must be a peer of the contents and generationConfig fields, at the top level of the request object:


{
    "key": "your-unique-key",
    "request": {
        "contents": [
            {
                "role": "user",
                "parts": [
                    {
                        "text": "Your prompt goes here."
                    }
                ]
            }
        ],
        "generationConfig": {
            "temperature": 0.7,
            "max_output_tokens": 8000
        },
        "cachedContent": "cachedContents/your-cache-name"
    }
}

Revised Python Code

To fix this, adjust how you construct the request dictionary in your Python script: instead of setting cached_content on the GenerateContentConfig, add the cachedContent key directly to the request_data dictionary.


import uuid
import json

# Assuming 'list_prompts', 'list_cache', and 'config' are defined
jsonl_data = []
for prompt, cache in zip(list_prompts, list_cache):
    request_data = {
        "contents": [
            {
                "role": "user",
                "parts": [{"text": prompt}],
            }
        ],
        "generationConfig": config.gemini_config.to_json_dict(),
    }
    
    # Add the cachedContent field at the top level of the request if a cache exists
    if cache:
        request_data["cachedContent"] = cache.name

    jsonl_entry = {
        "key": str(uuid.uuid4()),
        "request": request_data,
    }
    
    jsonl_data.append(jsonl_entry)

# To create the JSONL file content:
# jsonl_content = "\n".join(json.dumps(entry) for entry in jsonl_data)
# print(jsonl_content)

By making this change, your batch job should run correctly with cached context for each request.
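
For completeness, here is a rough sketch of uploading the JSONL to Cloud Storage and submitting the batch job with the google-genai SDK; the project, location, bucket and model names are placeholders, so adjust them to your setup:

import json

from google import genai
from google.genai.types import CreateBatchJobConfig
from google.cloud import storage

# Placeholders: substitute your own project, location, bucket and model
PROJECT = "my-project"
LOCATION = "europe-west4"
BUCKET = "my-batch-bucket"
MODEL = "gemini-2.0-flash-001"

# 'jsonl_data' is the list built in the snippet above.
# Write one JSON object per line and upload the file to GCS.
with open("batch_input.jsonl", "w") as f:
    for entry in jsonl_data:
        f.write(json.dumps(entry) + "\n")

storage.Client(project=PROJECT).bucket(BUCKET).blob(
    "input/batch_input.jsonl"
).upload_from_filename("batch_input.jsonl")

# Submit the batch job pointing at the uploaded file
client = genai.Client(vertexai=True, project=PROJECT, location=LOCATION)
job = client.batches.create(
    model=MODEL,
    src=f"gs://{BUCKET}/input/batch_input.jsonl",
    config=CreateBatchJobConfig(dest=f"gs://{BUCKET}/output/"),
)
print(job.name, job.state)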

Thank you for your answer, but are you sure that works?
I am sending requests with jsonl files like this:

{"key": "582a4f0b-b303-41c1-80af-7ae89648dcdc", "request": {"contents": [{"role": "user", "parts": [{"text": "\n \n \n<conversation id=\"25456_687240977_2020-09-25 13-43\" number=\"1\">{REAL TEXTS HERE}\n\n JSON Response:\n "}]}], "generationConfig": {"temperature": 0.0, "max_output_tokens": 8000, "thinking_config": {"include_thoughts": false, "thinking_budget": 0}}, "cachedContent": "projects/334942433169/locations/europe-west4/cachedContents/5751646479966011392"}}

And I am still not getting a proper answer (like the one I get online). The cached content is valid and has been tested in online mode (no batch).

Hi @Krish_Varnakavi1

I am trying the format you mentioned above, but it's still failing. Can you help me with this, please?

{'cachedContent': 'projects/******/locations/us-central1/cachedContents/*******',
 'contents': [{'parts': [{'text': '.'}], 'role': 'user'}],
 'generationConfig': {'candidateCount': 1,
  'maxOutputTokens': 65534,
  'temperature': 0,
  'topP': 0.95},
 'safetySettings': [{'category': 'HARM_CATEGORY_SEXUALLY_EXPLICIT',
   'threshold': 'OFF'},
  {'category': 'HARM_CATEGORY_HATE_SPEECH', 'threshold': 'OFF'},
  {'category': 'HARM_CATEGORY_HARASSMENT', 'threshold': 'OFF'},
  {'category': 'HARM_CATEGORY_DANGEROUS_CONTENT', 'threshold': 'OFF'}]}

Bad Request: {"error": {"code": 400, "message": "Model gemini-2.5-flash-001 does not support cached content with batch prediction.", "status": "INVALID_ARGUMENT"}}

Hello

Welcome to the forum!!

Context caching is not currently supported with the Batch API. You can refer to the Context caching overview and its limitations for more details on these features.