I am using the following request body:
{'cachedContent': 'projects/*******/locations/us-central1/cachedContents/******',
 'contents': [{'parts': [{'text': '.'}], 'role': 'user'}],
 'generationConfig': {'candidateCount': 1,
                      'maxOutputTokens': 65534,
                      'temperature': 0,
                      'topP': 0.95},
 'safetySettings': [{'category': 'HARM_CATEGORY_SEXUALLY_EXPLICIT', 'threshold': 'OFF'},
                    {'category': 'HARM_CATEGORY_HATE_SPEECH', 'threshold': 'OFF'},
                    {'category': 'HARM_CATEGORY_HARASSMENT', 'threshold': 'OFF'},
                    {'category': 'HARM_CATEGORY_DANGEROUS_CONTENT', 'threshold': 'OFF'}]}
But I get this error:
Bad Request: {"error": {"code": 400, "message": "Model gemini-2.5-flash-001 does not support cached content with batch prediction.", "status": "INVALID_ARGUMENT"}}
Hi @Shreyansh_Bardia,
Batch prediction does not support explicit caching. Batch prediction is optimized for high-throughput, asynchronous processing of a large number of prompts at a reduced cost (see "Why use batch prediction?" in the documentation), while explicit context caching is designed for real-time or near-real-time scenarios where you manually manage and reuse a large context across multiple individual requests to reduce latency and cost.
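For comparison, here is a minimal sketch of how explicit caching is typically used with individual online requests (assuming the google-genai SDK; the project, location, model ID, TTL, and prompt text below are placeholders, not your exact setup):

# Minimal sketch of explicit context caching with individual online requests.
from google import genai
from google.genai import types

# Placeholder project/location values.
client = genai.Client(vertexai=True, project="your-project", location="us-central1")

# Store the large shared context once in a cache.
cache = client.caches.create(
    model="gemini-2.5-flash",
    config=types.CreateCachedContentConfig(
        contents=[
            types.Content(role="user", parts=[types.Part.from_text(text="<large shared context>")])
        ],
        ttl="3600s",
    ),
)

# Reuse the cache across separate (non-batch) requests.
response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents="First input against the cached context",
    config=types.GenerateContentConfig(cached_content=cache.name),
)
print(response.text)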
Okay, but it would be great if caching were supported in batch prediction as well. We have a prompt of around 10k tokens that we would like to run against different inputs, and caching would help a lot in such scenarios.
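To illustrate the redundancy (a hypothetical sketch; the file name, per-request inputs, and exact batch JSONL fields are placeholders): without caching, every line of the batch input has to carry the same ~10k-token prompt verbatim.

import json

# Hypothetical sketch: the shared ~10k-token prompt is repeated in every request line.
shared_prompt = "<~10k-token shared prompt>"
inputs = ["input A", "input B", "input C"]  # placeholder per-request inputs

with open("batch_requests.jsonl", "w") as f:
    for item in inputs:
        line = {
            "request": {
                "contents": [
                    {"role": "user", "parts": [{"text": f"{shared_prompt}\n\n{item}"}]}
                ],
                "generationConfig": {"temperature": 0, "topP": 0.95},
            }
        }
        f.write(json.dumps(line) + "\n")

If cached content were supported in batch prediction, each line would only need the cachedContent reference plus the per-request input.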
Sure @Shreyansh_Bardia,
I will raise this as a feature request with the concerned team. Thanks for sharing your use case as an example. If you would like to elaborate on it, with an estimate of the cost savings you would expect from explicit context caching in batch prediction, that would help the team.