Does Batch API in Vertex AI support caching?

I am using the following format

{
  "cachedContent": "projects/*******/locations/us-central1/cachedContents/******",
  "contents": [{"parts": [{"text": "."}], "role": "user"}],
  "generationConfig": {
    "candidateCount": 1,
    "maxOutputTokens": 65534,
    "temperature": 0,
    "topP": 0.95
  },
  "safetySettings": [
    {"category": "HARM_CATEGORY_SEXUALLY_EXPLICIT", "threshold": "OFF"},
    {"category": "HARM_CATEGORY_HATE_SPEECH", "threshold": "OFF"},
    {"category": "HARM_CATEGORY_HARASSMENT", "threshold": "OFF"},
    {"category": "HARM_CATEGORY_DANGEROUS_CONTENT", "threshold": "OFF"}
  ]
}

But I get this error

Bad Request: {"error": {"code": 400, "message": "Model gemini-2.5-flash-001 does not support cached content with batch prediction.", "status": "INVALID_ARGUMENT"}}

Hi @Shreyansh_Bardia,

Batch prediction does not support explicit caching. Batch prediction is optimized for high-throughput, asynchronous processing of a large number of prompts at a reduced cost (see "Why use batch prediction?" in the documentation), while explicit context caching is designed for real-time or near-real-time scenarios where you want to manually manage and reuse a large context across multiple individual requests to reduce latency and cost.
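For reference, explicit context caching is used with individual online generate_content requests, roughly like this (a minimal sketch with the google-genai SDK; the model name, TTL, and content are illustrative, and a real cache must meet the model's minimum cached-token count):

```python
from google import genai
from google.genai import types

client = genai.Client(vertexai=True, project="my-project", location="us-central1")

# Create a cache holding the large shared context (placeholder text).
cache = client.caches.create(
    model="gemini-2.5-flash",
    config=types.CreateCachedContentConfig(
        contents=[
            types.Content(role="user", parts=[types.Part(text="<large shared context>")])
        ],
        ttl="3600s",
    ),
)

# Reuse the cache across individual online requests.
response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents="A question that relies on the cached context",
    config=types.GenerateContentConfig(cached_content=cache.name),
)
print(response.text)
```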

Okay, but it would be great if caching could be supported in batch prediction as well. We have a prompt of around 10k tokens that we would like to run with many different inputs, and caching would help a lot in such scenarios.
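To illustrate, this is roughly how our batch input looks today, with the same ~10k-token instructions repeated in every line (a sketch mirroring the request format above; the instructions and inputs are placeholders):

```python
import json

SHARED_INSTRUCTIONS = "<~10k tokens of fixed instructions>"  # placeholder
user_inputs = ["input 1", "input 2", "input 3"]  # thousands of these in practice

# Without caching, every request pays the full input price for the shared
# instructions; with explicit caching, that shared prefix could instead be
# billed at the reduced cached-token rate.
with open("batch_input.jsonl", "w") as f:
    for item in user_inputs:
        line = {
            "contents": [
                {"parts": [{"text": SHARED_INSTRUCTIONS + "\n" + item}], "role": "user"}
            ],
            "generationConfig": {"temperature": 0, "maxOutputTokens": 65534},
        }
        f.write(json.dumps(line) + "\n")
```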

Sure @Shreyansh_Bardia,

I will raise this as a feature request with the concerned team. Thanks for providing your use case as an example. If you could elaborate on it further, with more detail about your workload and the cost savings you would expect from explicit context caching with batch prediction, that would help the team.