The documentation states:

> Minimum cache token count for implicit and explicit caching
> - Gemini 3 and Gemini 3.1 models: 4,096 tokens
> - Gemini 2.0 and 2.5 models: 2,048 tokens

https://docs.cloud.google.com/vertex-ai/generative-ai/docs/context-cache/context-cache-overview#limits
But in reality, I was able to create a cache for the 3.1 model with only 1,024 tokens:
```python
from google import genai
from google.genai import types

client = genai.Client(vertexai=True)

cache = await client.aio.caches.create(
    model="gemini-3.1-pro-preview",
    config=types.CreateCachedContentConfig(
        contents=[
            types.Content(
                role="user",
                parts=[types.Part(text=text)],
            )
        ],
        display_name="vertex-cached-prompt",
        ttl="120s",
    ),
)
```
And the error I got on an attempt to create a cache for a much smaller prompt (~100 tokens):
```
google.genai.errors.ClientError: 400 INVALID_ARGUMENT. {'error': {'code': 400, 'message': 'The cached content is of 151 tokens. The minimum token count to start caching is 1024.', 'status': 'INVALID_ARGUMENT'}}
```
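For context, until the limits are clarified I'm working around this with a client-side pre-check of the prompt's token count (obtained via `count_tokens`) against the minimum the API actually enforces. A minimal sketch; the threshold values here are my own assumptions based on the observations above, not an official table:

```python
# Hypothetical guard: reject prompts below the minimum the API actually
# enforces, before calling caches.create. The thresholds are assumptions:
# 1024 is what the 400 error above reports for gemini-3.1-pro-preview,
# even though the docs say 4,096 for Gemini 3.x models.
OBSERVED_MIN_CACHE_TOKENS = {
    "gemini-3.1-pro-preview": 1024,  # observed in the error message above
}
DOCUMENTED_DEFAULT_MIN = 2048  # documented minimum for Gemini 2.0/2.5 models


def can_cache(model: str, token_count: int) -> bool:
    """Return True if token_count meets the (assumed) minimum for the model."""
    minimum = OBSERVED_MIN_CACHE_TOKENS.get(model, DOCUMENTED_DEFAULT_MIN)
    return token_count >= minimum
```

This avoids burning a request on a cache creation that is guaranteed to fail, but it obviously just encodes whichever limit turns out to be correct.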
Can you clarify this please?