I’m trying to use client.caches.create() in the google-genai Python SDK to cache a large text-only prompt (~2500 tokens). The goal is to optimize cost and performance by avoiding resending this static context repeatedly. Here’s what I do:
```python
from google import genai
from google.genai.types import CreateCachedContentConfig, HttpOptions, Part

client = genai.Client(
    vertexai=True,
    project=credentials.project_id,
    location="global",
    http_options=HttpOptions(api_version="v1beta1"),
)

cache = client.caches.create(
    model="models/gemini-2.0-pro-001",
    config=CreateCachedContentConfig(
        display_name="aime template test",
        system_instruction="Text-only context for moderation",
        contents=[Part.from_text(text="Some instructional prompt text")],
        ttl="300s",
    ),
)
```
But I consistently get this error:
500 INTERNAL – ResourceCategoryConfig for RESOURCE_CATEGORY_GENAI_CACHE is not found
The issue is that the model you are using, gemini-2.0-pro-001, doesn't support context caching. Try gemini-2.5-pro-preview-05-06 or gemini-2.0-flash-001 instead. The documentation lists the models that support context caching.
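To guard against this before making the request, you could validate the model name against the models known to support caching. A minimal illustrative helper, where the list contains only the two models named above (it is not exhaustive or authoritative; check the documentation for the current list):

```python
# Models cited above as supporting context caching on Vertex AI.
# Illustrative only -- consult the official docs for the full list.
CACHING_MODELS = {"gemini-2.5-pro-preview-05-06", "gemini-2.0-flash-001"}

def supports_caching(model: str) -> bool:
    # Strip the optional "models/" prefix used in some SDK calls.
    return model.removeprefix("models/") in CACHING_MODELS

print(supports_caching("models/gemini-2.0-pro-001"))  # False
print(supports_caching("gemini-2.0-flash-001"))       # True
```

Failing fast on the client side turns an opaque server-side 500 into an actionable error message.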
ServerError: 500 INTERNAL. {'error': {'code': 500, 'message': 'Pipeline 7669911640388665344 failed with error: Unhandled exception from task handler: ResourceCategoryConfig for RESOURCE_CATEGORY_GENAI_CACHE is not found.\n\tcom.google.common.base.Preconditions.checkState(Preconditions.java:657)\n\tcom.google.cloud.ai.platform.boq.shared.configuration.AiPlatformConfigHelper.getResourceCategoryConfig(AiPlatformConfigHelper.java:279)\n\tcom.google.cloud.ai.platform.boq.shared.tasks.provision.ProvisionProjectTaskHandler.createChildTasksAndFinalizers(ProvisionProjectTaskHandler.java:792)\n.', 'status': 'INTERNAL'}}
A 500 Internal Server Error usually indicates an unexpected error on Google's side, and it is often temporary.
As a workaround, you can switch to another model (and see if that works), or wait a bit and retry your request.
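If the failure really is transient, retrying with exponential backoff is a reasonable pattern. A minimal generic sketch (the function name, attempt count, and delays are illustrative, not part of the SDK):

```python
import time

def retry_with_backoff(fn, max_attempts=4, base_delay=1.0):
    """Call fn(); on failure, retry with exponentially growing delays."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the original error
            time.sleep(base_delay * (2 ** attempt))

# Demo with a stub that fails twice, then succeeds:
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("500 INTERNAL")
    return "ok"

print(retry_with_backoff(flaky, base_delay=0.01))  # ok
```

In your case you would wrap the `client.caches.create(...)` call, e.g. `retry_with_backoff(lambda: client.caches.create(...))`. Note that if the error is configuration-related rather than transient, retrying will not help.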
The error you’re seeing (“ResourceCategoryConfig for RESOURCE_CATEGORY_GENAI_CACHE is not found”) is related to the Vertex AI service configuration for your project, not your code implementation. The caching feature needs to be enabled for the project through the Google Cloud Console.