Context caching blocked on Paid Tier 1 — max_total_token_count=0

Hi Gemini API Team,

I’m hitting a provisioning issue where context caching is completely disabled for my project despite being on an active Paid Tier 1 account with billing confirmed.

The error:

400 INVALID_ARGUMENT: Cached content is too large. total_token_count=1328389, max_total_token_count=0

Key details:

  • Account is Paid Tier 1 with active billing

  • max_total_token_count reports as 0 for all models

  • Caching works for smaller content (1-2 chunks) but fails at 3+ chunks (~1.3M tokens)

  • Model: gemini-2.5-pro

This appears to be the same provisioning sync issue reported in other threads where the cache quota is hardcoded to 0 despite an active paid account.

Can someone check why the cache quota is set to 0 for this project and force a resync?

Thank you

Same. What gives? No response?

Hello @Tommy_Roldan & @PLAYi,
The error you mentioned (400 INVALID_ARGUMENT: Cached content is too large. total_token_count=1328389, max_total_token_count=0) occurs because the cached content you are trying to create (1,328,389 tokens) exceeds the gemini-2.5-pro model’s input token limit of 1,048,576 tokens. As noted in the documentation, the maximum size for cached content is the same as the corresponding model’s input token limit. The fact that you can successfully create caches for smaller chunks confirms that caching is enabled on your Tier 1 project.
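
For anyone hitting the same error, here is a minimal sketch of a pre-flight check that compares the token total against the model's input token limit before attempting to create a cache. It is plain Python with no API calls. The helper names (fits_in_cache, group_chunks) and the per-chunk token counts are hypothetical: the counts are just an even split of the 1,328,389 total from the error message, and in practice you would get them from the API's token-counting endpoint. Splitting content into several caches is only one possible workaround and may not suit every use case.

```python
# Input token limit for gemini-2.5-pro as quoted in the reply above.
GEMINI_2_5_PRO_INPUT_LIMIT = 1_048_576


def fits_in_cache(total_tokens: int, limit: int = GEMINI_2_5_PRO_INPUT_LIMIT) -> bool:
    """Return True if content of this size is within the cache size limit."""
    return total_tokens <= limit


def group_chunks(chunk_tokens: list[int],
                 limit: int = GEMINI_2_5_PRO_INPUT_LIMIT) -> list[list[int]]:
    """Greedily group chunk indices, in order, so each group's total stays within limit."""
    groups: list[list[int]] = []
    current: list[int] = []
    current_total = 0
    for index, tokens in enumerate(chunk_tokens):
        if tokens > limit:
            raise ValueError(f"chunk {index} alone exceeds the limit ({tokens} > {limit})")
        # Start a new group when adding this chunk would exceed the limit.
        if current and current_total + tokens > limit:
            groups.append(current)
            current, current_total = [], 0
        current.append(index)
        current_total += tokens
    if current:
        groups.append(current)
    return groups


# Hypothetical per-chunk counts: an even split of the 1,328,389 tokens in the error.
chunks = [442_796, 442_796, 442_797]

print(fits_in_cache(sum(chunks)))  # False: 1,328,389 > 1,048,576
print(group_chunks(chunks))        # [[0, 1], [2]]
```

With these numbers, the first two chunks (885,592 tokens) fit in one cache and the third needs its own, which matches the report that 1-2 chunks succeed and 3 fail.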