URGENT: Huge cache cost increase issue

SKU: E181-DFF8-56CF
I am having a serious problem with cache cost calculation.

As a test, I created 5 cached contents with a 5-second time to live. All of those caches were invalidated/expired after 5 seconds (which is fine).

Cache check on 17 March 2026:

curl -s "https://generativelanguage.googleapis.com/v1beta/cachedContents?key={MY_KEYS}"

output:

{}, meaning no cached contents were registered the whole day.

And yet right now I see HUGE NUMBERS in billing for 17 March 2026: 3,418,811.34 hours for $15.38. I'm 1000% sure I didn't use the cache again.

How can I resolve this and get my money back?

On 16 March I hit the same cache issue, which cost me $35 for nothing.

1 Like

I don't know what has been happening with the API pricing over the last 2 days, but something is really wrong. My costs went up 4x overnight.

2 Likes

My cache cost is still increasing. It looks like something is stuck on the server side.

  1. I have removed all keys.
  2. Zero API calls.
  3. There is still NO cached content in v1beta/cachedContents?key={MY_KEYS}, so I can't delete anything either.

So, how can I stop this background activity?
Or how can I reach tech support?
It is not fun to pay $36 per day for nothing…
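For anyone else stuck here, this is a minimal sketch of how to enumerate and delete whatever cachedContents the API still reports. Assumptions: a GEMINI_API_KEY environment variable and jq installed (in my case the list was already empty, so there was nothing left to delete):

```shell
#!/usr/bin/env bash
# Sketch: list and delete all remaining cachedContents via the REST API.
# Assumes GEMINI_API_KEY is exported and jq is installed.
BASE="https://generativelanguage.googleapis.com/v1beta"

# Extract resource names (e.g. "cachedContents/abc123") from a list response.
extract_names() {
  jq -r '.cachedContents[]?.name // empty'
}

if [ -n "${GEMINI_API_KEY:-}" ]; then
  curl -s "${BASE}/cachedContents?key=${GEMINI_API_KEY}" \
    | extract_names \
    | while read -r name; do
        # DELETE each cache by its full resource name.
        curl -s -X DELETE "${BASE}/${name}?key=${GEMINI_API_KEY}"
      done
fi
```

Note that an empty list response is just `{}`, so the loop simply does nothing in the "phantom cache" situation described above, which is exactly the problem.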

1 Like

My findings:

  1. When I first noticed this billing anomaly, the very first thing I did was use the Gemini API to FIND and DELETE all existing caches. For the last 3 days there have been NO caches in the list whatsoever; every query of the cache list via the API returned completely empty. Yet the charges continued to accumulate over those same 3 days.
  2. The Gemini Cache API showed me an empty list, leaving me completely powerless to stop whatever was consuming my funds.
  3. It is charging me $36 per day; I now assume this is an infinite zombie/phantom cache.
  4. Yesterday, as a last resort, I completely disabled the Gemini API in my Cloud Console, hoping this would force-kill the phantom resource. I will wait a few days to account for the 32-hour propagation delay you mentioned, and then reactivate the API to see if the phantom returns.
1 Like

My project has had the same issue: no increase in token usage but a 20x increase in prices, unfortunately destroying the economics of my business. I hope this is an error, not an unannounced price increase.

2 Likes

Check the Billing Reports in your Google Cloud project.
Group by SKU.
And look for the anomaly.

This is an example of my anomaly (blue). Maybe you will see something similar.
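If you also have Cloud Billing export to BigQuery enabled, the same per-SKU grouping can be pulled with a query instead of the console UI. This is a sketch under assumptions: the project/dataset/table name is a placeholder for your own billing export table, and it requires the bq CLI to be installed and configured.

```shell
# Sketch: sum cost per SKU from a standard Cloud Billing BigQuery export.
# The project/dataset/table below is a PLACEHOLDER -- substitute your own.
QUERY='
SELECT sku.id, sku.description, SUM(cost) AS total_cost
FROM `my-project.billing.gcp_billing_export_v1_XXXXXX`
WHERE usage_start_time >= TIMESTAMP("2026-03-15")
GROUP BY sku.id, sku.description
ORDER BY total_cost DESC'

# Only run if the bq CLI is available and configured.
if command -v bq >/dev/null 2>&1; then
  bq query --use_legacy_sql=false "$QUERY"
fi
```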

1 Like

Here is my chart. Even if it's not related to your cache problem, the API pricing has been a major issue for the past few days. You can see it started on the same day, March 16th.

The day I migrated from 2.5 Flash Lite to 3.1 Flash Lite Preview (dark orange → orange), my Gemini 3 Flash costs (light blue) also exploded: 4x overnight, despite lower token usage.

So I’m hit by two separate problems at once:

  1. Forced migration from 2.5 Flash Lite to 3.1 Flash Lite Preview (2.5 will be deleted on March 31): the output price jumps from $0.34/M to $1.27/M
  2. Unexplained cost explosion on Gemini 3 Flash Preview: nothing changed on my end, yet costs quadrupled

In both cases, the culprit is the same: text output token pricing.

I suspect a silent cost increase on Google's side. Apparently we pay the same (high) output cost whether thinking is enabled or not.

1 Like

As you can see, with no change in usage my pricing went up 25-50x. I hope this is an error, as it renders my product unviable.

1 Like

I also noticed this increase in costs starting on March 16, without using a cache or Google search; I wrote a post about it: URGENT: The cost of API access has increased since March 16-17, 2026

Please change "Group by" to "SKU" (the top-left filter) and check what is eating that much and what it is related to.

Since no one at Google seems to care, the only solution is to make a post and tag them on X.

It's crazy how a spend limit update on March 16th broke the API cost calculation, and it's not the first time. What are you doing at this point…

1 Like

You can report the anomaly to Billing Support.

Further findings:

I am writing to share new data and findings. Here is the second reason why I am absolutely certain that a “phantom” resource was stuck on your side, and why this case needs to be reviewed by your engineering team.

Currently, my Gemini API has been completely Disabled for three days (since March 18). Interestingly, day by day, those massive cache storage charges are now literally disappearing from my billing graphs.

If you look at the pre-disable graphs for March 15-17, the system was eating about $36 per day. However, after I force-disabled the API, the billing system started retroactively “recalculating” the cache costs back to the actual, correct data.

Billing Support mentioned a 32-hour propagation delay, but for some reason this synchronization lag took about 3 days to correct itself. Possibly that is because I disabled the API: the cleanup started ONLY after I DISABLED the API.
Furthermore, in a live production environment, I wouldn’t be able to simply disable the entire API to force the billing system to sync. So please check and fix this issue.

My current graph looks like this (orange).

I really hope your engineers can fix this frightening delay issue for the Gemini Cache API.

1 Like


Still not fixed after 5 days…

Hi everyone,

I’m following up on the ongoing billing and cache issues mentioned in this thread. To help investigate this further, could you provide more details regarding your account setups?

Specifically, please confirm if you are signed in using your individual primary accounts or if you are accessing the API through a static or shared credential. Any additional information about your authentication method and project configuration would be very helpful.

Best regards,

Google LLC

1 Like

Hi,

It's not only a cache issue; Gemini 3 Flash Preview and 3.1 Flash Lite Preview (and maybe more) are also using 4x/5x MORE OUTPUT TEXT tokens overnight. I didn't touch any code or any prompt: same number of calls, same JSON output size.

Other devs are reporting here:


please confirm if you are signed in using your individual primary accounts or if you are accessing the API through a static or shared credential

I don't know what that means. I'm a solo developer with a basic Google Cloud account. I created a project, then linked it in Google AI Studio, where I got my API key.

Despite being on the “Free Tier” for experimentation, I am being hit with “Quota Exceeded” errors after only a few interactions. More alarmingly, the API costs associated with these requests have spiked unnaturally, far exceeding the expected billing for the volume of tokens processed.

Given that these are Preview models, I suspect there is a synchronization error between the experimental model’s token counting and the production billing/quota engine. This behavior is inconsistent with the standard limits previously applied to my account.

I am experiencing this issue too on my production application.
I can confirm my application is using an API key assigned in Google AI Studio to make these requests. A previous answer mentioned that force-disabling the API made the charges disappear, but that is not an option for me at all. The price continues to skyrocket despite request volumes being equal to before March 16th.

Any help would be much appreciated

Thank you for looking into this. I am happy to provide the exact details regarding my setup, as well as a timeline of a test I ran to track this billing anomaly.

Account & Setup Details:

  • Account Type: This is my individual, personal test project. I am the only developer working on it.

  • Authentication Method: I am accessing the Gemini API using a static API Key generated in Google AI Studio. I am not using OAuth for end-users or a Service Account JSON file.

  • Shared Credentials: The API key is strictly private and has never been shared.

  • Tech Stack: Rust backend using the reqwest HTTP client with connection pooling, calling the REST endpoint (generativelanguage.googleapis.com/v1beta).

  • Model: models/gemini-3.1-pro-preview

Cache Test & Logs:
To understand the issue, I ran a strict test using curl. I created a cache with a short TTL (60 seconds) and monitored the API and billing.

1. Creating the cache (TTL: 60s):

% ./cache_create.sh 60s
{
  "name": "cachedContents/frp5xlu6h48e7w2ui9g82imzcoods3e9gb3aonft",
  "model": "models/gemini-3.1-pro-preview",
  "createTime": "2026-03-22T09:34:49.342985Z",
  "updateTime": "2026-03-22T09:34:49.342985Z",
  "expireTime": "2026-03-22T09:35:48.044411163Z",
  "displayName": "",
  "usageMetadata": {
    "totalTokenCount": 4100
  }
}

2. Querying the cache list (Cache exists):

% ./cache_list.sh  
{
  "cachedContents": [
    {
      "name": "cachedContents/frp5xlu6h48e7w2ui9g82imzcoods3e9gb3aonft",
      "model": "models/gemini-3.1-pro-preview",
      "createTime": "2026-03-22T09:34:49.342985Z",
      "updateTime": "2026-03-22T09:34:49.342985Z",
      "expireTime": "2026-03-22T09:35:48.044411163Z",
      "displayName": "",
      "usageMetadata": {
        "totalTokenCount": 4100
      }
    }
  ]
}

3. Querying the cache list after TTL expired (Cache is gone):

% ./cache_list.sh
{}
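For reproducibility, here is roughly what such helper scripts can look like. This is a hedged sketch, not the poster's actual scripts (which are not shown): the model name and text payload are illustrative placeholders, and GEMINI_API_KEY is assumed to be exported.

```shell
#!/usr/bin/env bash
# cache_create.sh (sketch) -- create a cachedContents entry with a given TTL,
# then list the caches (the cache_list.sh equivalent is just the GET below).
# Usage: ./cache_create.sh 60s
TTL="${1:-60s}"

# Build the JSON request body for the cachedContents create call.
build_body() {
  printf '{"model":"models/gemini-3.1-pro-preview","ttl":"%s","contents":[{"role":"user","parts":[{"text":"<large context to cache>"}]}]}' "$1"
}

if [ -n "${GEMINI_API_KEY:-}" ]; then
  # Create the cache with the requested TTL.
  build_body "$TTL" | curl -s -X POST \
    -H 'Content-Type: application/json' \
    -d @- \
    "https://generativelanguage.googleapis.com/v1beta/cachedContents?key=${GEMINI_API_KEY}"

  # List current caches (what cache_list.sh does).
  curl -s "https://generativelanguage.googleapis.com/v1beta/cachedContents?key=${GEMINI_API_KEY}"
fi
```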

Billing Observations Timeline:
After confirming the cache was gone via the API, I checked the billing dashboard every hour. Here is what happened with SKU E181-DFF8-56CF:

  • First 12 hours: The SKU did not appear at all (expected delay).

  • Next 12 hours: The SKU appeared, but showed a massive, highly inflated spike of 2801 hours (incorrect).

  • Next Day (24h+ later): The SKU retroactively auto-corrected itself down to 66.85 hours (correct).

My Hypotheses regarding the issue:
Based on these observations, I have two thoughts on what is happening on your backend:

  1. Initial Miscalculation Bug: for some reason, the billing system initially drastically overestimates the cache duration (showing a huge spike on day 1 every time), and only triggers a reconciliation/correction job the following day to fix the data.

  2. The “Zombie Cache” Scenario (My original issue): In the case where my bill was growing infinitely for 3 days, a cache was expired/deleted, but got stuck in your system. So the billing system assumed it was still active, charging me every hour. It seems the backend only triggered the reconciliation process after I completely Disabled the Gemini API. Once disabled, the system spent the next 3 days (day by day) retroactively correcting the previous 3 days of phantom growth back to the accurate values.

I hope these logs and timeline help your engineering team pinpoint the bug.

After 8 days, still no fix from Google; hopefully the engineers are working on resolving the issue and we’ll get refunds + free credits.

Another dev is reporting the issue:

1 Like