Urgent: Massive Billing Spikes - Unexpected Usage and Charges for an Unused Model

Hi everyone,

I am experiencing some severe billing anomalies with the Gemini API and wanted to see if others are seeing similar patterns in their usage reports.

The Issues:

  1. Billing for Unused Models:
    I switched my application to use gemini-2.0-flash in January. I have confirmed logs showing that from Jan 7th to Jan 13th, we only sent requests to the 2.0 model.
    However, my Google Cloud billing report shows a sudden spike on Jan 11th for gemini-2.5-flash (approx. 900 requests), resulting in a charge of about $17. A similar spike also happened on January 3rd. I am providing 2 screenshots of this issue for context.

  2. Anomalous Volume in December:
    Prior to the switch, during December, I saw massive, sudden spikes in API requests for gemini-2.5-flash (ranging from 2,000 to 5,000+ requests). These spikes showed very large input token counts and did not match the actual traffic going to my application. My application does not make anywhere near as many API requests as the AI Studio dashboard shows. This cost more than $600 in December alone. Even in November there were some unexpected spikes in usage.

Summary:
Since the “phantom” spikes continued in January on a model version I was no longer using, it strongly suggests that the high volume in December was also an anomaly rather than organic traffic, as I had suspected.
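
For anyone who wants to run the same comparison on their own data, here is a minimal sketch of how application logs could be reconciled against billed request counts. It assumes hypothetical log records with "date" and "model" fields and a hand-entered table of billed counts; the field names, function names, and sample numbers are illustrative only and are not taken from any Google API or from my real export.

```python
from collections import Counter


def requests_per_day_model(log_records):
    """Count logged requests per (date, model) pair."""
    counts = Counter()
    for rec in log_records:
        counts[(rec["date"], rec["model"])] += 1
    return counts


def find_unexplained_usage(logged, billed):
    """Return {(date, model): excess} where billed requests exceed logged ones."""
    return {
        key: billed_count - logged.get(key, 0)
        for key, billed_count in billed.items()
        if billed_count > logged.get(key, 0)
    }


# Illustrative data only: two logged requests to the 2.0 model on one day.
app_logs = [
    {"date": "Jan-11", "model": "gemini-2.0-flash"},
    {"date": "Jan-11", "model": "gemini-2.0-flash"},
]

# Illustrative billed counts, typed in by hand from a billing report.
billed_counts = {
    ("Jan-11", "gemini-2.0-flash"): 2,
    ("Jan-11", "gemini-2.5-flash"): 900,
}

unexplained = find_unexplained_usage(requests_per_day_model(app_logs), billed_counts)
for (date, model), excess in sorted(unexplained.items()):
    print(f"{date} {model}: {excess} billed requests not found in logs")
```

Any (date, model) pair that shows up here is billed usage that the application logs cannot account for, which is the kind of evidence worth attaching to a billing support case.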

Has anyone else seen usage reported for model versions they aren’t calling, or sudden unexplained spikes in request counts?

How can I get a refund for this, and where should I report it?

Hey @Aaditya_Sikder, you can request a refund via "Resolve Cloud Billing issues" (Google Cloud Documentation).

If the costs aren’t coming from any of your usage, I would strongly recommend rotating your API key, as it’s possible it was leaked.

Thanks for the reply.

We have investigated carefully and are sure that the usage was not due to a leaked API key (though we will rotate it as you suggested). For example, if you look at “input tokens per model” or “requests per model” in the first screenshot, it shows both models in use at the same time, with gemini-2.5-flash also appearing for some time for no reason. Yet our application's later usage shows no use of gemini-2.5-flash at all. This is a sudden increase of about $17 even though we did not send requests or tokens to that specific model, or anything even close to that volume. This is just one example.

Also, should I request a refund directly, or contact Cloud Billing support about this issue?