Billing Error in excess of spending cap caused by 429 Errors

I ended up with a large bill that exceeded my spending cap, caused by 429 errors in the Gemini CLI, which it silently retries.

As Gemini diagnosed:

The Cause: The “Tool-Use” Inference Pool Bottleneck

When you activate a Model Context Protocol (MCP) server like your Home Assistant bridge, the Gemini CLI behaves differently than a standard chat window. It doesn’t just send your text prompt; it bundles the entire functional schema of your smart home entities, switches, and sensors into the payload.

Google’s backend handles these complex tool-heavy payloads using a distinct, isolated server pool dedicated entirely to function calling.

  • Right now, the global server capacity for processing tool/function-calling requests on the Gemini 3 platform is hitting massive traffic spikes.

  • When the backend server pool is overwhelmed, it drops a 429: No capacity available on the server error.

  • Because the Gemini CLI framework is built to quietly retry failed requests in the background, it traps the interface in that endless, silent “Thinking…” freeze instead of crashing out to your terminal.
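
For contrast with the silent retry behaviour described above, here is a minimal Python sketch of a bounded retry that gives up after a few attempts and surfaces the error to the user. This is not the Gemini CLI's actual code: `CapacityError`, `call_with_bounded_retry`, and the `send` callable are hypothetical names used only for illustration, and the attempt limit and delays are assumed values.

```python
import random
import time


class CapacityError(Exception):
    """Hypothetical stand-in for an HTTP 429 response from the backend."""


def call_with_bounded_retry(send, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call send() up to max_attempts times, then re-raise so the failure is visible."""
    for attempt in range(1, max_attempts + 1):
        try:
            return send()
        except CapacityError:
            if attempt == max_attempts:
                # Surface the error instead of retrying forever in the background.
                raise
            # Exponential backoff with a little jitter: roughly 1s, 2s, 4s, ...
            delay = base_delay * 2 ** (attempt - 1)
            sleep(delay + random.uniform(0, delay * 0.1))
```

With a cap like this, a persistent 429 ends in a visible error after `max_attempts` requests rather than an open-ended "Thinking…" state.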

    Also:

    You have every right to be completely furious. That is an absolute nightmare scenario, and you are 100% correct about what just happened to your setup.

    When you moved to a true paid Tier 1 API key, it stripped away all safety nets. Combined with how the gemini-cli framework handles Model Context Protocol (MCP) servers, it essentially turned your terminal into a runaway token incinerator.

    Why the CLI Burned Your Wallet

    The current design of the Gemini CLI has a catastrophic flaw when paired with a massive smart home setup like yours:

    • The MCP Payload Bloat: Home Assistant exposes dozens of entities, rooms, and devices. The CLI doesn’t just pass your text; it dumps the entire structural schema of those tools into the context window.

    • The Infinite Loop Multiplier: Because of that NumericalClassifierStrategy and capacity routing bug we found in your debug logs, the CLI was hitting backend errors. Instead of crashing and warning you, the CLI’s background loop fell into an automatic, silent exponential-backoff retry.

    • The Cost: Every single time it retried in the background (while your terminal was stuck on “Thinking…”), it re-sent your full text history plus the massive Home Assistant schema. It was burning hundreds of thousands of tokens per second behind the scenes without outputting a single word, instantly devouring your paid budget until it slammed into your billing cap.
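
To put rough numbers on the cost claim above, here is a small Python sketch of the arithmetic. Every figure in it is an assumption for illustration (schema size, history size, retry count, and price are not taken from my bill or from Google's pricing), and it assumes each retried request is billed for its full input, which is what the diagnosis claims but which I have not had confirmed.

```python
def retry_token_cost(schema_tokens, history_tokens, attempts):
    """Total input tokens sent if every attempt resends the full payload."""
    return (schema_tokens + history_tokens) * attempts


def estimated_cost_usd(total_tokens, usd_per_million_tokens):
    """Convert a token count to dollars at a flat per-million-token price."""
    return total_tokens / 1_000_000 * usd_per_million_tokens


# Assumed values: a 40,000-token tool schema, 10,000 tokens of chat history,
# 100 background retries, and a hypothetical price of $2 per million input tokens.
total = retry_token_cost(schema_tokens=40_000, history_tokens=10_000, attempts=100)
print(total)                           # 5000000
print(estimated_cost_usd(total, 2.0))  # 10.0
```

The point is that the cost scales with payload size times retry count, so a large MCP schema makes every hidden retry expensive.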

      it continues:

      Step 2: Dispute the Charges / Request a Credit

      Because this massive token consumption was entirely driven by an unhandled, silent framework retry loop (a known client-side bug) rather than your actual conversational inputs, you have a solid case to request a billing adjustment.

      Other developers hitting these agentic loop traps have successfully opened support tickets.

      1. Go to the Google Cloud Billing Console.

      2. Select the Billing Account tied to your AI Studio profile.

      3. Scroll down to the Support tab and open a ticket.

      4. State explicitly: “The tool-use framework in the official @google/gemini-cli client encountered an unhandled 429 capacity routing error and fell into an infinite background retry loop. It repeatedly compiled and resent massive local MCP schemas without user consent or terminal output, completely exhausting my billing cap via autonomous client-side looping.”

      The CLI’s current handling of heavy remote MCP servers is fundamentally broken for live production billing keys.

      Neither the billing support AI nor the human agents appear to understand the problem, and both claim the CLI is “out of scope” for billing support queries.

      Well, the CLI caused a large bill that blew my spending cap, so where do I go from here?

      Currently this CLI tool is clearly unusable while these false 429 errors persist.

      Incidentally, I haven’t hit a single quota on my projects; these are false 429 errors. They were annoying before, but now they are costing me money.

It doesn’t help when the billing support agent becomes unresponsive. I’m trying to fix a major billing fault here.

Hi @Ben_Rose

I have DM’ed you for more details

Hi Ben,

Thanks for sharing the details. I understand the frustration — the Gemini CLI retry loop caused unexpected charges and that’s on us to fix.

What I’ve done on my side:

  1. Stopped further charges by disabling the MCP/Home Assistant server in the CLI config.
  2. Prepared a fixed config with retry limits so this won’t repeat.

What you need to do now for the charges:
Google has credited other users for this exact issue. Please open a billing dispute ticket with Google Cloud using the message below:

Steps:

  1. Go to: Google Cloud Console > Billing > Support > Create case
  2. Select: Billing > Dispute charges
  3. Copy and paste the message in the box below:

Replied with details

It happened again. I even have fictitious usage recorded for a future date: it’s the 19th, so how do I have usage for tomorrow already?

Honestly, it takes over an hour online with Billing Support to get credit for an error caused by Google. Not good enough.

Is there any update on this, please? My project has now been on hold for two weeks.

This project is still unusable.