Google AI Pro Subscription - Antigravity Quota Not Working as Advertised (10-Day Lockout Instead of 5-Hour Reset)

I’m having the same issue.

Mine is fixed, I don’t know if this can help someone else but:

I was having this issue a few weeks ago. I had used it without issues and rarely burned through more than one 5-hour quota a day, then suddenly my quotas were disappearing faster than I could get anything done. We’re talking 6 prompts, not even a major refactor. Then came a multi-day lockout penalty box. The most frustrating part is that their site is terrible, and I feel like info about their quota system is purposely obfuscated.

However, I FOUND what MY specific issue was. I had accidentally enabled the Stitch MCP.

Once I removed this from my setup, it went back to normal and honestly I have not been below 40% on any 5hr quota window in over 7 days now and I feel like my usage is back to what they consider “acceptable” for the consumer subscriptions.

I don’t know if this info can help anyone else, but I hope it does. I know not all problems have the same cause, but check to make sure you didn’t enable an MCP server that is blowing up your quotas. Good luck, homies!!! I know it’s frustrating… I was ready to break my monitors at one point…
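For anyone who wants to check their own setup, here is a rough sketch of how you could list the MCP servers a JSON config enables. It assumes the servers sit under a top-level `mcpServers` object and that an entry can carry an optional `disabled` flag; both are assumptions about the config layout, and I’m not naming a file path because I’m not sure where your install keeps it.

```python
import json


def enabled_mcp_servers(config_text: str) -> list[str]:
    """Return the names of MCP servers that are not marked as disabled.

    Assumption: servers are declared under a top-level "mcpServers" object,
    and an entry may carry an optional "disabled": true flag.
    """
    config = json.loads(config_text)
    servers = config.get("mcpServers", {})
    return sorted(
        name for name, entry in servers.items()
        if not entry.get("disabled", False)
    )


# Hypothetical config contents, for illustration only.
example = (
    '{"mcpServers": {'
    '"stitch": {"command": "npx"}, '
    '"github": {"command": "npx", "disabled": true}}}'
)
print(enabled_mcp_servers(example))
```

If a server shows up that you don’t remember enabling, that is the one to try removing first.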

# Token Quota Management: What Developers Can Do Today & What Google Needs to Fix

**Date:** March 4, 2026

**Context:** Antigravity (Gemini-powered AI coding assistant)

**Author:** Developer feedback from real-world usage

## The Problem

Developers using Antigravity are burning through their token quotas without visibility into consumption, receiving vague errors when limits are hit, and inadvertently making things worse by retrying failed requests. This is a widespread issue affecting the developer experience and eroding trust in the tool.

### The Failure Loop

```
Developer sends a request
→ Request fails (quota exceeded)
→ Antigravity shows a generic error with a "Retry" button
  → Developer retries (doesn't know it's a quota issue)
    → Retry fails again (still over quota, possibly consuming more)
      → Developer retries 2-3 more times
        → Developer gives up, starts a new task
          → New task also fails → finally sees quota limit message
            → Developer has now wasted retries on something that was never going to work
```

This loop is destructive:

- Each retry re-sends the **full conversation history** — the most token-expensive payload

- The developer has no signal that the issue is quota, not a transient error

- By the time the explicit quota message appears, additional quota has been consumed on futile retries

## Part 1: What We Built (Developer-Side Mitigations)

Since agents can’t see quota status and the platform doesn’t expose it, we built a **prevention-first system** using Antigravity’s own agent infrastructure (AGENTS.md, workflows, and skills). These are agent-readable instructions that any AI coding assistant in the project will follow.

### Token Economy Skill

A cross-cutting skill (`.agent/skills/token-economy/SKILL.md`) that instructs agents to minimize their own consumption:

| Principle | Implementation |
|---|---|
| **Read lean** | Agents use `view_file_outline` and `grep_search` before reading full files. They read line ranges, not entire files. |
| **Don’t re-read** | Agents track which files they’ve already read in a session and reference earlier reads instead of re-reading. |
| **Flag heavy tasks** | Before starting a task that requires reading 10+ files, agents estimate the cost and ask the developer whether to proceed with the full scope or a focused subset. |
| **3-strike debugging rule** | If 3 debugging approaches fail, agents stop and report what they tried rather than iterating endlessly. |
| **Progress checkpoints** | For large tasks, agents report at ~25% completion and ask whether to continue, giving the developer a natural exit point. |
| **Scope creep alerts** | When a task expands beyond its original scope, agents flag it and let the developer decide whether to continue or defer. |
| **Context size awareness** | Agents notice when many documents are open in the editor (each one adds to the context payload sent every turn) and suggest closing irrelevant ones. |
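To make the "Flag heavy tasks" row concrete, here is a minimal sketch of the kind of estimate an agent could produce before a large read. The skill itself is written as agent instructions, not code, so the function name, the 4-characters-per-token heuristic, and the data shapes below are all illustrative assumptions; only the 10-file threshold comes from the table.

```python
from dataclasses import dataclass

CHARS_PER_TOKEN = 4        # rough heuristic (assumption), not a real tokenizer
HEAVY_FILE_THRESHOLD = 10  # "10+ files" from the skill table


@dataclass
class ReadEstimate:
    file_count: int
    estimated_tokens: int
    needs_confirmation: bool


def estimate_read_cost(file_sizes: dict[str, int]) -> ReadEstimate:
    """Estimate the token cost of reading files, given their sizes in characters."""
    total_chars = sum(file_sizes.values())
    return ReadEstimate(
        file_count=len(file_sizes),
        estimated_tokens=total_chars // CHARS_PER_TOKEN,
        needs_confirmation=len(file_sizes) >= HEAVY_FILE_THRESHOLD,
    )


# Hypothetical task: twelve source files of about 8,000 characters each.
sizes = {f"src/module_{i}.py": 8_000 for i in range(12)}
estimate = estimate_read_cost(sizes)
if estimate.needs_confirmation:
    print(
        f"This task reads {estimate.file_count} files "
        f"(~{estimate.estimated_tokens} tokens). Proceed with full scope?"
    )
```

The point of the estimate is not accuracy. It only needs to be good enough for the developer to choose between the full scope and a focused subset.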

### Planning Mode vs Fast Mode Guidance

We documented clear criteria for when tasks need Planning Mode (heavy token use, justified by complexity) versus Fast Mode (lighter, checklist-driven), so developers can make informed mode choices:

- **Fast Mode:** Single-file edits, boilerplate, formatting, targeted bug fixes, checklist-driven audits

- **Planning Mode:** Multi-file refactors, ambiguous requirements, architectural decisions, high-stakes changes

### Structured Workflows with Turbo Execution

Repeatable procedures (codebase audits, post-activity checks) are codified as workflow files with `// turbo` annotations that auto-execute safe read-only commands, reducing the back-and-forth approval overhead that inflates conversations (and therefore token usage).

### Self-Improvement Protocol

Agents are instructed to identify repeatable patterns during their work and propose new workflows/skills to the developer, so future sessions start faster and consume fewer tokens on solved problems.

## Part 2: What Google Needs to Fix

The developer-side mitigations above reduce consumption, but they cannot solve the fundamental UX failures in Antigravity’s quota handling. These require platform-level changes.

### Critical Fixes

#### 1. Distinguish Quota Errors from Other Failures

**Current:** All failures show the same generic error message with a retry button.

**Needed:** When the API returns a quota-exceeded response (HTTP 429 or equivalent), display a specific message:

> :warning: **Token quota reached.** Your daily/monthly limit has been exceeded.
> Resets at: [timestamp].
> Options: Switch to a lighter model, start a shorter conversation, or wait for reset.

This is a straightforward UI change. The API already returns distinct error codes for quota vs. other failures — the client just isn’t differentiating them.
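A minimal sketch of the client-side branching this would take, assuming the client can see the HTTP status of the failed request. The enum, function names, and message wording are hypothetical, and a 429 can also mean short-lived rate limiting, so a real client would likely inspect the response body as well.

```python
from enum import Enum
from typing import Optional


class FailureKind(Enum):
    QUOTA = "quota"
    TRANSIENT = "transient"
    OTHER = "other"


def classify_failure(status_code: int) -> FailureKind:
    """Map an HTTP status code to a coarse failure category."""
    if status_code == 429:              # Too Many Requests
        return FailureKind.QUOTA
    if 500 <= status_code < 600:        # server-side errors, often temporary
        return FailureKind.TRANSIENT
    return FailureKind.OTHER


def user_message(kind: FailureKind, resets_at: Optional[str] = None) -> str:
    """Pick the message to show; resets_at is a display string, if known."""
    if kind is FailureKind.QUOTA:
        reset = f" Resets at: {resets_at}." if resets_at else ""
        return "Token quota reached." + reset
    if kind is FailureKind.TRANSIENT:
        return "Temporary server error. Retrying may help."
    return "Request failed."


print(user_message(classify_failure(429), resets_at="2026-03-05 09:00 UTC"))
```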

#### 2. Disable or Warn on Retry for Quota Errors

**Current:** The retry button is always available, even when retrying is guaranteed to fail.

**Needed:** When the failure is quota-related:

- Either disable the retry button entirely

- Or show a warning: *“Retrying will consume additional quota and is unlikely to succeed. Are you sure?”*

Each retry re-sends the full conversation context. For a long conversation, a single retry attempt could consume thousands of tokens just to get rejected again.
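The retry rule can be sketched as a small decision function: never retry automatically on a quota error, and back off exponentially on transient ones. The function name, the attempt cap, and the delay schedule are assumptions chosen for illustration, not Antigravity behavior.

```python
def retry_decision(status_code: int, attempt: int, max_attempts: int = 3) -> tuple[bool, float]:
    """Return (should_retry, delay_seconds) for a failed request.

    attempt is zero-based: 0 means the first retry is being considered.
    """
    if status_code == 429:
        # Quota errors will not clear on retry, and each retry would
        # re-send the full conversation context.
        return (False, 0.0)
    if 500 <= status_code < 600 and attempt < max_attempts:
        # Transient server errors: exponential backoff of 1s, 2s, 4s.
        return (True, float(2 ** attempt))
    return (False, 0.0)


for status, attempt in [(429, 0), (503, 0), (503, 2), (503, 3)]:
    print(status, attempt, retry_decision(status, attempt))
```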

#### 3. Expose Quota Usage to Developers

**Current:** Developers have zero visibility into their token consumption until they hit the wall.

**Needed:** A lightweight quota indicator, such as:

- A usage bar in the Antigravity UI (e.g., “73% of daily quota used”)

- A warning threshold (e.g., “You’ve used 90% of your daily quota. Consider shorter conversations or lighter tasks.”)

- Post-conversation summaries showing approximate tokens consumed

Even a rough approximation would dramatically improve developer decision-making. Developers can’t optimize what they can’t measure.
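As an illustration of how little is needed, the sketch below renders a text usage bar with a warning at the 90% threshold mentioned above. The function name, bar width, and the idea that the client knows `used` and `limit` as plain numbers are assumptions.

```python
def usage_line(used: int, limit: int, warn_at: float = 0.90, width: int = 20) -> str:
    """Render a one-line usage indicator such as '[####----] 73% of daily quota used'."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    fraction = min(used / limit, 1.0)
    filled = round(fraction * width)
    bar = "#" * filled + "-" * (width - filled)
    line = f"[{bar}] {fraction:.0%} of daily quota used"
    if fraction >= warn_at:
        line += " (warning: consider shorter conversations or lighter tasks)"
    return line


print(usage_line(73, 100))
print(usage_line(90, 100))
```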

#### 4. Expose Quota Status to Agents

**Current:** Agents have no way to know they’re close to a quota limit. They can’t warn developers proactively.

**Needed:** Include a signal in the agent’s system prompt or metadata, such as:

- `quota_remaining: low | medium | high`

- `estimated_turns_remaining: ~5`

This would allow agents to naturally adjust their behavior:

> *“I notice your quota is running low. I’ll keep this response concise. For the remaining work, I’d recommend [prioritized list].”*

This is the single highest-leverage change. It turns agents from quota-blind tools into quota-aware partners.
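A sketch of how an agent runtime might act on such a signal, assuming the platform supplied the proposed `quota_remaining` field. The policy fields and their values are invented for illustration; no such signal exists today.

```python
from typing import Literal

QuotaLevel = Literal["low", "medium", "high"]

# Illustrative policies only; the numbers are assumptions, not tuned values.
POLICIES: dict[str, dict[str, object]] = {
    "high": {"max_files_to_read": 20, "style": "full", "ask_before_continuing": False},
    "medium": {"max_files_to_read": 10, "style": "focused", "ask_before_continuing": False},
    "low": {"max_files_to_read": 3, "style": "concise", "ask_before_continuing": True},
}


def response_policy(quota_remaining: QuotaLevel) -> dict[str, object]:
    """Choose how aggressively the agent should spend tokens."""
    return POLICIES[quota_remaining]


policy = response_policy("low")
if policy["ask_before_continuing"]:
    print("Quota is running low. Keeping this response concise.")
```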

### Nice-to-Have Improvements

#### 5. Conversation Cost Estimates

Before starting a new conversation or continuing a long one, show an estimated cost:

> *“This conversation has consumed ~45K tokens. Continuing will cost approximately 2K tokens per exchange. Starting a fresh session would reset the context cost.”*
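The arithmetic behind such a message is simple. The sketch below reuses the 45K and 2K figures from the example message and adds a hypothetical 15K of remaining quota; the function names are illustrative.

```python
def exchanges_remaining(quota_left: int, tokens_per_exchange: int) -> int:
    """How many more exchanges fit in the remaining quota."""
    if tokens_per_exchange <= 0:
        raise ValueError("tokens_per_exchange must be positive")
    return max(quota_left, 0) // tokens_per_exchange


def cost_summary(consumed: int, tokens_per_exchange: int, quota_left: int) -> str:
    """Build a short cost estimate for the current conversation."""
    remaining = exchanges_remaining(quota_left, tokens_per_exchange)
    return (
        f"This conversation has consumed ~{consumed // 1000}K tokens. "
        f"Each further exchange costs roughly {tokens_per_exchange // 1000}K tokens, "
        f"so about {remaining} more exchanges fit in the remaining quota."
    )


# 45K consumed and 2K per exchange come from the example; 15K left is assumed.
print(cost_summary(45_000, 2_000, 15_000))
```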

#### 6. Model Fallback Suggestions

When quota for one model is exhausted, suggest available alternatives:

> *“Gemini 2.5 Pro quota reached. Gemini 2.0 Flash is still available and works well for code review, single-file edits, and Q&A.”*
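One way to pick the suggestion is to walk a preference-ordered list and return the first model whose quota is not exhausted. This is a sketch under that assumption; the function name is hypothetical and the model names are taken from the example message.

```python
from typing import Optional


def suggest_fallback(preferred: list[str], exhausted: set[str]) -> Optional[str]:
    """Return the first preferred model that still has quota, or None."""
    for model in preferred:
        if model not in exhausted:
            return model
    return None


preferred = ["Gemini 2.5 Pro", "Gemini 2.0 Flash"]
fallback = suggest_fallback(preferred, exhausted={"Gemini 2.5 Pro"})
if fallback is not None:
    print(f"Gemini 2.5 Pro quota reached. {fallback} is still available.")
```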

#### 7. Task-Level Token Tracking

Allow developers to see which tasks consumed the most tokens, so they can learn which patterns to avoid:

> *“Your heaviest task today was ‘Full codebase audit’ (38K tokens). Consider using the /audit workflow with targeted scope next time.”*
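A minimal ledger for this kind of tracking could look like the sketch below. It assumes the platform reports a token count per request along with a task label; the class and method names are hypothetical, and all figures except the 38K audit example are made up.

```python
from collections import defaultdict
from typing import Optional


class TaskTokenLedger:
    """Accumulate token usage per task and report the heaviest one."""

    def __init__(self) -> None:
        self._totals: dict[str, int] = defaultdict(int)

    def record(self, task: str, tokens: int) -> None:
        self._totals[task] += tokens

    def heaviest(self) -> Optional[tuple[str, int]]:
        if not self._totals:
            return None
        return max(self._totals.items(), key=lambda item: item[1])


ledger = TaskTokenLedger()
ledger.record("Full codebase audit", 30_000)
ledger.record("Fix login bug", 6_000)
ledger.record("Full codebase audit", 8_000)

task, tokens = ledger.heaviest()
print(f"Your heaviest task today was '{task}' ({tokens // 1000}K tokens).")
```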

## Summary

| Layer | What Exists Today | What’s Needed |
|---|---|---|
| **Agent behavior** | :white_check_mark: Token Economy skill, lean patterns, checkpoints, 3-strike rule | Agents need quota visibility from the platform |
| **Developer decisions** | :white_check_mark: Planning vs Fast mode guidance, workflow turbo execution | Developers need usage dashboards |
| **Error handling** | :cross_mark: Generic error + retry button | Specific quota error messages, retry warnings |
| **Quota visibility** | :cross_mark: None until hard failure | Usage bar, warnings at 90%, post-conversation summaries |
| **Agent quota awareness** | :cross_mark: Agents are blind to quota | System-level signal (`quota_remaining: low`) |

The developer-side mitigations are live and working. They reduce consumption meaningfully — but they’re a workaround for the absence of platform-level transparency. The UX fixes described above would take Antigravity from a tool that punishes developers for not knowing their limits to one that partners with them to use tokens wisely.

*This document was produced collaboratively between a developer and Antigravity (Gemini) during a real working session where token quota management was identified as a critical usability issue affecting daily productivity.*

I have the problem too. It previously reset every 5 hours, but this is the first time I used up all my quota with Opus, and I got locked out for 105 hours.

I turned on Antigravity today and all of a sudden 3.1 Pro and Claude are exhausted, even though they were at 60% when I turned off the PC yesterday.

Same issue. I went to bed with my Pro quota looking good; this morning it was on a 5-day cooldown.

Exactly the same issue here. How do we solve this? Can anyone from Google help?

Same here—it usually resets after 5 hours, but now it’s been 5 days! Please fix this; otherwise, there’s no point in purchasing a Pro subscription

I am having the exact same issue. I hope we can get a fix as soon as possible, or at least some clarification.

Also, if any Google employees are seeing this, check out post #10 in this thread by samiralibabic.

Same here, 5-day lockdown. Incredibly annoying, and it makes Antigravity pretty much unusable.

Did you fix it, or is the issue still there?

Did you solve this, or is the issue still there?

Still the same. The Claude quota still needs a hundred hours to refresh.

I used my Opus quota today for the first time in a while, and after five hours it was only restored to 60%. So I no longer get even two full quotas a week.

I am experiencing the same issue as a Google AI Pro subscriber.

During a coding workflow using Gemini 3.1 Pro, the quota was suddenly exhausted and the cooldown jumped to multiple days.

Current Model Quota panel shows:
• Gemini 3.1 Pro (High) → ~94 hour refresh
• Gemini 3.1 Pro (Low) → ~94 hour refresh
• Claude Sonnet 4.6 → ~98 hour refresh
• Claude Opus 4.6 → ~98 hour refresh

However, I haven’t used any Claude models in this session.

The session involved an SRS-driven development workflow where Gemini was executing phased implementation tasks and pushing results to GitHub.

The quota was exhausted around Phase 4 after roughly one hour of usage.

Since this is a Google AI Pro plan, the multi-day cooldown appears inconsistent with the documented limits.

Last night there were 2 hours left until the Gemini model reset; this morning there are 129 hours left.
You’ve got to be kidding me, I have a pro plan.
Time to say goodbye…

Same issue here, bumping this up +1

Same issue: [BUG AG-859] Pro Subscriber 140-hour Lockout. I’m a paying Pro subscriber in Scotland, and the support portals are returning 404 errors as well. Requesting an urgent response, please.

I’ve been using Antigravity for 2 days, and the quota even for Gemini Pro will reset in 5 days??? You must be kidding. Completely useless, I’m totally dissatisfied.

We appreciate you bringing your concerns regarding the quota to our attention. Please be assured that I have shared this feedback with our internal team.

As we balance giving the best possible quotas and maintaining fairness between users, especially under incredible demand, we’ve established weekly limits for all models. These limits do not apply to Google AI Ultra, which continues to be the best plan for power developers! See Google Antigravity for the latest.
