[Bug] MODEL_CAPACITY_EXHAUSTED (503) on Workspace AI Ultra accounts only - evidence this is a routing/auth bug, not genuine capacity exhaustion

Summary

The Antigravity agent terminates with MODEL_CAPACITY_EXHAUSTED (HTTP 503) on Google Workspace AI Ultra accounts for roughly 19 of every 24 hours. Community reports indicate that personal Gmail accounts on the same Ultra tier keep working during the same time windows. This disparity points to a backend bug in how Workspace accounts are routed or authenticated, not to genuine server capacity exhaustion.


Why This Is a Bug, Not a Capacity Problem

If server capacity for a model like claude-opus-4-6-thinking were genuinely exhausted, accounts on the same tier should be affected equally, whether they are personal or Workspace accounts. That is not what is happening:

| Account Type | Behavior During Peak Hours |
|---|---|
| Personal Gmail + AI Ultra | Works: agents complete tasks |
| Workspace + AI Ultra | Fails: 503 MODEL_CAPACITY_EXHAUSTED on nearly every invocation |

This Workspace-only failure pattern is the smoking gun for a bug. Here’s what’s likely going wrong on the backend:

1. Separate capacity pool routing
Workspace accounts likely route through an enterprise/organizational API path that terminates at a different (smaller) capacity pool than personal accounts. The capacity isn’t exhausted globally — it’s exhausted in the Workspace-specific pool because it was provisioned for lower traffic. Personal accounts hit the main production pool, which has sufficient capacity.

2. Authentication chain overhead → timeout → misclassified error
Workspace accounts have additional auth complexity — organizational policies, admin-enforced OAuth scopes, domain-level restrictions. This auth chain adds latency to every request. Under load, this extra latency may cause the request to exceed an internal timeout window, and the backend returns “capacity exhausted” as a fallback error rather than the actual root cause (auth timeout, org-policy evaluation delay, or token refresh failure).

3. Tier/subscription resolution failure
When the backend determines what priority tier a user belongs to, Workspace accounts resolve through a different identity path than personal accounts. The tier lookup may be failing silently for Workspace users — the system can’t confirm the user is Ultra, so it defaults to treating them as lower priority. Under any load at all, these “unknown tier” requests get rejected first.

Any one of these would produce exactly the symptom pattern we’re seeing: Workspace fails while personal works, the failure is time-of-day correlated (load-dependent), and the error message says “capacity” when the real problem is routing, auth, or tier resolution.
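
As a rough illustration of hypothesis 3, here is a toy Python model of priority-based admission. Everything in it is hypothetical: the tier names, the priority values, and the `admit` function are my own stand-ins, not anything known about the actual backend. It only shows how a silent tier lookup failure could reject Workspace requests under load while still admitting them off-peak.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical priority table; the real backend's tiers are unknown.
TIER_PRIORITY = {"ultra": 2, "pro": 1, "unknown": 0}


@dataclass
class Request:
    account_type: str             # "personal" or "workspace"
    resolved_tier: Optional[str]  # None models a silent tier lookup failure


def admit(requests, capacity):
    """Admit up to `capacity` requests, highest resolved priority first."""
    def priority(req):
        # A failed lookup falls back to the lowest-priority "unknown" tier.
        return TIER_PRIORITY[req.resolved_tier or "unknown"]

    ranked = sorted(requests, key=priority, reverse=True)
    return ranked[:capacity], ranked[capacity:]


# Both users pay for Ultra, but the Workspace lookup fails silently.
requests = [Request("workspace", None), Request("personal", "ultra")]

# Peak hours: one slot available, so the unresolved request is rejected.
peak_admitted, peak_rejected = admit(requests, capacity=1)

# Off-peak: enough slots for everyone, so the failure is invisible.
offpeak_admitted, offpeak_rejected = admit(requests, capacity=2)
```

In this sketch the Workspace request is rejected only when capacity is tight, which would match a failure that tracks time of day even though the underlying defect is in tier resolution.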


Error Details

HTTP 503 Service Unavailable
Model: claude-opus-4-6-thinking
Reason: MODEL_CAPACITY_EXHAUSTED
Endpoint: cloudcode-pa.googleapis.com

Trace IDs:

  • Trajectory ID: fd29fb59-8544-4a9d-a5ab-13aa0959c
  • TraceID: 0x5daa1b82d9eac4
  • X-Cloudaicompanion-Trace-Id: 5daa1b82d9eac480
  • Timestamp: 2026-04-16T14:23:23 GMT

Full error payload:

{
  "error": {
    "code": 503,
    "details": [
      {
        "@type": "type.googleapis.com/google.rpc.ErrorInfo",
        "domain": "cloudcode-pa.googleapis.com",
        "metadata": {
          "model": "claude-opus-4-6-thinking"
        },
        "reason": "MODEL_CAPACITY_EXHAUSTED"
      }
    ],
    "message": "No capacity available for model claude-opus-4-6-thinking on the server",
    "status": "UNAVAILABLE"
  }
}
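
For anyone triaging similar reports, the fields that matter can be pulled out of this payload with a few lines of Python. This is a minimal sketch that assumes the payload always has the shape shown above; `summarize_error` is a helper name I made up, not part of any Google client library.

```python
import json

# The payload exactly as returned in the failure reported above.
PAYLOAD = """
{
  "error": {
    "code": 503,
    "details": [
      {
        "@type": "type.googleapis.com/google.rpc.ErrorInfo",
        "domain": "cloudcode-pa.googleapis.com",
        "metadata": {
          "model": "claude-opus-4-6-thinking"
        },
        "reason": "MODEL_CAPACITY_EXHAUSTED"
      }
    ],
    "message": "No capacity available for model claude-opus-4-6-thinking on the server",
    "status": "UNAVAILABLE"
  }
}
"""


def summarize_error(payload: str) -> dict:
    """Extract the code, status, reason, model, and domain from the error."""
    error = json.loads(payload)["error"]
    # Pick the first ErrorInfo detail, or an empty dict if none is present.
    info = next(
        (d for d in error.get("details", [])
         if d.get("@type", "").endswith("google.rpc.ErrorInfo")),
        {},
    )
    return {
        "code": error.get("code"),
        "status": error.get("status"),
        "reason": info.get("reason"),
        "model": info.get("metadata", {}).get("model"),
        "domain": info.get("domain"),
    }


summary = summarize_error(PAYLOAD)
```

Logging `summary` alongside the account type and a timestamp for each attempt would make it easier to compare Workspace and personal accounts side by side.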

Timeline

| Period | Workspace Account Usability |
|---|---|
| ~2 weeks ago | Completely unusable: 503 on every attempt, 24/7 |
| ~3 days ago | Still completely unusable |
| Last 2–3 days | Slight improvement: works during ~1:00–6:00 AM PKT (off-peak). Fails all other hours. |
| Today (16 Apr 2026) | Same: ~5-hour usability window at 1 AM. Premium models fail 19 hours/day. |
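
For readers in other time zones, the arithmetic behind the window can be checked with a short Python sketch. It assumes PKT is a fixed UTC+5 offset and treats the 1:00 to 6:00 AM boundaries as exact, although they are only approximate in practice. The function names are mine.

```python
from datetime import date, datetime, time, timedelta, timezone

# Assumption: PKT is a fixed UTC+5 offset (Pakistan currently observes no DST).
PKT = timezone(timedelta(hours=5), "PKT")

# Usable window from the timeline, in local PKT time (approximate in reality).
WINDOW_START = time(1, 0)
WINDOW_END = time(6, 0)


def in_usable_window(ts: datetime) -> bool:
    """Return True if an aware timestamp falls inside the 1:00-6:00 PKT window."""
    local = ts.astimezone(PKT).time()
    return WINDOW_START <= local < WINDOW_END


def window_in_utc(day: date) -> tuple[datetime, datetime]:
    """Convert the PKT window on a given local day to UTC datetimes."""
    start = datetime.combine(day, WINDOW_START, tzinfo=PKT)
    end = datetime.combine(day, WINDOW_END, tzinfo=PKT)
    return start.astimezone(timezone.utc), end.astimezone(timezone.utc)


# Timestamp of the failure listed under Error Details (GMT treated as UTC).
failure = datetime(2026, 4, 16, 14, 23, 23, tzinfo=timezone.utc)
failure_in_window = in_usable_window(failure)  # 19:23:23 PKT, so False

start_utc, end_utc = window_in_utc(date(2026, 4, 16))
usable_hours = (end_utc - start_utc) / timedelta(hours=1)   # 5.0
failing_hours = 24 - usable_hours                           # 19.0
```

The window works out to 20:00 through 01:00 UTC, and the traced failure at 14:23:23 GMT (19:23:23 PKT) falls in the evening, well outside it.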

The improvement over the past 2–3 days is appreciated — the team has clearly done something. But going from “100% broken” to “works at 1 AM” isn’t where this needs to be for a paid product.


Bonus: Minor UI Bug in Agent Execution Log

While documenting this, I also noticed a UI issue in the agent execution log when the agent terminates:

  1. Duplicate error text rendering: The collapsible “Error” header shows the full error message in bold, and when expanded, the sub-line shows the exact same text again in normal weight. The expanded content should show additional detail (stack trace, error code, trace ID), not repeat the header verbatim.

  2. Misleading “Worked for 1s” label: When the agent crashes immediately, the status reads “Worked for 1s.” The agent didn’t “work” — it failed. This should read “Failed after 1s” or “Terminated after 1s.” “Worked” implies successful execution.


What I’m Asking For

  1. Investigate the Workspace routing path. The personal-vs-Workspace disparity isn’t a capacity problem — it’s a bug. The trace IDs above should help locate where Workspace requests diverge from personal ones on the backend.
  2. Fix tier resolution for Workspace accounts. Ensure Workspace Ultra subscriptions are correctly recognized and given the same priority as personal Ultra.
  3. Return accurate error messages. If the real failure is an auth timeout or a tier lookup failure, surface that instead of “capacity exhausted.”
  4. Don’t count failed requests against quota. Multiple users report that 503 failures still consume daily quota allocation.

Acknowledgment

The improvement over the last few days is real and noticed. The team is clearly working on this. I’m filing this because I want Antigravity to succeed — but for those of us on Workspace accounts, it needs to work during actual working hours, not just at 1 AM.

Environment: Windows, Antigravity IDE (VS Code), Google Workspace AI Ultra, Pakistan (UTC+5)
