Persistent MODEL_CAPACITY_EXHAUSTED on claude-opus-4-6-thinking, Ultra Plan Unusable Since Day 1

,

Plan: Antigravity Ultra
Model: claude-opus-4-6-thinking
Issue since: Day 1 of plan activation
Severity: Critical: Plan is non-functional for its primary use case

Root Cause (confirmed via debug info)

Every agentic session fails with a hard capacity error from Google’s Cloud AI Companion backend:

Error Code: 503
Reason: MODEL_CAPACITY_EXHAUSTED
Domain: cloudcode-pa.googleapis.com
Message: No capacity available for model claude-opus-4-6-thinking on the server
Status: UNAVAILABLE

This is NOT a user-side rate limit or quota exhaustion. The cloudcode-pa.googleapis.com backend has no available infrastructure capacity for claude-opus-4-6-thinking during sessions.

Debug Info

Trajectory ID: 098f0cdb-8324-4d98-844f-487b42de0...
TraceID: 0x71dda22e8e6be..
X-Cloudaicompanion-Trace-Id: 71dda22e8e6bec2e
Date: Fri, 29 May 2026 11:14:57 GMT
Server-Timing: gfet4t7; dur=13084 (backend waited ~13s trying to acquire capacity before returning 503)

Observed Failure Pattern

Worked for 12s → Worked for 1s → Error: high traffic / 503
Worked for 29s → Worked for 1s → Error: high traffic / 503
Worked for 15s → Worked for 1s → Error: high traffic / 503
Worked for 12s → Worked for 1s → Error: high traffic / 503

The ~13s Server-Timing duration directly explains the “Worked for Xs” pattern, the backend holds the connection trying to acquire capacity, then fails. This happens on every single agentic turn, every session, since day 1 of plan activation.

Steps to Reproduce
Use Antigravity Ultra plan with Claude Opus 4.6 (Thinking) model

  1. Start any multi-step agentic task (file editing, bash execution, code generation)
  2. Observe: agent works briefly, then terminates with “high traffic” / 503 error
  3. Retry → same failure within 1–29 seconds
  4. Repeat indefinitely — no session has ever completed successfully

Expected Behaviour

Ultra plan should guarantee sufficient backend capacity allocation to complete agentic sessions, particularly for the flagship model advertised as a core part of the plan.

Actual Behaviour

Google’s cloudcode-pa backend returns MODEL_CAPACITY_EXHAUSTED on every request after initial processing begins. The Ultra plan receives no priority routing or capacity reservation compared to lower tiers.

Community Impact

This is not an isolated issue. A search for “MODEL_CAPACITY_EXHAUSTED Google Antigravity” shows multiple open threads with identical errors:

  • “[Bug] Account-specific 503 MODEL_CAPACITY_EXHAUSTED only on claude-opus-4-6-thinking” (11d ago)

  • “[Bug] Persistent HTTP 503 MODEL_CAPACITY_EXHAUSTED loop on specific account” (29d ago)

  • “Google Antigravity - HTTP 503 MODEL_CAPACITY_EXHAUSTED on Ultra plan for 1 week” (26d ago)

  • “Antigravity Claude Opus 4.6 Thinking — persistent 503 MODEL_CAPACITY_EXHAUSTED” (ongoing)

This has been reported for nearly a month with no resolution. Ultra plan subscribers are paying for a model that is structurally unavailable on Google’s infrastructure.

Requests

  1. Immediate escalation to Google infrastructure team for cloudcode-pa capacity allocation on claude-opus-4-6-thinking

  2. Official acknowledgement and status update from the Antigravity team

  3. Prorated refund or plan extension credit for the full affected period for all impacted Ultra plan users

  4. ETA on when Ultra plan users can reliably use this model

1 Like