Ultra user with available quota still getting 503 MODEL_CAPACITY_EXHAUSTED — how is this acceptable?

I am a paying Google AI Ultra user, and I just received a server-side 503 with:

reason: MODEL_CAPACITY_EXHAUSTED
message: No capacity available for model claude-opus-4-6-thinking on the server

Trajectory ID: 7f734421-74b7-4e5e-8708-b4fb33a9…
TraceID: 0xb9462ef643f3..

The key issue is that my quota was still available.

So what exactly is Ultra supposed to mean in practice if a user can still have remaining quota and be unable to use the model because Google has no capacity? Quota that cannot actually be exercised is not meaningful access.

This is no longer just about temporary instability. It is about a paid premium plan failing at the point of delivery while customers are still being told they have top-tier access.

I would appreciate an official response that addresses the substance of the problem:

  • Why does this happen to Ultra users with available quota?

  • What practical benefit does “prioritised traffic” provide if the model can still be unavailable at request time?

  • Will affected users receive refunds or service credits for repeated capacity failures?

Please do not reduce this to generic troubleshooting, because this is clearly a server-side capacity issue.


I strongly agree with this. I’m also paying $250 a month, and prioritized traffic doesn’t seem to mean anything at all.

Like, what a joke. It’s not even funny at this point. I can’t make progress on any of my development. At this price point, the steep, undisclosed quota cuts and the server outages, when we are supposed to get prioritized traffic, make me feel extremely disappointed and disrespected.