This 503 issue is getting out of hand. It feels like a 24/7 problem now, not just during peak hours.
The core issue seems to be a complete lack of queue management. Instead of holding the request in a queue for a few seconds, the API just drops the connection and throws a 503. This completely breaks the workflow. And the token waste is real—even if we just prompt “continue” after a fail, the system prompt and history are still sent and re-processed. Even with context caching, we’re taking a hit on cache read tokens for every single retry.
To make matters worse, paying for the Ultra tier doesn’t seem to give any priority routing. We’re hitting the exact same 503 walls, making the service practically unusable right now. They really need to implement a standard queuing system instead of just dropping requests.
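Until something like that exists server-side, one client-side workaround is to retry with exponential backoff and jitter instead of immediately re-prompting. Below is a minimal sketch in Python, not anything from the official client: `send_request`, `ServiceUnavailable`, and `call_with_backoff` are all hypothetical names, and you would need to map your own client's 503 error onto the exception. It does not fix the token cost, since every retry still resends the full prompt and history.

```python
import random
import time


class ServiceUnavailable(Exception):
    """Hypothetical exception standing in for an HTTP 503 from the API."""


def call_with_backoff(send_request, max_attempts=5, base_delay=1.0,
                      max_delay=30.0, sleep=time.sleep):
    """Call send_request(), retrying on ServiceUnavailable with backoff.

    send_request is an assumed zero-argument callable that performs one
    API call and raises ServiceUnavailable when the server answers 503.
    """
    for attempt in range(max_attempts):
        try:
            return send_request()
        except ServiceUnavailable:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the 503 to the caller
            # Exponential backoff capped at max_delay, with full jitter
            # so many clients do not all retry at the same instant.
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, delay))
```

With the defaults this waits up to 1 s, 2 s, 4 s, and 8 s between the five attempts, which roughly imitates the "hold it for a few seconds" behavior the API does not provide.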
Yep, and the situation keeps getting worse with no sign of improvement. My credit limit was being used up very quickly, so I created another Google account to buy Ultra, and then Google restricted my new Ultra account… What on earth is going on???
I’m really tired… I’ve temporarily switched to OpenAI. I subscribed to the basic Plus plan, and it’s been very reliable, which is how it should be. That is nothing like my Ultra subscription, where my limit was reached within an hour, accompanied by constant service unavailability…
It has been getting worse again since yesterday, and I have been subscribed to the Ultra plan for months.
I have to finish my project today. My client won’t accept it if I tell them I’m stuck right now and they have to wait for me for a while.
I am paying for Google AI Ultra. Absolutely unusable service. They should pay me for the frustration at this point. I am not supposed to worry about this BS. I paid for a functional service.
"code": 503,
"details": [
  {
    "@type": "type.googleapis.com/google.rpc.ErrorInfo",
    "domain": "cloudcode-pa.googleapis.com",
    "metadata": {
      "model": "gemini-3-flash-agent"
    },
    "reason": "MODEL_CAPACITY_EXHAUSTED"
  }
],
"message": "No capacity available for model gemini-3-flash-agent on the server",
"status": "UNAVAILABLE"
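If you are scripting around this, it may help to tell a capacity failure apart from other 503s before deciding whether to wait and retry. Here is a rough Python sketch that checks the fields shown in the payload above. It assumes the fragment sits inside a JSON object that you have already parsed into a dict; `is_capacity_error` is a made-up helper name, and the real response may nest these fields differently.

```python
def is_capacity_error(error: dict) -> bool:
    """Return True if the parsed error looks like a model capacity failure.

    Assumes `error` has the "code" and "details" fields shown in the
    payload above; any other wrapping is the caller's job to unwrap.
    """
    if error.get("code") != 503:
        return False
    for detail in error.get("details", []):
        # Each detail is expected to be an object carrying a "reason" string.
        if isinstance(detail, dict) and detail.get("reason") == "MODEL_CAPACITY_EXHAUSTED":
            return True
    return False
```

A plain 503 without this reason could be something else entirely, so it is probably worth logging those separately rather than retrying blindly.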
I have realized that the issue only affects accounts with a subscription… or accounts connected to a family with a subscription.
I have used two accounts with no subscription at all, and they have been working fine with Gemini 3.1 Pro… but the moment I switch back to my subscription accounts, both Ultra and Pro… or any other Google One plan, it starts to give this traffic error.
We’re finding new ways to share more capacity with the Antigravity community. During windows where we have unused capacity, currently in the late afternoon PST, you’ll notice that your baseline quota stretches further. You’ll get more requests and less downtime within your existing plan.
As global demand shifts, we’ll continue to adjust these windows to pass any available capacity back to you. We’re committed to expanding access and ensuring every developer has the capacity they need to keep building.