[Bug/Help] Constant 503 "MODEL_CAPACITY_EXHAUSTED" for Pro models, but Flash works fine? Quota UI inconsistency

Hi everyone, I’ve encountered a persistent issue with the Antigravity IDE and wanted to see if anyone has a solution or if the team could look into it.

The Issue: I’m working on a large codebase. Over the last few days, every request using an advanced model (Gemini 3.1 Pro (High) or Claude) has failed immediately with a 503 Service Unavailable: MODEL_CAPACITY_EXHAUSTED error.

What’s strange:

  1. Quota UI shows full: My Antigravity dashboard shows the quota for these models as fully available, and I still have AI Credits. The UI implies I should be able to use them.

  2. Flash works perfectly: If I switch the model to Gemini 3 Flash in the IDE, it works instantly without any 503 errors.

  3. Local environment is clean: I have completely cleared all local caches (globalStorage, workspaceStorage, Windows Credentials, etc.) and tried different networks. The issue persists.

  4. Colleague’s account works: A colleague of mine logged into their account on my exact machine/project, and their Pro models worked flawlessly.

My Question: It seems like my specific account has hit some sort of undocumented long-term backend rate limit (possibly a fair-use policy, FUP?) that isn’t reflected in the front-end Quota UI.

  1. Is there a monthly rolling limit for large-context indexing that isn’t shown on the dashboard?

  2. If an account hits this limit, is it standard behavior to fall back to Flash while returning 503s for Pro models?

  3. Are there best practices or .antigravityignore configurations recommended for large projects to prevent hitting these limits too quickly?
