Antigravity caps all models at 64k output tokens, but Claude Opus natively supports 128k

I ran some tests today to figure out why my Claude Opus sessions were truncating more often than usual. Both Gemini 3.1 Pro and Claude Opus hit the same hard ceiling in the IDE: 65,535 output tokens.

For Gemini, that is basically the model’s full capacity. Gemini 3.1 Pro maxes out at 65,536 output tokens natively, so the IDE is giving it everything.

For Claude Opus, that is half. Both Opus 4.6 and the new 4.7 support 128,000 output tokens through the Anthropic API. Inside Antigravity, they are capped at 64k. That is a lot of capability left on the table, especially for large refactors and multi-file generation where Opus would otherwise complete in one pass but instead gets killed mid-output.
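To make the gap concrete, here is a quick sketch of the effective ceilings. The native limits and the 65,535 IDE cap are the numbers from my tests above; the clamping logic itself is an assumption about what the IDE effectively does, not its actual code:

```python
# Hypothetical illustration of the effective per-model output ceiling.
# The 65,535 IDE cap and the native limits come from the observations
# above; the min() clamp is an assumption, not Antigravity's real code.

IDE_OUTPUT_CAP = 65_535

NATIVE_OUTPUT_LIMITS = {
    "gemini-3.1-pro": 65_536,
    "claude-opus-4.6": 128_000,
    "claude-opus-4.7": 128_000,
}

def effective_limit(model: str) -> int:
    """Output tokens the model can actually emit inside the IDE."""
    return min(NATIVE_OUTPUT_LIMITS[model], IDE_OUTPUT_CAP)

for model, native in NATIVE_OUTPUT_LIMITS.items():
    unused = native - effective_limit(model)
    print(f"{model}: {unused} native output tokens unreachable")
```

Run that and Gemini loses a single token, while both Opus versions leave 62,465 tokens of native output capacity unreachable, which matches the "half the model" observation.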

I also noticed a 1,024-token cap on thinking traces. Opus supports up to 128k thinking tokens natively. A 1,024-token ceiling on reasoning is extremely restrictive for complex agentic workflows where the model needs to plan across multiple files and dependencies.
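For comparison, this is roughly what requesting a larger thinking budget looks like against the Anthropic Messages API directly. This is a sketch only: the model id string, the prompt, and the specific budget values are assumptions for illustration, and the payload is built without actually sending a request:

```python
# Sketch of an Anthropic Messages API payload with extended thinking.
# Model id and budget values are illustrative assumptions; a real call
# would pass this payload to anthropic.Anthropic().messages.create().

def build_request(budget_tokens: int, max_tokens: int = 128_000) -> dict:
    # Thinking tokens count against the overall output allowance, so
    # max_tokens must be larger than the thinking budget.
    assert max_tokens > budget_tokens, "thinking budget must fit in max_tokens"
    return {
        "model": "claude-opus-4-6",  # assumed id string
        "max_tokens": max_tokens,
        "thinking": {"type": "enabled", "budget_tokens": budget_tokens},
        "messages": [{"role": "user", "content": "Refactor these modules."}],
    }

ide_style = build_request(budget_tokens=1_024)      # what the IDE appears to send
full_budget = build_request(budget_tokens=100_000)  # closer to native capacity
```

Note that 1,024 is also the minimum `budget_tokens` the API accepts for extended thinking, so the IDE appears to be pinning reasoning at the floor rather than somewhere in the usable range.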

Two questions for the team:

  1. Can the Claude output cap be raised to match the model’s native 128k, or at minimum brought closer to it? The infrastructure already handles 64k payloads for Gemini, so the parser is not the bottleneck.
  2. Was Opus 4.7 deployed in the IDE recently? I am seeing more frequent truncations starting today, which would make sense if 4.7 rolled out and its higher verbosity is colliding with the same 64k ceiling faster than 4.6 did.