Since last night, Claude Opus 4.6 in Antigravity has been hitting “The model’s generation exceeded the maximum output token limit” on tasks that were completing fine before.
What actually happens is worse than a simple cutoff. The model starts generating its response, hits the limit partway through, sees the error, and then tries a different approach to fit within the cap. That fails too. So it retries again and again until it eventually has to strip down the entire response just to squeeze under the limit. The output you end up with is a degraded version of what the model originally intended. Opus 4.6 natively supports 128k output tokens, but Antigravity now seems to be capping it at around 64k.
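To make the retry spiral concrete, here is a rough sketch of the behavior as I understand it. This is not Antigravity's code. The ~64k cap is my estimate from above, and `attempts_until_fit` and `shrink_percent` are made-up names with an assumed 20% trim per retry, just to show how much of the original plan gets thrown away.

```python
# Hypothetical model of the retry loop described above.
# OBSERVED_CAP is an estimate, not a documented Antigravity setting.
OBSERVED_CAP = 64_000


def attempts_until_fit(planned_tokens, cap=OBSERVED_CAP, shrink_percent=80):
    """Return the planned size of each attempt until one fits under the cap.

    shrink_percent is an assumed value: how much of the previous plan the
    model keeps on each retry after seeing the limit error.
    """
    if not 0 < shrink_percent < 100:
        raise ValueError("shrink_percent must be between 1 and 99")
    sizes = [planned_tokens]
    # Every attempt above the cap fails and triggers a smaller retry.
    while sizes[-1] > cap:
        sizes.append(sizes[-1] * shrink_percent // 100)
    return sizes


sizes = attempts_until_fit(100_000)
failed = len(sizes) - 1
print(f"attempt sizes: {sizes}")
print(f"{failed} failed attempts, final output is {sizes[-1] * 100 // sizes[0]}% of the plan")
```

Under these assumptions, a task that wanted 100k tokens burns two full failed generations before it lands on something that fits, and anything planned at or under the cap goes through on the first try.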
The timing is what got me curious. Opus 4.7 launched on April 16 and every other major coding tool already has it. Antigravity still shows 4.6 in the model picker. But 4.6 getting capped right after 4.7 goes live everywhere else feels like backend prep work: config changes ahead of swapping in the new model.
Calling it: 4.7 is about to drop in the IDE. Anyone else seeing the same cap since last night?