Google hasn’t just slashed quotas for Claude Opus on the Ultra plan; they’ve effectively lobotomized the model. It’s now incapable of analyzing more than a few paragraphs. Google has capped max_thinking_length at 1,024 tokens on the server side, which essentially turns Claude into useless junk. The theoretical limit for this parameter is 128,000 tokens; you need at least 32k for any kind of adequate performance.
Here is the current state of the Ultra plan (I won’t even mention the other tiers—those are basically a scam at this point):
The bait-and-switch: Documentation was stealthily updated, changing “no weekly limits” to “Highest weekly rate limits” without any announcement.
Quota Gutting: Based on my usage, quotas have been slashed by about 5x. You get 1 hour of work followed by a 4-hour cooldown.
Claude models are not just restricted; they are functionally useless for professional tasks now.
Gemini 3.1 Pro remains “dumber” than even a nerfed Claude capped at 1,024 thinking tokens.
Support is a ghost town: Sending feedback via the app is broken, and getting a response from support is impossible.
Confirmed.
I was wondering why my Opus feels dumb.
My current max_thinking_length is 1,024 tokens. This is relatively low and is not ideal for complex thinking tasks that require deep reasoning, multi-step analysis, or working through intricate logic chains.
For context:
1,024 tokens ≈ 750 words of internal reasoning
This is sufficient for straightforward tasks (simple edits, lookups, direct answers)
It can be limiting for tasks requiring extended chain-of-thought reasoning, such as:
Complex architectural decisions
Multi-file refactoring planning
Debugging intricate logic across multiple modules
Detailed code analysis with many interacting components
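For anyone who wants to check whether the budget is clamped on their own account, the thinking budget is an explicit request parameter in the public Anthropic Messages API. A minimal sketch, assuming access via the standard `anthropic` Python SDK; whether the server actually honors the requested budget on this plan is exactly what's in question, and the model name in the comment is only an example:

```python
# Sketch: build a Messages API request that asks for a 32k-token
# extended-thinking budget. The API requires max_tokens to be larger
# than the thinking budget, since thinking counts against it.

def thinking_params(budget_tokens: int, max_tokens: int) -> dict:
    """Return request kwargs enabling extended thinking with a given budget."""
    if budget_tokens >= max_tokens:
        raise ValueError("budget_tokens must be less than max_tokens")
    return {
        "max_tokens": max_tokens,
        "thinking": {"type": "enabled", "budget_tokens": budget_tokens},
    }

params = thinking_params(budget_tokens=32_000, max_tokens=40_000)

# Usage (requires an API key, so not run here):
# import anthropic
# client = anthropic.Anthropic()
# resp = client.messages.create(
#     model="claude-opus-4-20250514",  # example model id
#     messages=[{"role": "user", "content": "Refactor this module..."}],
#     **params,
# )
# Comparing the thinking content you get back against the requested
# budget is one way to see whether a server-side cap is in effect.
```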