1024 Tokens is Not "Thinking": How Google Stealthily Killed the Ultra Plan for Developers

Google hasn’t just slashed quotas for Claude Opus on the Ultra plan; they’ve effectively lobotomized the model. It’s now incapable of analyzing more than a few paragraphs. Google has capped the max_thinking_length at 1,024 tokens on the server side, which essentially turns Claude into useless junk. The theoretical limit for this parameter is 128,000; you need at least 32k for any kind of adequate performance.

Here is the current state of the Ultra plan (I won’t even mention the other tiers—those are basically a scam at this point):

  1. The bait-and-switch: Documentation was stealthily updated, changing “no weekly limits” to “Highest weekly rate limits” without any announcement.

  2. Quota Gutting: Based on my usage, quotas have been slashed by about 5x. You get 1 hour of work followed by a 4-hour cooldown.

  3. Claude models are not just restricted; they are functionally useless for professional tasks now.

  4. Gemini 3.1 Pro remains “dumber” than a nerfed Claude with a 1,024 thinking limit.

  5. Support is a ghost town: Sending feedback via the app is broken, and getting a response from support is impossible.

All of this for $250 a month.

Bravo, Google

10 Likes

Exactly right. The recent quotas feel even lower than the previous Pro plan's.

How do we really know it was Google? Claude Code is very coy on this topic also.

Well, the API call goes to https://daily-cloudcode-pa.googleapis.com/v1internal:loadCodeAssist, so the parameter has to live somewhere on Google's servers.

Honestly, I don't care whether it's Google or Anthropic.
I paid for the Ultra top tier; I got free-tier quotas and thinking limits.

2 Likes

I think someone got nervous about this https://www.youtube.com/watch?v=1sd26pWhfmg&t=1261s

1 Like

Confirmed.
I was wondering why my Opus feels dumb.

My current max_thinking_length is 1,024 tokens. This is relatively low and is not ideal for complex thinking tasks that require deep reasoning, multi-step analysis, or working through intricate logic chains.
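For comparison, Anthropic's public Messages API exposes this budget directly as `thinking.budget_tokens`. A minimal sketch of what the request body looks like at different budgets; the field names follow Anthropic's public API docs, not Google's internal endpoint (which apparently calls it `max_thinking_length`), and the model name is illustrative:

```python
# Sketch: how an extended-thinking budget is requested via Anthropic's
# public Messages API. Field names follow the public API; Google's internal
# Cloud Code endpoint may name them differently (e.g. max_thinking_length).
def build_request(budget_tokens: int) -> dict:
    return {
        "model": "claude-opus-4",  # illustrative model name
        "max_tokens": 8192,
        # How many tokens the model may spend on internal reasoning
        # before it starts writing the visible answer.
        "thinking": {"type": "enabled", "budget_tokens": budget_tokens},
        "messages": [{"role": "user", "content": "Plan this refactor..."}],
    }

capped = build_request(1_024)     # the cap reported in this thread
adequate = build_request(32_000)  # the floor the OP considers usable
print(capped["thinking"]["budget_tokens"])    # 1024
print(adequate["thinking"]["budget_tokens"])  # 32000
```

The point is that the budget is a single server-side knob: nothing about the model changes, only how much reasoning it is allowed before answering.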

For context:

  • 1,024 tokens ≈ 750 words of internal reasoning

  • This is sufficient for straightforward tasks (simple edits, lookups, direct answers)

  • It can be limiting for tasks requiring extended chain-of-thought reasoning, such as:

    • Complex architectural decisions

    • Multi-file refactoring planning

    • Debugging intricate logic across multiple modules

    • Detailed code analysis with many interacting components
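The word estimates above can be sanity-checked with the common rule of thumb of roughly 0.75 English words per token (an approximation; the exact ratio depends on the tokenizer and the text):

```python
# Back-of-envelope conversion for the budgets discussed in this thread,
# using the rough heuristic of ~0.75 English words per token.
WORDS_PER_TOKEN = 0.75  # assumption; varies by tokenizer and content

def approx_words(tokens: int) -> int:
    return round(tokens * WORDS_PER_TOKEN)

print(approx_words(1_024))    # 768   -> the current cap, under a page of reasoning
print(approx_words(32_000))   # 24000 -> the "adequate" floor mentioned above
print(approx_words(128_000))  # 96000 -> the theoretical maximum
```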