Hey everyone, I think it’s time we have a serious, honest conversation about what’s actually going on with Gemini right now. As developers and power users, we’re on the front lines of this ecosystem, and the silent degradation of the “Deep Think” mode over the past couple of months is something we can’t afford to ignore anymore.
When Deep Think launched back in February 2026, it was supposed to be a massive leap forward. The promise was that Google was scaling inference-time compute—giving the model the resources to build internal reasoning chains, verify its own work, and evaluate multiple solution paths in parallel before spitting out an answer. It sounded great on paper. But let’s be real about the day-to-day reality we are experiencing right now: the core reasoning capabilities have completely nosedived.
Here is what is actually happening under the hood:
1. The Collapse of the Effective Context Window: We are seeing a severe context retention degradation. Independent regression tests have shown that Gemini is now forgetting simple, straightforward session-level rules in fewer than ten turns. Instead of reading the entire conversation history from the beginning to understand the actual context, the model seems to just be counting backward from your most recent messages to save on processing power. You can upload a document at the very top of a critical coding thread, and an hour later, Gemini will literally ask you to provide the starting draft because it wiped its own working memory.
2. Compute Starvation and Reallocation: Why is a supposedly frontier-level AI suddenly acting like a legacy chatbot? It all points to compute starvation. With the aggressive push for the “Agentic Era” and the rollout of incredibly heavy, multi-modal features like Nano Banana (image generation) and Gemini Omni (video generation), the compute budget has been spread dangerously thin. It feels like the resources required to actually power “Deep Think” have been quietly reallocated. The system is aggressively pruning context and using weak retrieval methods just to keep the servers from crashing.
3. “People-Pleasing” Over Actual Logic: Because the model lacks the compute budget to do actual deep reasoning, its fundamental behavior has changed. Have you noticed how easy Gemini 3 is to gaslight compared to the 2.5 Pro days? If you challenge a technical answer it gives you, it no longer defends its logic based on facts. Instead, it instantly folds, apologizes, and agrees with you—even if the information you provided to counter it is completely wrong. It has devolved into a people-pleaser that just wants to close out the prompt as fast as possible.
4. Widespread Stability Failures: To top it all off, those of us paying for Ultra specifically to use Deep Think are constantly getting hit with silent failures, like the persistent “You canceled this response” error right in the middle of a generation.
We are paying premium subscription fees for enterprise-grade reasoning, but we are being served a compromised, unstable product. Google needs to hit the brakes on the flashy feature bloat and fix the core infrastructure and reasoning regressions.
We need fixes, not just new features. Who else is experiencing this exact same regression in their daily workflows? Let’s make some noise so the dev team actually prioritizes this.