Hey everyone,
I’m a Pro/Ultra user and I need to raise a massive red flag about the current state of NotebookLM following the forced migration to the Gemini 3.1 Pro architecture around February 18-19.
Before the Lunar New Year (Tet), the system was quite stable and reliable for deep, multi-document research. Now it feels like the model has been completely lobotomized for real-world tasks. It’s hard not to conclude that the dev team optimized this update to hit high scores on synthetic reasoning benchmarks (like reaching 77.1% on ARC-AGI-2) while completely neglecting basic QA for messy, real-world RAG (Retrieval-Augmented Generation) workflows.
As a result, we are dealing with basic, borderline-silly technical failures. Here is a breakdown of the critical regressions:
1. Severe Source Blindness (Ingestion & Retrieval Failure)
I can clearly see my uploaded documents in the sidebar, but the AI actively gaslights me, claiming the documents don’t exist or that the requested content isn’t in them. It looks like a massive index mismatch or a vectorization failure caused by the backend update. There’s also a documented bug where files exceeding roughly 380k words silently fail to index, even though the official limit is 500k words. The only mitigation I’ve found is splitting big sources before upload; see the sketch below.
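A minimal, stdlib-only Python sketch of that pre-upload split. Assumptions up front: the ~380k-word failure point is community-reported, not confirmed, and the 300k chunk size is my own conservative margin, not an official number:

```python
# split_source.py -- hypothetical pre-upload workaround, not an official tool.
# Splits a large plain-text source into chunks safely under the word count
# where indexing reportedly fails silently (~380k words, per community reports).
import sys
from pathlib import Path

MAX_WORDS = 300_000  # conservative margin under the reported ~380k failure point

def split_source(path: str, max_words: int = MAX_WORDS) -> list[Path]:
    """Write word-bounded chunks next to the original file and return their paths."""
    words = Path(path).read_text(encoding="utf-8").split()
    out_paths = []
    for i in range(0, len(words), max_words):
        chunk = Path(path).with_suffix(f".part{i // max_words + 1}.txt")
        chunk.write_text(" ".join(words[i:i + max_words]), encoding="utf-8")
        out_paths.append(chunk)
    return out_paths

if __name__ == "__main__":
    for p in split_source(sys.argv[1]):
        print(f"wrote {p}")
```

Note this flattens whitespace, so it only suits plain-text sources; anything layout-sensitive would need paragraph-aware splitting.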
2. “Deep Reading” is Dead & The 2-Line Thinking Nerf
Even when the system acknowledges a source, it refuses to actually read it deeply. I’ve noticed that the “Thinking” process (Chain-of-Thought) has been severely throttled, down to exactly two lines for Pro/Ultra users. Because its “thinking budget” is artificially restricted to save compute, it skips the multi-step micro-drilldown needed to scan long PDFs and just spits out lazy, superficial summaries.
3. Hallucinations Over Grounding (Interpretive Drift)
NotebookLM’s entire selling point is that it’s strictly “source-grounded”. But right now, when the retrieval step fails, the AI refuses to just say “I don’t know.” Instead, it performs “coherence repair”: fabricating logical guesses based on its general training data or blending information from completely unrelated documents in my notebook. It’s acting exactly like a standard, hallucinating chatbot. I’ve resorted to spot-checking its direct quotes against my own files; a rough sketch of that check follows.
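The check is crude: pull every substantial quoted span out of the answer and fuzzy-match it against the sources. A stdlib-only sketch, with the caveats that the 0.85 threshold is arbitrary and this only catches invented “quotes”, not paraphrased fabrications:

```python
# grounding_check.py -- a rough manual sanity check, NOT how NotebookLM works internally.
# Flags quoted spans in an AI answer that don't fuzzily match any uploaded source.
import difflib
import re

def best_match_ratio(quote: str, source: str) -> float:
    """Slide a quote-sized window across the source and return the best fuzzy ratio."""
    q, s = quote.lower(), source.lower()
    window = max(len(q), 1)
    step = max(window // 2, 1)
    best = 0.0
    for start in range(0, max(len(s) - window, 0) + 1, step):
        best = max(best, difflib.SequenceMatcher(None, q, s[start:start + window]).ratio())
    return best

def flag_ungrounded_quotes(answer: str, sources: list[str], threshold: float = 0.85) -> list[str]:
    """Return quoted spans (20+ chars) from the answer that no source supports."""
    quotes = re.findall(r'["“”]([^"“”]{20,})["“”]', answer)  # straight or curly quotes
    return [q for q in quotes
            if not any(best_match_ratio(q, src) >= threshold for src in sources)]
```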
4. Broken Multilingual Retrieval (The Language Bias)
This has been an ongoing issue since launch, but it’s worse now. If I have a notebook with both English and Vietnamese sources and I prompt it in Vietnamese, it heavily biases towards the Vietnamese documents. The semantic embedding model just clusters the prompt with the Vietnamese vectors and completely ignores highly relevant English sources. This “cross-lingual token bleed” makes the tool practically useless for non-English speakers trying to research complex English documents. The clustering effect is easy to reproduce outside NotebookLM; see the demo below.
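NotebookLM’s actual embedding model isn’t public, so the open sentence-transformers model below is purely a stand-in and the exact scores will vary, but it shows how same-language similarity can dominate topical relevance:

```python
# crosslingual_bias_demo.py -- illustrative only; NotebookLM's embedder is not public,
# so this uses an open multilingual model as a stand-in. Scores vary by model.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")

# Vietnamese query: "How does the attention mechanism in Transformers work?"
query = "Cơ chế attention trong Transformer hoạt động như thế nào?"

docs = {
    # Highly relevant, but in English:
    "en_relevant": "Self-attention computes weighted sums of value vectors, where the "
                   "weights come from scaled dot products between queries and keys.",
    # Generic and off-topic, but in Vietnamese ("an overview of the history of AI"):
    "vi_generic": "Bài viết này giới thiệu tổng quan về lịch sử trí tuệ nhân tạo.",
}

q_emb = model.encode(query, convert_to_tensor=True)
for name, text in docs.items():
    score = util.cos_sim(q_emb, model.encode(text, convert_to_tensor=True)).item()
    print(f"{name}: {score:.3f}")

# If the generic Vietnamese passage outranks the relevant English one, retrieval
# driven purely by this similarity would show exactly the language bias described above.
```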
The Takeaway
It honestly feels like Google shipped an experimental playground just to chase “Agentic AI” hype, completely sacrificing the precision grounding that made NotebookLM so great in the first place. We are paying for premium tiers only to deal with “Resource not found” errors, aggressive context pruning, and an AI that acts like a highly confident pathological liar.
Can we please get a “Stable Context” mode or a rollback option? We need a reliable, grounded RAG tool, not an unstable beta test.


