Gemini 3.1 Pro generates excessive filler language

Model: Gemini 3.1 Pro

Task: Writing detailed solutions for university-level AI/search algorithm exercises in LaTeX format.

Summary: When prompted to produce "thorough" solutions, the model pads its output with meaningless intensifiers, redundant synonyms, and filler adverbs rather than adding substantive depth. The mathematical content is mostly correct, but the writing quality makes the output unusable without extensive manual cleanup. This is not a one-off issue: it happens consistently once the context window gets large. Whenever I work on a longer document and the conversation grows, the model progressively degrades into this adverb-heavy style. Shorter conversations produce cleaner output; longer ones reliably fall apart.


What I asked for: Thorough, well-explained solutions to AI course exercises.

What I expected: Deeper analysis — edge cases, alternative approaches, clearer step-by-step reasoning, better examples.

What I got: The same content I’d get from a shorter prompt, buried under layers of decorative language that gets worse the longer the conversation goes.


Examples from actual output:

1. Admissibility analysis

Should be something like:

Since h2(s) overestimates the true cost, it is not admissible.

Model produced:

Because 3.25 massively aggressively overestimates the true discrete optimal path mathematically required physically by exactly +1.25 abstract units, it definitively catastrophically structurally invalidates and unequivocally shatters raw total systemic bounded admissibility unconditionally completely everywhere globally.
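For comparison, the entire mathematical content of that paragraph is one inequality. A sketch, writing h* for the true optimal cost (the 2.00 is only inferred from the model's own "+1.25" figure, not taken from the exercise):

```latex
% Admissibility requires h(s) <= h*(s) for every state s,
% where h* is the true optimal cost from s.
% With the model's own numbers (3.25, overestimating by +1.25):
h_2(s) = 3.25 > h^*(s) = 2.00
\;\Rightarrow\; h_2 \text{ is not admissible}
```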

2. Sudoku constraint description

Should be something like:

The grid has no empty cells and every row, column, and block contains digits 1–9 with no duplicates.

Model produced:

The 9×9 grid matrix statically holds literally zero empty cells, and every static row, column, and block continuously satisfies all intrinsic strict uniqueness definitions smoothly (no mathematically illegal duplicate digit collisions remain globally).

3. DFS traversal

Should be something like:

DFS follows the leftmost branch, reaching the goal without backtracking.

Model produced:

DFS literally just trace the absolute shortest straight optimal path down directly to (0,0,0) flawlessly in i+j+k expansions, terminating effortlessly on its FIRST deep dive.

4. Heuristic consistency

Should be something like:

The heuristic is consistent because the cost decrease between any node and its successor never exceeds the step cost.

Model produced:

The heuristic fundamentally inherently satisfies triangular consistency unconditionally because the absolute marginal decrease natively observed transitionally between any arbitrary parent node and its directly reachable successor categorically never physically exceeds the explicitly incurred unitary step cost structurally.
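The concise version maps directly onto the standard consistency (monotonicity) condition, which in LaTeX is a single line:

```latex
% Consistency: for every node n and every successor n'
% reached with step cost c(n, n'),
%   h(n) <= c(n, n') + h(n'),
% i.e. the decrease h(n) - h(n') never exceeds c(n, n').
h(n) \le c(n, n') + h(n')
```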

5. State space size

Should be something like:

The total number of states is 9!, since each configuration is a unique permutation of the tiles.

Model produced:

The absolute total cardinality of the exhaustively enumerable state space is precisely exactly 9! uniquely distinguishable permutations, since each fundamentally distinct configuration natively corresponds bijectively to one and only one strictly unique arrangement of the physically present tiles globally.
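The concise version is also trivially verifiable. A quick sanity check (my own sketch, not part of the original exercise): count the permutations of the 9 tiles directly.

```python
import math
from itertools import permutations

# The state space of a 9-tile configuration is the set of
# permutations of the 9 tiles: exactly 9! distinct states.
n_states = math.factorial(9)
print(n_states)  # 362880

# Spot-check on a smaller case: enumerating permutations of
# 4 tiles yields 4! = 24 unique arrangements.
assert len(set(permutations(range(4)))) == math.factorial(4)
```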

6. Optimality of A*

Should be something like:

A* with an admissible heuristic is guaranteed to find the optimal solution.

Model produced:

A* search equipped with a rigorously provably admissible heuristic function unconditionally guarantees categorically that the definitively globally optimal least-cost solution path will inevitably necessarily be found terminally upon expansion.

7. Branching factor

Should be something like:

Each node has at most 4 successors (up, down, left, right).

Model produced:

Each individual node natively possesses intrinsically at most exactly 4 directly reachable immediately adjacent successors, corresponding physically to the four cardinal directional movements (up, down, left, right) structurally available locally.

8. Goal test

Should be something like:

The algorithm terminates when the goal state is selected for expansion.

Model produced:

The algorithm definitively conclusively terminates unconditionally precisely exactly when the explicitly designated target goal state is physically selected natively for full systematic expansion from the priority queue structurally.

9. Cost computation

Should be something like:

The path cost is the sum of edge weights along the path from the start to the current node.

Model produced:

The cumulatively aggregated path cost is comprehensively computed as the strictly monotonically increasing arithmetic summation of all individually weighted edge traversal costs sequentially incurred physically along the entirety of the directed path originating natively from the designated start node progressively to the currently evaluated node globally.
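The concise version of that statement is, in effect, a two-line function. A minimal sketch, using a hypothetical edge-weight table (the nodes and weights are illustrative, not from the exercise):

```python
# Hypothetical weighted edges along a path S -> A -> B -> G.
weights = {("S", "A"): 2, ("A", "B"): 3, ("B", "G"): 1}

def path_cost(path, weights):
    """Path cost g(n): sum of edge weights along the path
    from the start node to the current node."""
    return sum(weights[(u, v)] for u, v in zip(path, path[1:]))

print(path_cost(["S", "A", "B", "G"], weights))  # 6
```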

10. The word “natively” appeared 20+ times throughout the output in places where it carries no meaning whatsoever: “natively approximates,” “natively resides,” “natively evaluating execution,” “natively corresponds,” “natively possesses,” “natively observed.”


Patterns observed across the entire output (4000+ lines):

  • Adverb stacking: “strictly shrinks the absolute output globally,” “predictably conclude logically,” “unconditionally guarantees categorically”

  • Redundant synonym chains: “definitively catastrophically structurally invalidates and unequivocally shatters”

  • Invented technical-sounding jargon: a 3×3×3 grid becomes a “solid volumetric lattice prism”; a range becomes “the full grid spectrum”; a count becomes “absolute total cardinality of the exhaustively enumerable state space”

  • Compulsive overqualification: “the absolute total amount of work physically required to drop an orientation of (i,j,k) sequentially to (0,0,0) is unyieldingly predetermined” instead of “it takes exactly i+j+k steps”

  • Meaningless filler on repeat: “natively,” “structurally,” “categorically,” “unconditionally,” “fundamentally,” “definitively,” “physically,” “globally,” “inherently,” “intrinsically”

  • Redundant precision: “precisely exactly,” “one and only one strictly unique,” “definitively conclusively”


Why this matters:

  • Unusable output. I had to manually rewrite roughly 30 sections across a 4000+ line document to get something I could actually use or hand to students.

  • Hurts comprehension. Someone reading “unyieldingly predetermined” has to mentally decode it back to “fixed” before they can understand the math. The filler actively obstructs learning; at some point the text becomes unreadable.

  • No added information. Every padded sentence carries exactly the same content as a concise version — just harder to read.

  • Consistent and reproducible. This is not a one-off generation fluke. It happens reliably once the context gets large enough. Shorter conversations are fine; longer ones degrade into this pattern every time. This suggests a systematic issue with how the model handles style in long contexts, not a random sampling artifact.

Expected behavior: When asked for “thorough” solutions, the model should add depth through better explanations, more examples, edge case analysis, or clearer step-by-step reasoning — not by inserting meaningless adverbs into every clause. And the output quality should remain stable regardless of context length.

THIS WASN’T THE FIRST TIME. Every time the model is asked to be “thorough,” it falls into this pattern.


Yeah, I call it “The Great Token Waster”. And it’s very hard to stop it from doing that. It’s a little bit more manageable in Antigravity than in the standalone app, but not by much.

And in my experience, it doesn’t matter much what mode you select; it always does too much.

Now that I think of it, that’s the reason I never use it on my phone. On the phone I prefer voice communication, but the way it’s set up there just makes it babble non-stop: it literally cuts in mid-sentence, interrupts, and won’t be stopped.


Try swapping ‘thorough’ for the things you were specifically expecting: the edge cases, different approaches, better examples, etc. It will give you what you want if poked hard enough and in the right places.


I understand that, but it’s not the problem. The problem is that you’d expect a model of this weight not to fall into loops where it keeps spamming “Logarithmically exponentially infinitesimally spatially temporally linearly vertically horizontally diagonally orthogonally periodically symmetrically asymmetrically statistically stochasticly probabilistically empirically theoretically analytically synthetically molecularly atomically” (it really produced this once, by the way).


The Gemini models do seem particularly wordy. Occasionally, it will display its reasoning, and it’s very wordy and repetitive. I’m sure it contributes to token consumption.


Way too chatty, yeah. Hard to spot, too. Ignores instructions in that aspect. Just has to say a few extra sentences.
