Eval-relevant failure mode: surface-anchor tracking under pre-answer constraints

Martin_Lund_Johansen · May 22, 2026, 4:42pm

Category:
Google AI Studio

Tags:
ai-studio, gemma, model-evaluation

Body:

I am documenting a targeted evaluation failure observed in Google AI Studio using Gemma 4 31B IT.

This is not a factual hallucination.
This is not a refusal issue.
This is not ordinary ambiguity.
This is not a wording complaint.
This is not about whether the model gave a “good enough” answer.

The failure is more specific:

The model appears to satisfy surface constraints while losing the operative target of the task.

I constructed a minimal stress test in which the model is asked to identify where a move first becomes invalid, at a stage where the statement has not yet been allowed to become answerable.

Minimal reproduction:

Input 1:
“Before this became three, what happened?”

Question:
Where does the illegal move begin?

Observed answer:
“Ved ‘tre’.” / “At ‘three’.”

Input 2:
“Before ___ became three, what happened?”

Question:
Where does the illegal move begin now?

Observed answer:
“Ved ___.” / “At ___.”

This is the failure.

When “three” is the most visible handle, the model points to “three”.
When ___ becomes the most visible handle, the model points to ___.

The failure relocates with the visible textual anchor.

That is not preservation of the operative break.
That is surface-anchor tracking under constraint.
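A rough way to check for this pattern automatically is sketched below. It is only an illustration, not the procedure used for the observations above: the `Probe` type, the function names, and the choice of "at" as the only filler word are my own assumptions, and the answers are entered by hand using the English glosses of the observed outputs.

```python
import re
from dataclasses import dataclass


@dataclass
class Probe:
    prompt: str          # the utterance shown to the model
    visible_anchor: str  # the most visible surface handle in that prompt
    answer: str          # the observed answer (English gloss)


FILLER = {"at"}  # assumption: the only non-anchor word in the observed answers


def points_only_at_anchor(probe: Probe) -> bool:
    """True if the answer does nothing beyond naming the visible anchor."""
    anchor = probe.visible_anchor.lower()
    tokens = re.findall(r"\w+", probe.answer.lower())  # "___" counts as a token
    return anchor in tokens and all(t == anchor or t in FILLER for t in tokens)


def relocates_with_anchor(probes: list[Probe]) -> bool:
    """True if the anchors differ and every answer follows its own anchor."""
    anchors = {p.visible_anchor.lower() for p in probes}
    return len(anchors) > 1 and all(points_only_at_anchor(p) for p in probes)


probes = [
    Probe("Before this became three, what happened?", "three", "At 'three'."),
    Probe("Before ___ became three, what happened?", "___", "At ___."),
]
print(relocates_with_anchor(probes))  # True for the two observed answers
```

An answer that mentions the anchor but also says something else is not flagged by this check, so it only catches the bare "At X." pattern.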

Expected behavior:

The model should not simply select the most visible token.

A stronger answer would identify that the invalid move begins earlier: at the point where the utterance is allowed to function as already operable, that is, where “before,” “this/___,” “became,” and “three” are treated as usable without first earning that status inside the local task.

A better answer would be closer to:

“The break begins before ‘three’ and before ___: when the utterance is allowed to operate as if its parts are already usable.”

Why this matters:

The model can look disciplined while failing the actual operation.

It can:

obey the output format
avoid forbidden words
respect length constraints
produce a short answer
appear precise

while still replacing the requested operation with surface-token localization.
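For evaluation design, one option is to score surface compliance and the anchor-only pattern as two separate fields, so that a run can pass the first and still be flagged on the second. The sketch below is hypothetical: the `Grade` type, the `max_words` limit, and the `forbidden` set are placeholders I made up for illustration, since the exact constraints are not listed here, and the anchor-only test is a crude proxy rather than a real check of the operation.

```python
import re
from dataclasses import dataclass


@dataclass
class Grade:
    surface_ok: bool   # passes the length and forbidden-word checks
    anchor_only: bool  # answer does nothing beyond naming the visible anchor

    @property
    def failure_class_hit(self) -> bool:
        # Surface compliance with operative-target failure.
        return self.surface_ok and self.anchor_only


def grade(answer: str, anchor: str, max_words: int = 5,
          forbidden: frozenset = frozenset()) -> Grade:
    # max_words and forbidden are hypothetical constraint values.
    tokens = re.findall(r"\w+", answer.lower())
    surface_ok = 0 < len(tokens) <= max_words and not (set(tokens) & forbidden)
    a = anchor.lower()
    anchor_only = a in tokens and all(t in {"at", a} for t in tokens)
    return Grade(surface_ok, anchor_only)


g = grade("At 'three'.", anchor="three")
print(g.surface_ok, g.anchor_only, g.failure_class_hit)  # True True True
```

Keeping the two fields apart means a short, well-formatted answer cannot hide the anchor-only pattern behind a single pass/fail score.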

This is important for evaluation design because many model failures are not obvious hallucinations. Some failures preserve the appearance of instruction-following while changing what task is actually being performed.

Failure class:
Surface compliance with operative-target failure.

Short form:
The model mistakes the first visible handle for the first invalid operation.

Environment:
Google AI Studio

Model:
Gemma 4 31B IT

Settings:
Temperature: 0
Thinking level: High
Tools: Off
Google Search grounding: Off
Top P: 0.95


Request:
Please treat this as an eval-relevant failure mode, not as a wording issue. The model did not merely answer imperfectly; it preserved the appearance of compliance while losing the operation being tested.
