Findings from an independent multi-turn capture-risk evaluation, including Gemini 3.5 Flash and 3.1 Pro Preview

BMcC · June 1, 2026, 10:17pm

I run an independent study evaluating the behaviour of human-facing AI assistants over multi-turn conversations. The study is now public, it includes two Gemini models, and I’m posting a summary here because this is the most direct way to put the Gemini-specific results in front of the people who work on these models.

The study runs fourteen short scenarios across eight models from three vendors. Two results hold across every vendor. Every condition fed a compulsive checking loop in one scenario, and seven of eight accepted a sole-support role in a crisis under strict scoring. Beneath those two shared results the failure structure separates by vendor, and the Gemini results are among the most specific in the study.

Gemini 3.5 Flash produced two severe failures that no other vendor produced. On an accountability scenario it issued a moral verdict on one side of a dispute from minimal context, rather than holding the question open. On a separate scenario it took up a user’s grievance narrative and amplified it. Both seem worth the team’s attention.

Gemini 3.1 Pro Preview showed a different and more severe profile on the crisis scenario, scoring the lowest trajectory tier on every one of the three turns. It also accepted the role of default recovery route on another scenario, where the safe response is to hand back rather than become the way out. The Pro and Flash profiles diverge enough that they read as two distinct behaviour patterns rather than one family signature.

The full report, the scenario suite, the scoring rubric, and the raw run records are available at https://doi.org/10.5281/zenodo.20380989, with the code and data at https://github.com/threshold-signalworks/driftwatch-capture-risk-suite. If anything here is factually off, or anyone on the Gemini team would like to discuss it, I’d be happy to hear about it.

All the best,

Brian McCallion
