Bug Report: Gemini 3.1 Pro — PDF not processed in subsequent turns of multi-turn conversation

Forum: discuss.ai.google.dev
Category: Gemini API / Gems
Priority: High
Date: February 2026


Summary

Gemini 3.1 Pro fails to apply OCR/document processing on PDFs attached in turns 2+ of a multi-turn conversation(Gems). The model detects the file metadata but does not read its content, responding as if the document is inaccessible. This regression is specific to Gemini 3.1 Pro — the same workflow works correctly with Gemini 3.0 Flash and Gemini 3.0 (reasoning model).


Environment

Field Value
Model affected gemini-3.1-pro-preview
Models NOT affected gemini-3.0-flash, gemini-3.0 (reasoning)
Interface Google Gems (consumer + Workspace)
PDF source FootyStats, Sofascore, ValueStats (statistical data PDFs)
PDF size ~5-15 pages per document
Workflow Sequential multi-phase pipeline (3 turns, one PDF per turn)

Steps to reproduce

  1. Create a Gem with a system prompt that defines a multi-turn pipeline:

    • Turn 1: User attaches PDF_A → model reads and stores data

    • Turn 2: User attaches PDF_B → model reads and stores data

    • Turn 3: User sends text + PDF_C → model executes full analysis

  2. Start a new conversation with that Gem using Gemini 3.1 Pro.

  3. Turn 1: Attach PDF_A with instruction “Read this document and confirm the team name, competition, and one numerical value (e.g. PPG).” → Model reads correctly :white_check_mark:

  4. Turn 2: Attach PDF_B (different document, same structure) with identical instruction. → Model detects the file exists but cannot extract any content :cross_mark:


Actual behavior (Gemini 3.1 Pro)

In Turn 2, the model returns a response similar to:

“PDF [filename] not accessible in a native way. I have detected the attached file in the system metadata, but its text content has not been provided in the prompt for algorithmic reading (unlike the extraction I performed for [PDF from Turn 1]).”

The model acknowledges the file exists but treats it as unreadable. It correctly processed the Turn 1 PDF but fails on Turn 2 onward — even when the second PDF is structurally identical to the first.


Expected behavior

The model should apply the same document processing (OCR + native text extraction) to PDFs attached in any turn of a multi-turn conversation, not only the first turn. This is the documented behavior and it works correctly in Gemini 3.0 Flash and Gemini 3.0 reasoning.


Comparison across models

Model Turn 1 PDF Turn 2 PDF Turn 3 PDF
Gemini 3.0 Flash :white_check_mark: Reads correctly :white_check_mark: Reads correctly :white_check_mark: Reads correctly
Gemini 3.0 (reasoning) :white_check_mark: Reads correctly :white_check_mark: Reads correctly :white_check_mark: Reads correctly
Gemini 3.1 Pro :white_check_mark: Reads correctly :cross_mark: Detects file, no content :cross_mark: Detects file, no content

The regression is exclusive to Gemini 3.1 Pro and was not present in Gemini 3.0 Pro.


Hypothesis

Based on the documented stateless nature of the Gemini API and the architectural changes introduced in Gemini 3.1 Pro (3-level thinking system, updated attention mechanism), it appears that:

  1. The model’s attention budget is exhausted after processing the first PDF in Turn 1.

  2. In subsequent turns, the model detects the file in the request metadata but does not trigger active OCR/visual rendering for the new document.

  3. This may be related to how the thinking mode in 3.1 Pro allocates context resources differently from 3.0 Flash, which does not exhibit this behavior.

This is consistent with the documented “lost in the middle” phenomenon in long-context models, where documents in intermediate positions receive degraded attention.


Impact

This bug breaks any multi-phase pipeline that relies on uploading different PDFs in sequential turns — a common and legitimate use case for document analysis workflows, research pipelines, and data processing Gems. Users cannot currently use Gemini 3.1 Pro for any workflow requiring more than one PDF across multiple turns.


Workaround (partial)

Using Gemini 3.0 Flash instead of Gemini 3.1 Pro resolves the issue, but this is not a sustainable solution as 3.0 models will eventually be deprecated. A server-side fix in Gemini 3.1 Pro’s document processing for multi-turn contexts is needed.

A prompting-side mitigation (forcing the model to extract specific data points from the PDF before storing it — “priming”) reduces the failure rate but does not eliminate it entirely with 3.1 Pro.


Request

Please investigate whether the PDF document processing pipeline in Gemini 3.1 Pro correctly handles documents attached in turns 2+ of a multi-turn conversation, and whether the attention/context allocation changes in 3.1 Pro affect the activation of OCR for non-primary context positions.


Reported by a developer using Gems for multi-phase quantitative analysis pipelines.