Gemini 2.5 Pro often not closing thoughts (05-06 does work correctly)

I am using Gemini in a setting where it interacts extensively with external tools. Essentially, the loop I’m running is:

  1. User sends a message
  2. Gemini thinks and can choose to execute one or more function calls; alternatively, it can send a message (and not call any functions)
  3. If functions were called, Gemini receives the results, thinks, and writes a message to the user that synthesizes the function call results

This process is explained explicitly to Gemini in the system prompt, and it generally abides by it.

However, in step 3 the thinking stage does not seem to be cleanly delineated: I often receive messages from the API that open with a <thought> tag but never close it, even though I can tell clearly from the content that there is a transition from first-person thinking to a mode where it addresses the user.
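To make the symptom concrete, the content of the final message typically has this shape (the text below is invented, but the structure matches what I receive: an opening <thought> tag, first-person reasoning, then user-facing text, with no closing tag anywhere):

    # Invented example content; structurally this is what arrives in step 3.
    leaked_content = (
        "<thought>The tool results look complete, so I should summarize them for the user...\n"
        "Here is a summary of what I found: ..."
    )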

I’m using the OpenAI-compatible API.

In step 2, I’m making a request like:

    first_response = client.chat.completions.create(
        model="gemini-2.5-pro",
        messages=cast(List[ChatCompletionMessageParam], messages_to_send),
        tools=cast(List[ChatCompletionToolParam], TOOLS),
        extra_body={
            "extra_body": {
                "google": {"thinking_config": {"include_thoughts": True}}
            }
        },
    )
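
Between these two requests I execute the requested tools and feed the results back in the usual OpenAI-compatible format, roughly like this (execute_tool is a stand-in for my actual dispatcher):

    import json

    assistant_message = first_response.choices[0].message
    # Keep the assistant turn (with its tool_calls) in the history.
    messages_to_send.append(assistant_message)

    for tool_call in assistant_message.tool_calls or []:
        # execute_tool is a placeholder for my actual tool dispatch.
        result = execute_tool(
            tool_call.function.name,
            json.loads(tool_call.function.arguments),
        )
        messages_to_send.append(
            {
                "role": "tool",
                "tool_call_id": tool_call.id,
                "content": json.dumps(result),
            }
        )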

and in step 3:

    final_response = client.chat.completions.create(
        model="gemini-2.5-pro",
        messages=cast(List[ChatCompletionMessageParam], messages_to_send),
        tools=[],
        extra_body={
            "extra_body": {
                "google": {"thinking_config": {"include_thoughts": True}}
            }
        },
    )
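
For what it’s worth, detecting the leak on the client is straightforward, but since there is no closing tag there is no reliable marker for where the thinking ends, so I can’t cleanly strip it before showing the message to users. A minimal check along these lines (naming is mine):

    def has_unclosed_thought(content: str) -> bool:
        """Heuristic: the message opens a <thought> block that is never closed."""
        stripped = content.lstrip()
        return stripped.startswith("<thought>") and "</thought>" not in stripped

    final_content = final_response.choices[0].message.content or ""
    if has_unclosed_thought(final_content):
        # No reliable way to tell where the thinking ends, so for now just log it.
        print("warning: unterminated <thought> block in final response")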

Relevant setup (imports and client construction):

    from typing import List, cast

    from openai import APIConnectionError, OpenAI
    from openai.types.chat import (
        ChatCompletionMessageParam,
        ChatCompletionToolParam,
    )

    client = OpenAI(
        api_key=api_key,
        base_url="https://generativelanguage.googleapis.com/v1beta/openai/",
    )

The 05-06 model does not have this issue, so I am quite concerned about it being deprecated; the leaked thoughts are causing real problems for my users’ experience.


It might be worth noting that the thought leakage still seems to occur even if I don’t set "include_thoughts": True in the "thinking_config" for step 3.


Hey @Sam_Van_Herwaarden, thanks for flagging this. I was able to reproduce the issue and saw the same problem you described. I’m escalating it to the engineering team for further analysis.