Gemini 2.5 Pro sometimes outputs its internal thought process in the final response

I am using the Gemini 2.5 Pro API and have noticed that in some cases the model’s internal reasoning or “thought process” is included in the final output that is returned to the user.

From my understanding, the model's thinking should happen internally, and only the final answer should appear in the output unless includeThoughts is explicitly enabled in the request. In my implementation I am not setting includeThoughts, and I am only reading the main response.text, yet the output occasionally contains what looks like step-by-step reasoning or meta-commentary that should remain hidden.
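For concreteness, the request body I send looks roughly like this (a REST-style sketch as a Python dict; the prompt text is a placeholder). The point is that I never set thinkingConfig.includeThoughts, so it should stay at its default of off:

```python
# Sketch of the request body I send (REST-style JSON as a Python dict).
# The prompt text is a placeholder; note that there is no
# generationConfig.thinkingConfig.includeThoughts anywhere, so it stays
# at its default (off) and no thought summaries should come back.
request_body = {
    "contents": [
        {"role": "user", "parts": [{"text": "Summarise the attached article."}]}
    ],
    # No generationConfig.thinkingConfig in this request at all.
}

thinking_config = request_body.get("generationConfig", {}).get("thinkingConfig", {})
print(thinking_config.get("includeThoughts"))  # not set, so this prints None
```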

I have seen other developers report similar issues in community forums, so it seems to be a reproducible bug rather than an isolated case.

Could the team clarify under what circumstances this can happen, and whether there is a way to guarantee that internal thinking will never appear in the user-visible output?


The same thing happens with Gemini 2.5 Flash, which is devastating for the user experience.


@sunmoon1, @Sven_Yu

Welcome to the forum!

Thoughts are normally generated as output tokens; they help the model structure its reasoning for the task at hand, and they should not appear in the model's final output. That said, LLMs are not fully deterministic, so the safest approach is to explicitly instruct the model not to include any extra text, such as thoughts, in the actual output.

  1. Is this issue persistent across multiple tries? Also, can you share any prompts or cases where you were able to reproduce it?

  2. Have you tried any prompt engineering, such as asking the model to "avoid generating thoughts and give me a concise response"? That might help avoid the leakage.
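As a defensive workaround until this is resolved, you could also filter out any response part flagged as a thought on the client side before displaying the text. A minimal sketch (the Part stand-ins here are mocks of my own; the assumption is that in a real response, thought summaries are parts whose thought field is true, while plain answer parts leave it unset):

```python
from types import SimpleNamespace

def visible_text(parts):
    """Join the text of only those parts not flagged as thoughts.

    Assumes thought-summary parts carry a truthy `thought` field and
    ordinary answer parts do not set it.
    """
    return "".join(
        p.text
        for p in parts
        if getattr(p, "text", None) and not getattr(p, "thought", False)
    )

# Mock parts mimicking a response that leaked a thought summary.
parts = [
    SimpleNamespace(text="Thinking: first I will outline the answer...", thought=True),
    SimpleNamespace(text="Here is the summary you asked for."),
]
print(visible_text(parts))  # -> Here is the summary you asked for.
```

This does not fix the root cause, but it keeps stray reasoning text out of what your users see.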