Gemini 2.5 Pro Exposing "Silent Thought" Process in Long Context Conversations

Hello everyone,

Our team is developing a virtual character chat product, and we are currently using the Gemini 2.5 Pro model.

We chose the Pro model because, although it is generally recommended for reasoning tasks rather than role-playing, our use case involves numerous and strict constraints for the virtual character. In our tests, only the Pro model, with its robust long-context capabilities, can consistently adhere to these guidelines.

To implement short-term memory for the character, we manually control and manage the conversation context, currently setting it to a length of approximately 150 conversation turns.
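For readers unfamiliar with this pattern, a minimal sketch of that kind of sliding-window short-term memory might look like the following. The class and method names are illustrative, not from our actual codebase, and the turn limit mirrors the ~150-turn setting described above:

```python
from collections import deque

MAX_TURNS = 150  # matches the ~150-turn limit described above


class ConversationMemory:
    """Sliding-window short-term memory: keeps only the most recent turns."""

    def __init__(self, max_turns: int = MAX_TURNS):
        # deque with maxlen silently drops the oldest turn once full
        self.turns = deque(maxlen=max_turns)

    def add_turn(self, role: str, content: str) -> None:
        self.turns.append({"role": role, "content": content})

    def build_context(self, system_prompt: str) -> list[dict]:
        # The system prompt is re-sent on every request so the character
        # constraints are never trimmed away along with old turns.
        return [{"role": "system", "content": system_prompt}, *self.turns]
```

The key design point is that only the turn history is windowed; the system prompt with the character constraints is prepended fresh on every request.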

The core issue we are encountering is that when the number of conversation turns with an end-user grows and approaches our 150-turn limit, the model frequently includes its internal “silent thought” process in the final answer before providing the actual, in-character response that we need. This thinking process should not be visible to the user.

Notably, we have never observed a similar issue in shorter user sessions.
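As a stopgap while debugging, one option is a defensive post-processing filter that strips a leaked thought block before the reply reaches the user. The patterns below are purely hypothetical; the actual shape of the leaked preamble would need to be confirmed from logged responses before relying on anything like this:

```python
import re

# Hypothetical marker patterns for a leaked "silent thought" preamble.
# Verify against real logged leaks before using in production.
THOUGHT_MARKERS = re.compile(
    r"^\s*(thinking|thought process|reasoning)\s*:\s*.*?\n\n",
    re.IGNORECASE | re.DOTALL,
)


def strip_leaked_thoughts(raw_response: str) -> str:
    """Remove a leading leaked thought block, if present, and return the rest."""
    cleaned = THOUGHT_MARKERS.sub("", raw_response, count=1)
    return cleaned.strip()
```

A filter like this treats the symptom rather than the cause, but it also doubles as a detector: logging every response where the regex fires gives you a corpus of leaks to inspect.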

Therefore, I’d like to ask for your insights:

Is this phenomenon likely caused by the context window becoming too long, leading the LLM to fail to strictly adhere to the instructions and character constraints we have defined in the System Prompt (SP)? Or are there other potential causes we might be overlooking?

Has anyone else encountered a similar situation, or do you have any suggestions for potential solutions or debugging strategies?

Thank you very much for your help.

I can only assume this happens because the character's thoughts appear somewhere in the context and the model decides to repeat them; perhaps the prompt contains an example of, or an instruction about, how to think.

From experience: I once tried to influence the thought process by including an example in the prompt, but with Gemini 2.5 Pro this led to the reasoning showing up in the answer itself.