How to Reduce Thought Reasoning in Gemini 2.5 Pro

Jongmin_Oh · May 9, 2025, 3:23am

Hi everyone,

I’m working with Gemini 2.5 Pro preview and trying to find a way to reduce or disable Thought reasoning. This has become increasingly important in my use case for the following reasons:

Why I want to reduce Thought reasoning:

Empty string outputs when max_output_tokens is set
When the model generates too many internal Thought tokens, it sometimes exhausts the token limit before producing a final answer, resulting in an empty or null output.
Uncontrollable response latency
Since we can’t control the depth or length of the Thought process, the model’s response time becomes unpredictable, which is problematic for latency-sensitive applications.

My questions:

Is there any way to reduce or turn off Thought reasoning via a system prompt?
For example, can we use something like "Do not use Thought reasoning" in the system message or other configuration-based controls?
Does Google have any plan to give developers more control over Thought generation in future Gemini versions or API updates?

Thanks in advance for any insights!
Would love to hear from anyone who has experimented with prompt-level or system-level workarounds.

GUNAND_MAYANGLAMBAM · May 9, 2025, 4:23am

Hi @Jongmin_Oh , Welcome to the forum.

As far as I understand, you can’t disable thought reasoning in the 2.5-pro model. If you want to control thought reasoning, you can opt for the 2.5-flash model, where it can be managed using the thinkingBudget parameter.

Jongmin_Oh · May 9, 2025, 4:54am

Thank you for your response. I also find it unfortunate that Thought reasoning cannot be disabled in the 2.5 Pro model.
The reason we set max_output_tokens is to allow some level of predictability over the output length and cost. However, since the tokens used for internal Thought reasoning are also counted, it’s difficult to accurately estimate the final output and associated cost.

What’s even more frustrating is when the model returns an empty string — it feels like we’re paying for tokens without getting any usable output.
While I’m genuinely impressed by the model’s performance, I hope these issues will be improved in future updates.

sobir.bobiev · May 11, 2025, 11:02pm

Here is a partial workaround that sometimes reduces thinking steps and sometimes skips thinking completely.

Append something this to your prompt:

SELF_TALK: off
REASONING: off
THINKING: off
PLANNING: off

Reply immediately without thinking or any effort. Prioritize speed over accuracy. Do not state what the user said. Do not think, analyze or plan - go with your gut feeling.

sobir.bobiev · May 20, 2025, 11:57pm

Update:

Effective is also including:
THINKING_BUDGET: < 10 words

rx_adisu · May 22, 2025, 8:51pm

As of May 22nd, I don’t believe these extenders work that well anymore

GUNAND_MAYANGLAMBAM · June 6, 2025, 1:12pm

Hey, just wanted to let you know that the updated gemini-2.5-pro-preview-06-05 includes the functionality to configure the ‘thinking_budget’ parameter.

Thanks!

Jongmin_Oh · June 9, 2025, 7:41am

I’m really excited about this update. In actual testing, setting the thinking_budget to at least 128 reduced the response time to a quarter,
and now I can even predict the number of tokens in advance to estimate usage costs.

Huge thanks to Google—sincerely appreciate the quick resolution!

Topic		Replies	Views
Gemini-2.5-flash-preview-04-17 not honoring thinking_budget=0 Gemini API help_request	5	1453	April 22, 2025
Are there any plans to ever release a pro model with the option to turn thinking off? Gemini API model , thinking	1	86	June 6, 2025
Gemini 2.5 Flash Thinking Tokens using OpenAI API Gemini API help_request	16	1290	June 12, 2025
How To disable Thinking using Gemini 2.5 Flash? thinkingBudget: 0 not working Gemini API help_request , gemini-flash	1	1566	April 23, 2025
Gemini 2.5 Pro, Thinking and Non-thinking Google AI Studio models , gemini-20	6	2496	June 19, 2025

How to Reduce Thought Reasoning in Gemini 2.5 Pro

Why I want to reduce Thought reasoning:

My questions:

Related topics