How much thinking is enough for agentic applications

yellowcap · October 10, 2025, 11:13am

I am wondering about the optimal thinking budget when building an agent that runs on Gemini 2.5 Pro. In my case it's a simple ReAct agent with only a couple of tools.

Can the thinking budget be tuned for both cost and speed? Any input on how to set it would be appreciated.

I have tried -1 as a dynamic budget value, which works fine, but I don’t have a good intuition for what the benefits of that setting are.

We could run evals with different settings, but I was wondering if there are any heuristics or general advice that could help us get started with this.
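For anyone who does go the eval route, here is a minimal sketch of what a sweep over thinking budgets could look like. Everything in it is an assumption for illustration: `run_agent` is a hypothetical callable that you would wire to your own agent and model client, `RunResult` and `BudgetSummary` are made-up containers, and the budget values you pass in are placeholders that should be checked against the range documented for your model.

```python
from dataclasses import dataclass
from statistics import mean
from time import perf_counter
from typing import Callable, Sequence


@dataclass
class RunResult:
    """Outcome of one agent run (hypothetical shape)."""
    correct: bool          # did the agent solve the task?
    thinking_tokens: int   # thinking tokens reported for the run


@dataclass
class BudgetSummary:
    """Aggregate metrics for one thinking-budget setting."""
    budget: int
    accuracy: float
    mean_latency_s: float
    mean_thinking_tokens: float


def sweep_budgets(
    run_agent: Callable[[str, int], RunResult],
    tasks: Sequence[str],
    budgets: Sequence[int],
) -> list[BudgetSummary]:
    """Run every task once per budget and summarize quality, speed, and token use."""
    if not tasks:
        raise ValueError("tasks must not be empty")
    summaries = []
    for budget in budgets:
        results, latencies = [], []
        for task in tasks:
            start = perf_counter()
            results.append(run_agent(task, budget))
            latencies.append(perf_counter() - start)
        summaries.append(
            BudgetSummary(
                budget=budget,
                accuracy=mean(1.0 if r.correct else 0.0 for r in results),
                mean_latency_s=mean(latencies),
                mean_thinking_tokens=mean(r.thinking_tokens for r in results),
            )
        )
    return summaries


def pick_cheapest(summaries: Sequence[BudgetSummary], tolerance: float = 0.02) -> BudgetSummary:
    """Among settings within `tolerance` of the best accuracy, pick the one
    that used the fewest thinking tokens on average."""
    best = max(s.accuracy for s in summaries)
    candidates = [s for s in summaries if s.accuracy >= best - tolerance]
    return min(candidates, key=lambda s: s.mean_thinking_tokens)
```

Ranking by measured thinking tokens rather than by the budget value itself means a dynamic setting such as -1 can be compared against fixed budgets on equal terms.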

Lalit_Kumar · October 15, 2025, 11:25am

Hello,

Welcome to the Forum!

You may need to fine-tune your prompt and, based on your use case, experiment to determine the optimal thinking budget.

While increasing the thinking budget can improve model performance, it may also increase response time and cost as thinking tokens are part of the overall prediction. Therefore, it is recommended to choose a thinking budget carefully tailored to your specific use case.
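To make the cost side of that trade-off concrete, here is a rough sketch of a per-request cost estimate. It assumes thinking tokens are billed at the same rate as output tokens, which you should confirm against the current pricing page. The function name and the prices in the usage note are placeholders, not real rates.

```python
def estimate_request_cost(
    input_tokens: int,
    output_tokens: int,
    thinking_tokens: int,
    input_price_per_million: float,
    output_price_per_million: float,
) -> float:
    """Estimate the cost of one request in the same currency as the prices.

    Assumption: thinking tokens are charged at the output-token rate.
    """
    billed_output = output_tokens + thinking_tokens
    return (
        input_tokens / 1_000_000 * input_price_per_million
        + billed_output / 1_000_000 * output_price_per_million
    )
```

For example, calling `estimate_request_cost(2000, 300, 1700, 1.0, 10.0)` with those placeholder prices shows how a request with a short visible answer can still be dominated by thinking tokens. In a multi-step ReAct loop the estimate applies to every model call, so the effect of a larger budget compounds with the number of steps.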
