How much thinking is enough for agentic applications?

I am wondering about the optimal thinking budget when building an agent that runs on Gemini 2.5 Pro. In my case it's a simple ReAct agent with only a couple of tools.

I wonder if the thinking budget can be optimized for both cost and speed. Any input on how to set it would be appreciated.

I have tried -1 as a dynamic budget value, which works fine, but I don’t have a good intuition for what its benefits are.

We could run evals with different settings, but I was wondering if there are any heuristics or general advice that could help us get started.

Hello,

Welcome to the Forum!

You may need to fine-tune your prompt and, based on your use case, experiment to determine the optimal thinking budget.

While increasing the thinking budget can improve model performance, it may also increase response time and cost, because thinking tokens are generated as part of the overall response. It is therefore best to choose a thinking budget tailored to your specific use case.
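
As a starting point for that experimentation, here is a minimal sketch of a budget sweep in Python. It is not tied to any SDK: `RunResult`, `BudgetSummary`, `sweep_budgets`, `cheapest_acceptable`, and the `run_case` callback are all hypothetical names, and `run_case` is assumed to be a function you write yourself that runs your agent on one eval case with a given thinking budget and reports correctness, latency, and the thinking tokens used.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Any, Callable, Optional, Sequence


@dataclass
class RunResult:
    """Outcome of one agent run (hypothetical structure)."""
    correct: bool         # did the agent solve the eval case?
    latency_s: float      # wall-clock seconds for the whole run
    thinking_tokens: int  # thinking tokens actually used in the run


@dataclass
class BudgetSummary:
    """Aggregated results for one thinking-budget setting."""
    budget: int
    accuracy: float
    mean_latency_s: float
    mean_thinking_tokens: float


def sweep_budgets(
    run_case: Callable[[Any, int], RunResult],
    cases: Sequence[Any],
    budgets: Sequence[int],
) -> list[BudgetSummary]:
    """Run every eval case at every budget and summarize each budget.

    `cases` must be non-empty, otherwise `mean` raises StatisticsError.
    """
    summaries = []
    for budget in budgets:
        results = [run_case(case, budget) for case in cases]
        summaries.append(
            BudgetSummary(
                budget=budget,
                accuracy=mean(1.0 if r.correct else 0.0 for r in results),
                mean_latency_s=mean(r.latency_s for r in results),
                mean_thinking_tokens=mean(r.thinking_tokens for r in results),
            )
        )
    return summaries


def cheapest_acceptable(
    summaries: Sequence[BudgetSummary],
    min_accuracy: float,
) -> Optional[BudgetSummary]:
    """Return the setting that meets the accuracy bar with the fewest
    thinking tokens on average, or None if no setting meets it."""
    acceptable = [s for s in summaries if s.accuracy >= min_accuracy]
    return min(acceptable, key=lambda s: s.mean_thinking_tokens, default=None)
```

Because the comparison uses the thinking tokens actually consumed rather than the configured limit, the dynamic setting (-1) can be included in `budgets` next to fixed values, which gives a direct way to see what it buys you on your own tasks.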
