There are a ton of use cases where thinking is just counterproductive. For example, I don’t need the model to think for X words in a fast-paced dialogue. Are you ever going to release a SOTA model with the option to turn thinking off?
Thanks for your feedback, @Amit_Tzah.
While we don’t have an option to turn off thinking mode for the 2.5 model, our new gemini-2.5-pro-preview-06-05 model allows you to configure the “thinking budget” from 128 to 32768 tokens. Setting a lower budget reduces latency and gives faster responses.
We appreciate your suggestion and will consider a feature to turn “thinking mode” off for future releases.
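
As a rough illustration of setting a low thinking budget, here is a minimal sketch that only builds a request body and does not call the API. It assumes the budget is passed as `generationConfig.thinkingConfig.thinkingBudget` in the REST request, so please verify the field names against the current API reference. The helper names `clamp_budget` and `build_request_body` are hypothetical, and the 128 to 32768 range is the one quoted above for gemini-2.5-pro-preview-06-05.

```python
import json

# Range quoted in this thread for gemini-2.5-pro-preview-06-05.
MIN_BUDGET = 128
MAX_BUDGET = 32768


def clamp_budget(requested: int) -> int:
    """Clamp a requested thinking budget into the supported range."""
    return max(MIN_BUDGET, min(MAX_BUDGET, requested))


def build_request_body(prompt: str, thinking_budget: int) -> dict:
    """Build a request body with a thinking budget (hypothetical helper).

    The field names below are an assumption about the REST request shape.
    """
    return {
        "contents": [{"parts": [{"text": prompt}]}],
        "generationConfig": {
            "thinkingConfig": {"thinkingBudget": clamp_budget(thinking_budget)}
        },
    }


if __name__ == "__main__":
    # Asking for 0 (thinking off) is not supported here, so it is raised to 128.
    body = build_request_body("Reply in one short sentence.", thinking_budget=0)
    print(json.dumps(body, indent=2))
```

For a fast-paced dialogue use case, the minimum budget of 128 is the closest available setting to turning thinking off on this model.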