Gemini 3.0 Pro Preview: thought tokens far exceed the configured thinking budget

Hi!
Sometimes, when the model is set to a low thinking_level, or thinking_budget is set to 128 or 256, it unexpectedly uses around 3,000 thought tokens, even though the task is nearly identical to others that stay within budget. This happens with both temperature=1.0 and temperature=0.0, and it significantly affects API costs. I'd really appreciate it if this could be fixed quickly. Thank you.

For now, I worked around this by adding the system prompt "Please don't think too much!"
I also experimented with temperature, top_p, and raising the thinking budget to a fairly large 512 tokens, but none of that helped.

And the system-prompt workaround also fails often. I really need some help here.
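For reference, this is roughly how the budget was being set. This is a minimal sketch using the google-genai Python SDK; the model name and prompt here are placeholders, not my exact request:

```python
from google import genai
from google.genai import types

client = genai.Client()  # reads GEMINI_API_KEY from the environment

config = types.GenerateContentConfig(
    # Cap the model's internal reasoning at 256 thought tokens.
    # Despite this, usage metadata sometimes reports ~3,000.
    thinking_config=types.ThinkingConfig(thinking_budget=256),
    temperature=0.0,
)

response = client.models.generate_content(
    model="gemini-3-pro-preview",  # placeholder model id
    contents="<the task prompt>",  # placeholder prompt
    config=config,
)

# Thought-token usage is reported here.
print(response.usage_metadata.thoughts_token_count)
```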

Hi @komin, thank you for bringing this to our attention. Could you please share an example of the prompt you are using with Gemini 3? That will help us diagnose the issue.