I have an application using the Gemini API with the 2.5 models that preemptively injects tool calls. The tools actually exist, and I know they contain information the model will use, so rather than wasting time and money letting the LLM invoke the tool itself, I invoke it automatically before prompting. I may even include fake “thinking” content so the LLM isn’t confused about why the tool call is there and knows how to use the information.
Example message orders/schemes that are currently used:
[user_prompt, fake_tool_call, fake_tool_response]
[user_prompt, real_tool_call, real_tool_response, fake_tool_call, fake_tool_response]
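Concretely, the first scheme looks like this. This is a minimal sketch using plain dicts in the Gemini REST API `contents` shape (no SDK required); the tool name `get_user_profile` and its payload are hypothetical placeholders for whatever tool is actually pre-invoked:

```python
def build_injected_history(user_prompt: str, tool_name: str,
                           tool_result: dict) -> list[dict]:
    """Builds the [user_prompt, fake_tool_call, fake_tool_response] scheme."""
    return [
        {"role": "user", "parts": [{"text": user_prompt}]},
        # Fabricated model turn: pretend the model requested the tool.
        {"role": "model",
         "parts": [{"functionCall": {"name": tool_name, "args": {}}}]},
        # The result we computed locally before ever prompting the model.
        {"role": "user",
         "parts": [{"functionResponse": {"name": tool_name,
                                         "response": tool_result}}]},
    ]

history = build_injected_history(
    "What is my subscription tier?", "get_user_profile", {"tier": "pro"})
```

The 2.5 models accept this history as-is; the fabricated `functionCall` part is the piece that becomes a problem once signatures are validated.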
With the Gemini 3 models this no longer works: function calls and thinking content both require thought signatures, so the API rejects the request.
So how can this be handled?
Injecting the information into the user prompt isn’t viable, because the injected tool call may need to happen after other tool calls the LLM makes during a turn. For example, the injected tool call’s result may contain stateful information that a real tool call later modifies, so baking it into the user prompt means it goes stale. And if I update the user prompt each time (so the information in it stays current), that would really confuse the LLM: the user prompt, which sits before the real tool calls in the history, would contain information newer than the parts that come after it.
Injecting a new fake user message is also not viable, as this would cause the LLM to stop trying to respond to the real user message (example message order: [real_user_message, tool_call, tool_response, fake_user_message]).
Injecting a system message is not an option either, as the information from the fake tool call would then carry the same authority as actual system messages, potentially overriding system instructions.
The only thing I can think of is to combine two different workarounds: if the tool call would be injected immediately after a user prompt, extend the user prompt with the information; after real tool calls, modify one of the real tool results to include the information (which could also confuse the LLM, especially with no accompanying content to explain it).
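For what it’s worth, that two-pronged workaround can be sketched like this, again on plain dicts in the REST `contents` shape. The branching rule, the `[Pre-fetched context]` label, and the `_prefetched_context` key are all my own assumptions, not anything the API defines:

```python
import copy

def inject_info(contents: list[dict], info: str) -> list[dict]:
    """Attaches pre-fetched tool output without fabricating a model turn."""
    contents = copy.deepcopy(contents)  # don't mutate the caller's history
    last = contents[-1]
    if last["role"] == "user" and "text" in last["parts"][-1]:
        # Case 1: history ends at the user prompt -> extend the prompt,
        # clearly labelled so it reads as context rather than the user's words.
        last["parts"].append({"text": f"\n\n[Pre-fetched context]\n{info}"})
    else:
        # Case 2: history ends with a real tool response -> piggyback on it
        # under a distinct key so the model can separate it from the tool's
        # own output.
        for part in reversed(last["parts"]):
            if "functionResponse" in part:
                part["functionResponse"]["response"]["_prefetched_context"] = info
                break
    return contents
```

Whether the model treats the piggybacked key sensibly is exactly the open question: there is still no content explaining why a tool response contains unrelated context.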