Hello,
I have recently started playing around with and testing Gemini models for my project with the free tier.
I have worked with gemini-2.5-flash and wanted to test gemini-2.5-pro.
I have a ~200 lines of system prompt with instructions styled in XML-like sections.
I have two tools with parallel tool calling enabled. In the instructions, the model is instructed to call these tools multiple times as much as it needs in the same response.
It works in gemini-2.5-flash (also tested on claude-sonnet-4 and it works), but when trying with gemini-2.5-pro, I get only one tool call in the response, or even worse, sometimes I get MALFORMED_FUNCTION_CALL.
When testing gemini-2.5-pro on the example provided in the documentation (Function calling with the Gemini API | Google AI for Developers) but with a similar config to mine (mode “AUTO” and with a system prompt), it seems to work fine and call all these mock tools together.
So, it seems to me that the issue is that my system prompt is too long for the model to call multiple tools at once? I have played around with max_output_tokens without any luck…
Maybe I am missing something? Anyone else with a similar issue?
Thank you