Why does Gemini 3 Flash return sequential single tool calls instead of batching independent ones?

maxi2020 · April 14, 2026, 10:08am

Hey team,

We’ve been running an agentic system on Gemini 3 Flash Preview and wanted to ask about parallel function calling
behavior.

We see from the docs that Gemini supports returning multiple function calls in a single response when they’re
independent. Our framework parses these correctly — if the model returns multiple functionCall parts, we handle them
all and return results. We also explicitly set toolConfig.functionCallingConfig.mode: "AUTO" on every request. But in
practice, Gemini 3 Flash Preview consistently returns one function call per response, even when the calls are clearly
independent.
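
For reference, here is a minimal sketch of the kind of parsing described above. It is not our actual framework code. It assumes the REST-style JSON response shape (candidates, content, parts, functionCall, thoughtSignature), and the tool names and the example response are hypothetical.

```python
from typing import Any


def extract_function_calls(response: dict[str, Any]) -> list[dict[str, Any]]:
    """Collect every functionCall part from the first candidate of a response.

    Assumes the REST-style JSON shape; the field names are an assumption
    of this sketch and should be checked against the API reference.
    """
    candidates = response.get("candidates") or []
    if not candidates:
        return []
    parts = candidates[0].get("content", {}).get("parts", [])
    calls = []
    for part in parts:
        call = part.get("functionCall")
        if call is None:
            continue  # text or other non-call parts are skipped
        calls.append({
            "name": call["name"],
            "args": call.get("args", {}),
            # Kept so it can be passed back unchanged with the tool result.
            "thought_signature": part.get("thoughtSignature"),
        })
    return calls


# Hypothetical response containing two independent calls in one turn.
example_response = {
    "candidates": [{
        "content": {
            "role": "model",
            "parts": [
                {"functionCall": {"name": "read_file", "args": {"path": "a.txt"}},
                 "thoughtSignature": "sig-abc"},
                {"functionCall": {"name": "save_note", "args": {"text": "hi"}}},
                {"text": "done"},
            ],
        }
    }]
}

calls = extract_function_calls(example_response)
print([c["name"] for c in calls])
```

A parser like this handles one call or many the same way, so the single-call behavior we see should not be coming from the client side.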

This adds up fast. A typical turn where the agent reads a couple files and saves a note turns into 4-5 sequential API
round-trips, each re-sending the full context. On a ~14K token context, that’s 56K-70K input tokens for what could
have been 1-2 round-trips with parallel calls.
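
The arithmetic behind those numbers, as a rough sketch. It assumes the context stays at about 14K tokens on every round-trip and ignores the growth from appended tool results, so the real totals would be somewhat higher.

```python
def input_tokens(context_tokens: int, round_trips: int) -> int:
    """Total input tokens when the full context is re-sent on every round-trip."""
    return context_tokens * round_trips


CONTEXT_TOKENS = 14_000  # approximate context size from our logs

# Observed: one function call per response, 4-5 round-trips per turn.
sequential = [input_tokens(CONTEXT_TOKENS, n) for n in (4, 5)]

# Hoped for: independent calls batched, 1-2 round-trips per turn.
batched = [input_tokens(CONTEXT_TOKENS, n) for n in (1, 2)]

print(sequential)  # [56000, 70000]
print(batched)     # [14000, 28000]
```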

We think we may have found a related cause. There’s an existing report about Gemini 3 Flash Preview inconsistently
generating thought_signature fields for parallel function calls, which causes 400 errors and potential silent data
loss (topic: “[Gemini 3 Flash Preview] Inconsistent thought_signature generation in parallel function calls causes 400
errors and potential silent data loss”). If the model is aware that parallel calls trigger signature issues, it may
have learned to avoid generating them entirely.

What we’ve verified on our end:

functionCallingConfig.mode is explicitly set to AUTO on every request
Our response parser correctly handles multiple functionCall parts (unique IDs, thought_signature passthrough)
The behavior is consistent across hundreds of turns — zero parallel calls observed
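
One way to measure the last point is to tally how many functionCall parts each logged response contains. This is a sketch under the same assumption about the REST-style response shape; the logged responses below are made-up placeholders, not our real data.

```python
from collections import Counter
from typing import Any


def count_function_calls(response: dict[str, Any]) -> int:
    """Number of functionCall parts in the first candidate (0 if none)."""
    candidates = response.get("candidates") or []
    if not candidates:
        return 0
    parts = candidates[0].get("content", {}).get("parts", [])
    return sum(1 for part in parts if "functionCall" in part)


def calls_per_response(responses: list[dict[str, Any]]) -> Counter:
    """Histogram mapping 'calls in one response' to 'number of responses'."""
    return Counter(count_function_calls(r) for r in responses)


def _response(*parts: dict[str, Any]) -> dict[str, Any]:
    """Build a placeholder response with the given parts."""
    return {"candidates": [{"content": {"role": "model", "parts": list(parts)}}]}


# Hypothetical log: two single-call responses and one plain-text response.
logged = [
    _response({"functionCall": {"name": "read_file", "args": {"path": "a.txt"}}}),
    _response({"functionCall": {"name": "save_note", "args": {"text": "hi"}}}),
    _response({"text": "All done."}),
]

histogram = calls_per_response(logged)
parallel_responses = sum(count for n, count in histogram.items() if n > 1)
print(dict(histogram), parallel_responses)
```

A histogram with no key above 1 corresponds to what we observe: no response ever carries more than one call.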

Questions:

Is the thought_signature bug causing the model to avoid parallel function calls? If so, is there a timeline for a fix?
Is there anything else in the request format that encourages batching? System instruction hints, toolConfig options we’re missing?
Do the Pro models or Gemini 2.5 series parallelize more aggressively, or is this a general limitation?

The cost impact is significant — 3-4x more input tokens than necessary on every tool-using turn. Any guidance would be
huge.

Thanks!
Max

