Hello,
I am building a voice assistant using the new google-genai Python SDK and the Multimodal Live API (via v1beta). I am streaming real-time audio in and out.
I have noticed a recurring bug: sometimes, when the model decides to use a configured tool (Function Calling), it triggers the exact same `tool_call` twice (or more) in rapid succession during the same turn.
Here is the standard way I am receiving the responses:
```python
async for response in turn:
    # 1. Handle Audio
    # 2. Handle Text
    # …
    # 3. Handle Tool Calls
    if tool_call := getattr(response, "tool_call", None):
        await self._handle_tool_call(tool_call)
```
Inside my `_handle_tool_call` method, I receive the identical function name and arguments twice, anywhere from a few milliseconds to a few seconds apart. This leads to duplicate executions (e.g., executing a smart home command twice or creating duplicate calendar events).
Right now my only workaround is a fairly complex manual debounce filter on the client side: it caches the function name and arguments and ignores identical requests that occur within ~3 seconds of each other.
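For reference, the debounce workaround can be sketched roughly as follows. This is a simplified, self-contained version rather than my exact code: `ToolCallDeduplicator`, `should_execute`, and the injectable `clock` parameter are names made up for illustration, and it assumes the arguments are JSON-serializable (anything else is stringified).

```python
import json
import time


class ToolCallDeduplicator:
    """Suppress identical (name, args) tool calls seen within a short window.

    Hypothetical helper for illustration; not part of the google-genai SDK.
    """

    def __init__(self, window_seconds=3.0, clock=time.monotonic):
        self._window = window_seconds
        self._clock = clock  # injectable so the window can be tested
        self._last_seen = {}  # key -> time the call was last accepted

    def should_execute(self, name, args):
        now = self._clock()
        # Drop expired entries so the cache does not grow without bound.
        self._last_seen = {
            key: seen
            for key, seen in self._last_seen.items()
            if now - seen < self._window
        }
        # Sort keys so that argument order does not affect the comparison.
        key = (name, json.dumps(args, sort_keys=True, default=str))
        if key in self._last_seen:
            return False  # same call already accepted inside the window
        self._last_seen[key] = now
        return True
```

In the receive loop I would call `should_execute(fc.name, fc.args)` for each entry in `tool_call.function_calls` (assuming the SDK exposes the calls that way) and skip the handler when it returns `False`. The obvious downside is that a legitimate repeat of the same command within the window is dropped as well.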
Is this a known issue with the current preview of the Live API? Is there any recommended best practice to avoid duplicate tool triggers on the server side?
Thank you!