Hi,
I’m building a voice assistant using the Gemini Live API (BidiGenerateContent) and encountering an issue with NON_BLOCKING functions and the scheduling: “SILENT” parameter.
Setup:
- Model: gemini-2.5-flash-native-audio-preview-12-2015
- Function definition with “behavior”: “NON_BLOCKING”:
{
  "name": "play_animation",
  "description": "Play a robot animation",
  "behavior": "NON_BLOCKING",
  "parameters": { … }
}
Scenario:
- User says: “Hello”
- Model responds with “Hello! What can I help you with today?” and simultaneously triggers a play_animation function call (to wave hello)
- We send the function response with scheduling: “SILENT”:
{
  "toolResponse": {
    "functionResponses": [{
      "id": "...",
      "name": "play_animation",
      "response": { "result": {"status": "started"}, "scheduling": "SILENT" }
    }]
  }
}
- Model speaks “Hello! What can I help you with today?” again
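For reference, a minimal Kotlin sketch of how a toolResponse message like the one in the scenario above could be assembled before it is written to the WebSocket. The helper names (`jsonString`, `buildSilentToolResponse`) and the placeholder call id are hypothetical and not taken from any SDK; the field layout simply mirrors the JSON shown above, and the escaping only covers backslashes and double quotes.

```kotlin
// Hypothetical helper: quotes a string for JSON, escaping only backslashes and
// double quotes (enough for simple ids and names, not for arbitrary text).
fun jsonString(s: String): String =
    "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\""

// Hypothetical helper: builds the toolResponse message as a JSON string.
// "scheduling" is placed inside "response", as in the message shown above.
fun buildSilentToolResponse(callId: String, functionName: String, status: String): String {
    val id = jsonString(callId)
    val name = jsonString(functionName)
    val st = jsonString(status)
    return """{"toolResponse":{"functionResponses":[{"id":$id,"name":$name,"response":{"result":{"status":$st},"scheduling":"SILENT"}}]}}"""
}

fun main() {
    // "call-1" is a placeholder; the real id comes from the received toolCall.
    println(buildSilentToolResponse("call-1", "play_animation", "started"))
}
```

In a real client the id would be copied from the toolCall that triggered the response, and a JSON library would normally replace the hand-written escaping.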
Result:
The same response is spoken twice, and both audio streams arrive within the same turn, before a single turnComplete. The transcript shows:
“Hello! What can I help you with today?Hello! What can I help you with today?”
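To flag this pattern in logs, a check along the following lines could be run on each turn's transcript. This is only a diagnostic sketch; `isExactlyDoubled` is a hypothetical helper, and it assumes the repetition is an exact, back-to-back copy with no separator, as in the transcript above.

```kotlin
// Hypothetical diagnostic helper, not part of any SDK: returns true when the
// transcript consists of the same text repeated exactly twice, back to back.
fun isExactlyDoubled(transcript: String): Boolean {
    if (transcript.isEmpty() || transcript.length % 2 != 0) return false
    val half = transcript.length / 2
    return transcript.substring(0, half) == transcript.substring(half)
}

fun main() {
    val once = "Hello! What can I help you with today?"
    println(isExactlyDoubled(once + once)) // true
    println(isExactlyDoubled(once))        // false
}
```

A repeat with slightly different wording or spacing would not be caught by this exact comparison.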
Timeline from logs:
13:20:31.153 - Audio chunks start (“Hello! What can I help you with today?”)
13:20:31.553 - toolCall received (play_animation)
13:20:31.573 - Sent toolResponse with scheduling: “SILENT”
[~1.5 second gap]
13:20:33.069 - More audio chunks arrive (same message repeated!)
13:20:34.795 - turnComplete received
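The gaps in this timeline can be worked out directly from the logged timestamps. The sketch below assumes the log times are plain HH:mm:ss.SSS values on the same day; `gapMillis` is a hypothetical helper built on `java.time` (on Android this needs API 26+ or core library desugaring).

```kotlin
import java.time.Duration
import java.time.LocalTime

// Hypothetical helper: milliseconds between two HH:mm:ss.SSS log timestamps.
fun gapMillis(from: String, to: String): Long =
    Duration.between(LocalTime.parse(from), LocalTime.parse(to)).toMillis()

fun main() {
    // First audio chunk -> toolCall received
    println(gapMillis("13:20:31.153", "13:20:31.553")) // 400
    // toolCall received -> toolResponse sent
    println(gapMillis("13:20:31.553", "13:20:31.573")) // 20
    // toolResponse sent -> repeated audio starts
    println(gapMillis("13:20:31.573", "13:20:33.069")) // 1496
    // Repeated audio starts -> turnComplete
    println(gapMillis("13:20:33.069", "13:20:34.795")) // 1726
}
```

So the SILENT response went out 20 ms after the toolCall, and the repeated audio began roughly 1.5 seconds after that.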
Expected Behavior:
With scheduling: “SILENT”, the model should silently acknowledge the function result without generating any follow-up audio. The first “Hello! What can I help you with today?” should be the only response.
Actual Behavior:
The model generates the same audio response twice, suggesting it either:
- Ignores scheduling: “SILENT” for NON_BLOCKING functions
- Has already queued/generated the second response before receiving our SILENT response
Question:
Is this expected behavior? How can I prevent the model from generating duplicate audio after a NON_BLOCKING function call when the initial response already contains the intended message?
Environment:
- WebSocket API: wss://generativelanguage.googleapis.com/ws/google.ai.generativelanguage.v1alpha.GenerativeService.BidiGenerateContent
- Platform: Android (Kotlin)
Thank you for any guidance!