Scheduling: "SILENT" in NON_BLOCKING function response not preventing duplicate audio generation

Hi,

I’m building a voice assistant with the Gemini Live API (BidiGenerateContent), and I’m running into an issue with NON_BLOCKING functions and the scheduling: “SILENT” parameter.

Setup:

  • Model: gemini-2.5-flash-native-audio-preview-12-2015

  • Function definition with “behavior”: “NON_BLOCKING”:

    {
      "name": "play_animation",
      "description": "Play a robot animation",
      "behavior": "NON_BLOCKING",
      "parameters": { … }
    }
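For reference, this is roughly how the tool declaration goes into the BidiGenerateContent setup message. A minimal Python sketch of the wire payload (field names follow the Live API's camelCase JSON; `build_setup_message` is our own helper and the `parameters` schema here is a placeholder, not our real one):

```python
def build_setup_message(model: str) -> dict:
    """Build the Live API 'setup' message declaring play_animation as NON_BLOCKING."""
    return {
        "setup": {
            "model": model,
            "tools": [{
                "functionDeclarations": [{
                    "name": "play_animation",
                    "description": "Play a robot animation",
                    # NON_BLOCKING lets the model keep generating while the tool runs
                    "behavior": "NON_BLOCKING",
                    # Placeholder schema; the real definition takes more fields
                    "parameters": {
                        "type": "OBJECT",
                        "properties": {"animation": {"type": "STRING"}},
                    },
                }]
            }],
        }
    }
```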

Scenario:

  1. User says: “Hello”

  2. Model responds with “Hello! What can I help you with today?” and simultaneously triggers a play_animation function call (to wave hello)

  3. We send the function response with scheduling: “SILENT”:

    {
      "toolResponse": {
        "functionResponses": [{
          "id": "...",
          "name": "play_animation",
          "response": {
            "result": {"status": "started"},
            "scheduling": "SILENT"
          }
        }]
      }
    }

  4. Model speaks “Hello! What can I help you with today?” again
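For clarity, the toolResponse from step 3 can be expressed as a small Python sketch of the same wire payload (`build_silent_tool_response` is our own helper name; per the Live API docs, the scheduling hint sits inside the response object of the FunctionResponse):

```python
def build_silent_tool_response(call_id: str, name: str, result: dict) -> dict:
    """Build a toolResponse that asks the model not to produce follow-up output."""
    return {
        "toolResponse": {
            "functionResponses": [{
                "id": call_id,       # must echo the id from the incoming toolCall
                "name": name,
                "response": {
                    "result": result,
                    # SILENT: acknowledge the result without generating new audio
                    "scheduling": "SILENT",
                },
            }]
        }
    }
```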

Result:

The same response is spoken twice, and both audio streams arrive within the same turnComplete. The transcript shows:

“Hello! What can I help you with today?Hello! What can I help you with today?”

Timeline from logs:

13:20:31.153 - Audio chunks start (“Hello! What can I help you with today?”)

13:20:31.553 - toolCall received (play_animation)

13:20:31.573 - Sent toolResponse with scheduling: “SILENT”

          [~1.5 second gap]

13:20:33.069 - More audio chunks arrive (same message repeated!)

13:20:34.795 - turnComplete received

Expected Behavior:

With scheduling: “SILENT”, the model should silently acknowledge the function result without generating any follow-up audio. The first “Hello! What can I help you with today?” should be the only response.

Actual Behavior:

The model generates the same audio response twice, suggesting it either:

  1. Ignores scheduling: “SILENT” for NON_BLOCKING functions

  2. Has already queued/generated the second response before receiving our SILENT response

Question:

Is this expected behavior? How can I prevent the model from generating duplicate audio after a NON_BLOCKING function call when the initial response already contains the intended message?

Environment:

  • WebSocket API: wss://generativelanguage.googleapis.com/ws/google.ai.generativelanguage.v1alpha.GenerativeService.BidiGenerateContent

  • Platform: Android (Kotlin)

Thank you for any guidance!

I ran another test in which I skipped sending the tool response for play_animation entirely, to check whether our tool response was what triggered the duplicate audio.

Result: The duplicate audio was still generated, even without any tool response being sent.

Log evidence:

13:58:01.117 Google: Tool call - play_animation

13:58:01.142 [Our app] Skipping tool response (no sendGoogleToolResult called)

13:58:02.751 Google: Audio chunk received, 46080 bytes ← Second response starts

13:58:05.766 Google output: “Hello! It’s great to meet you. How can I help you today?Hello! It’s great to meet you. How can I help you today?”

Conclusion: This confirms the duplicate audio is generated automatically by the Live API for NON_BLOCKING functions, regardless of whether:

  1. We send a tool response with scheduling: “SILENT”

  2. We send a tool response without scheduling

  3. We don’t send any tool response at all

The model appears to automatically generate a follow-up response in parallel when executing a NON_BLOCKING function, and there seems to be no way to prevent this from the client side.

I had the same issue.

Hi @Ei_Kyaw, @Jean-Marc_Gourier ,

Welcome to the community! Apologies for the late response.

I can see you have attached a log, but to understand the issue better, could you please share the code snippet (the function definition part) and the exact prompt you sent to the model? I can see you sent ‘Hello’, but I wanted to confirm whether this happens only with a specific prompt!

Thank you!

You can find the function definitions in my public GitHub repository: GitHub - studerus/pepper-android-realtime-chat: an open-source Android framework for low-latency, LLM-driven multimodal interaction on Pepper. It uses end-to-end speech-to-speech models and extensive function calling for agentic robot control (navigation, gaze, vision, touch), and it also runs on regular Android devices.

It happens regardless of the prompt at the beginning of the conversation.

I have exactly the same error.