Handling user interruptions with the gemini-live-2.5-flash Vertex AI model

I am developing a voice bot using gemini-live-2.5-flash, built with the Google ADK.
For VAD I am using the default Google settings, with start sensitivity changed to high.
In the model response I receive an interruption event, but the model keeps sending content events after that.

Expected output: the model should stop sending content events after the user interrupts with new input.

My current runner configuration is:

voice_config = VoiceConfig(
    prebuilt_voice_config=PrebuiltVoiceConfigDict(
        # voice_name='Sulafat',
        voice_name='Zephyr',
    )
)

speech_config = SpeechConfig(voice_config=voice_config)

realtime_input_config = RealtimeInputConfig(
    activity_handling=ActivityHandling.START_OF_ACTIVITY_INTERRUPTS,
    automatic_activity_detection=AutomaticActivityDetection(
        disabled=False,
        start_of_speech_sensitivity=StartSensitivity.START_SENSITIVITY_HIGH,
        end_of_speech_sensitivity=EndSensitivity.END_SENSITIVITY_HIGH,
    ),
)

proactive_config = ProactivityConfig(proactive_audio=True)
audio_transcription_config = AudioTranscriptionConfig()
run_config = RunConfig(
    response_modalities=["AUDIO"],
    speech_config=speech_config,
    streaming_mode='bidi',
    proactivity=proactive_config,
    enable_affective_dialog=True,
    realtime_input_config=realtime_input_config,
    # input_audio_transcription=audio_transcription_config,
    # output_audio_transcription=audio_transcription_config,
)


live_request_queue = LiveRequestQueue()
live_events = runner.run_live(
    session=session,
    live_request_queue=live_request_queue,
    run_config=run_config,
)
Am I missing something?


Hi @hitish_singla, welcome to the community!
Could you please share the detailed event logs from when the issue occurs?
Could you also explain the client-side code that processes the live_events?
Thank you

Here is how I handle the live events:

async for event in session.live_events:

    if event.turn_complete or event.interrupted:
        flush_msg = {
            "event": "media",
            "streamSid": session.stream_sid,
            "media": {"payload": ""}
        }
        await websocket.send_text(json.dumps(flush_msg))

        mark_message = {
            "event": "mark",
            "streamSid": session.stream_sid,
            "mark": {
                "name": "agent_turn_complete"
            }
        }
        await websocket.send_text(json.dumps(mark_message))
        print(f"[{session.stream_sid}][AGENT -> TWILIO]: Sent mark event.")
        continue

    part: Part = (event.content and event.content.parts and event.content.parts[0])

    if not part:
        continue

    is_audio = part.inline_data and part.inline_data.mime_type.startswith("audio/pcm")
    # print(f"[{session.stream_sid}][AGENT -> APP]: Part is audio: {is_audio}")

I am getting interruption events, so I am flushing the current message.

But after the interruption event I keep getting content events; the model does not stop sending them.

Any solutions for this problem? Can you reproduce the issue?

I apologize for the delayed response and for missing your question. I was unable to reproduce the issue on my end.

Based on the code you shared, I believe we need to improve the event handling logic. It would be beneficial to implement a mechanism that can suppress additional content events and prioritize interruption events.

I can provide the logic for these changes if you’d like

Thanks for the response. It would be great if you could provide the logic/changes.

Can you try this logic once and let me know if it works?

Add interruption state tracking (before your event loop)

is_interrupted = False

async for event in session.live_events:

    # PRIORITY 1: Handle interruptions FIRST and IMMEDIATELY
    if event.interrupted:
        is_interrupted = True  # Set flag to suppress content
        print(f"[{session.stream_sid}][INTERRUPTION]: User interrupted - suppressing content")

        # Immediate flush to stop current audio
        flush_msg = {
            "event": "media",
            "streamSid": session.stream_sid,
            "media": {"payload": ""}
        }
        await websocket.send_text(json.dumps(flush_msg))

        mark_message = {
            "event": "mark",
            "streamSid": session.stream_sid,
            "mark": {"name": "agent_interrupted"}
        }
        await websocket.send_text(json.dumps(mark_message))
        print(f"[{session.stream_sid}][AGENT -> TWILIO]: Sent interruption mark.")
        continue  # Skip to next event immediately

    # PRIORITY 2: Handle turn complete (reset interruption state)
    if event.turn_complete:
        is_interrupted = False  # Reset flag to allow new content
        print(f"[{session.stream_sid}][TURN_COMPLETE]: Resetting interruption state")

        flush_msg = {
            "event": "media",
            "streamSid": session.stream_sid,
            "media": {"payload": ""}
        }
        await websocket.send_text(json.dumps(flush_msg))

        mark_message = {
            "event": "mark",
            "streamSid": session.stream_sid,
            "mark": {"name": "agent_turn_complete"}
        }
        await websocket.send_text(json.dumps(mark_message))
        print(f"[{session.stream_sid}][AGENT -> TWILIO]: Sent mark event.")
        continue

    # PRIORITY 3: Suppress content events if interrupted
    if is_interrupted:
        print(f"[{session.stream_sid}][SUPPRESSED]: Content event discarded due to interruption")
        continue  # Skip all content processing when interrupted

    # PRIORITY 4: Process content only if NOT interrupted
    part: Part = (event.content and event.content.parts and event.content.parts[0])

    if not part:
        continue

    is_audio = part.inline_data and part.inline_data.mime_type.startswith("audio/pcm")

@Pannaga_J Thanks for the logic, but with this there will be an awkward silence between two responses from the model.
For example, if the model is providing a list of 10-15 items and the user selects the second item,
the model will only respond to the product choice after it finishes generating events for all 10-15 products, so the user will feel the model is not responding.

Shouldn't the model stop generating the next events?

@hitish_singla The mistake was suppressing all content until the model finished generating its complete planned response.

The fix is to handle the interruption, send the flush, and then immediately reset the flag, rather than keeping it active until the model internally completes its full generation.
This way the system stops the current audio but doesn't block new conversation flow.
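A minimal pure-Python sketch of that adjustment, with a stand-in event class replacing the ADK event and the websocket sends reduced to comments, so only the flag handling is visible:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeEvent:
    # Stand-in for the ADK live event; only the fields used here.
    interrupted: bool = False
    turn_complete: bool = False
    content: Optional[str] = None


def handle_events(events):
    """Forward content chunks, flushing on interrupt without latching a flag.

    Unlike the earlier version, no is_interrupted flag survives past the
    interrupt event itself, so content from the model's next turn is
    forwarded immediately instead of being suppressed until turn_complete.
    """
    forwarded = []
    for event in events:
        if event.interrupted:
            # flush current audio here (websocket media + mark in the real code),
            # then fall through with state already reset -- nothing is latched
            continue
        if event.turn_complete:
            # send the turn-complete mark here
            continue
        if event.content is not None:
            forwarded.append(event.content)
    return forwarded
```

With this shape, an interrupt only triggers the flush; the chunk arriving right after it is forwarded again, which is what keeps the conversation from going silent.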

Yes, I understand the fix will work for most cases, but I am still worried about the case where the model response is very long and an interruption happens.

I was also trying to implement Silero VAD for turn detection.

I am sending the start event like this:

start_activity_event = LiveClientRealtimeInput(activity_start=ActivityStart())
live_request_queue.send_realtime(start_activity_event)
session.user_activity_started = True
print(f"[{stream_sid}][VAD]: Sent start_activity to agent")

but it is not working.
I am unable to find the correct method to send start and end events to the agent. Can you please guide me on how to send these events?
I am using the Google ADK to create live agents.
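One likely issue with the snippet above is that `send_realtime()` expects an audio `Blob`, not a `LiveClientRealtimeInput`; recent ADK versions appear to expose dedicated `send_activity_start()` / `send_activity_end()` methods on `LiveRequestQueue` for this (assumed method names, so verify against your google-adk version), and manual signals generally also require automatic detection to be disabled (`disabled=True` in `AutomaticActivityDetection`). The edge-detection side can be kept separate and tested in plain Python; the `ActivityEdgeDetector` below is a hypothetical helper, not part of the ADK:

```python
class ActivityEdgeDetector:
    """Turn per-frame Silero VAD speech probabilities into start/end edges.

    On a "start" edge the bot would signal the agent, presumably via
    live_request_queue.send_activity_start(), and on an "end" edge via
    live_request_queue.send_activity_end() -- assumed ADK method names;
    check your installed google-adk version before relying on them.
    """

    def __init__(self, threshold: float = 0.5):
        self.threshold = threshold
        self.in_speech = False

    def feed(self, speech_prob: float):
        """Return 'start', 'end', or None for one VAD frame."""
        if not self.in_speech and speech_prob >= self.threshold:
            self.in_speech = True
            return "start"  # -> send_activity_start() in the real bot
        if self.in_speech and speech_prob < self.threshold:
            self.in_speech = False
            return "end"    # -> send_activity_end() in the real bot
        return None
```

Feeding it one probability per audio frame yields an edge only on the transitions, so the agent is signaled once per user turn rather than on every frame.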

Can you check this out once?

There is an explanation there of how to call the start and end events.