Handling user interruptions with the gemini-live-2.5-flash Vertex AI model

I am developing a voice bot using gemini-live-2.5-flash, built with the Google ADK.
For VAD I am using the default Google settings, with start sensitivity changed to high.
In the model response I receive an interruption event, but the model keeps sending content events after that.

Expected output: the model should stop sending content events after the user interrupts with new input.

My current runner configuration is:

voice_config = VoiceConfig(
    prebuilt_voice_config=PrebuiltVoiceConfigDict(
        # voice_name='Sulafat',
        voice_name='Zephyr',
    )
)

speech_config = SpeechConfig(voice_config=voice_config)

realtime_input_config = RealtimeInputConfig(
    activity_handling=ActivityHandling.START_OF_ACTIVITY_INTERRUPTS,
    automatic_activity_detection=AutomaticActivityDetection(
        disabled=False,
        start_of_speech_sensitivity=StartSensitivity.START_SENSITIVITY_HIGH,
        end_of_speech_sensitivity=EndSensitivity.END_SENSITIVITY_HIGH,
    ),
)

proactive_config = ProactivityConfig(proactive_audio=True)
audio_transcription_config = AudioTranscriptionConfig()
run_config = RunConfig(
    response_modalities=["AUDIO"],
    speech_config=speech_config,
    streaming_mode='bidi',
    proactivity=proactive_config,
    enable_affective_dialog=True,
    realtime_input_config=realtime_input_config,
    # input_audio_transcription=audio_transcription_config,
    # output_audio_transcription=audio_transcription_config,
)


live_request_queue = LiveRequestQueue()
live_events = runner.run_live(
    session=session,
    live_request_queue=live_request_queue,
    run_config=run_config,
)
Am I missing something?


Hi @hitish_singla, welcome to the community!
Could you please share the detailed event logs from when the issue occurs?
Could you also explain the client-side code that processes the live_events?
Thank you

Here is how I handle the live events:

async for event in session.live_events:

    if event.turn_complete or event.interrupted:
        flush_msg = {
            "event": "media",
            "streamSid": session.stream_sid,
            "media": {"payload": ""}
        }
        await websocket.send_text(json.dumps(flush_msg))

        mark_message = {
            "event": "mark",
            "streamSid": session.stream_sid,
            "mark": {
                "name": "agent_turn_complete"
            }
        }
        await websocket.send_text(json.dumps(mark_message))
        print(f"[{session.stream_sid}][AGENT -> TWILIO]: Sent mark event.")
        continue

    part: Part = (event.content and event.content.parts and event.content.parts[0])

    if not part:
        continue

    is_audio = part.inline_data and part.inline_data.mime_type.startswith("audio/pcm")
    # print(f"[{session.stream_sid}][AGENT -> APP]: Part is audio: {is_audio}")

I am getting interruption events, so I am flushing the current message.

But after the interruption event I keep getting content events; the model does not stop sending them.

Any solutions for this problem? Can you reproduce the issue?

I apologize for the delayed response and for missing your question. I was unable to reproduce the issue on my end.

Based on the code you shared, I believe we need to improve the event handling logic. It would be beneficial to implement a mechanism that can suppress additional content events and prioritize interruption events.

I can provide the logic for these changes if you’d like

Thanks for the response. It would be great if you could provide the logic/changes.

Can you try this logic once and let me know if it works?

Add interruption state tracking (before your event loop)

is_interrupted = False

async for event in session.live_events:

    # PRIORITY 1: Handle interruptions FIRST and IMMEDIATELY
    if event.interrupted:
        is_interrupted = True  # Set flag to suppress content
        print(f"[{session.stream_sid}][INTERRUPTION]: User interrupted - suppressing content")

        # Immediate flush to stop current audio
        flush_msg = {
            "event": "media",
            "streamSid": session.stream_sid,
            "media": {"payload": ""}
        }
        await websocket.send_text(json.dumps(flush_msg))

        mark_message = {
            "event": "mark",
            "streamSid": session.stream_sid,
            "mark": {"name": "agent_interrupted"}
        }
        await websocket.send_text(json.dumps(mark_message))
        print(f"[{session.stream_sid}][AGENT -> TWILIO]: Sent interruption mark.")
        continue  # Skip to next event immediately

    # PRIORITY 2: Handle turn complete (reset interruption state)
    if event.turn_complete:
        is_interrupted = False  # Reset flag to allow new content
        print(f"[{session.stream_sid}][TURN_COMPLETE]: Resetting interruption state")

        flush_msg = {
            "event": "media",
            "streamSid": session.stream_sid,
            "media": {"payload": ""}
        }
        await websocket.send_text(json.dumps(flush_msg))

        mark_message = {
            "event": "mark",
            "streamSid": session.stream_sid,
            "mark": {"name": "agent_turn_complete"}
        }
        await websocket.send_text(json.dumps(mark_message))
        print(f"[{session.stream_sid}][AGENT -> TWILIO]: Sent mark event.")
        continue

    # PRIORITY 3: Suppress content events if interrupted
    if is_interrupted:
        print(f"[{session.stream_sid}][SUPPRESSED]: Content event discarded due to interruption")
        continue  # Skip all content processing when interrupted

    # PRIORITY 4: Process content only if NOT interrupted
    part: Part = (event.content and event.content.parts and event.content.parts[0])

    if not part:
        continue

    is_audio = part.inline_data and part.inline_data.mime_type.startswith("audio/pcm")

@Pannaga_J Thanks for the logic, but with this there will be an awkward silence between two responses from the model.
For example, if the model is providing a list of 10-15 items and the user selects the second item,
the model will only respond to the product choice after it finishes generating events for all 10-15 products, so the user will feel the model is not responding.

Shouldn't the model stop generating the next events?

@hitish_singla The mistake was suppressing all content until the model finished generating its complete planned response.

The fix is to handle the interruption, send the flush, and then immediately reset the flag, rather than keeping it active until the model internally completes its full generation.
This way the system stops the current audio but doesn't block new conversation flow.
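A minimal pure-Python sketch of that adjustment, with a stand-in event class replacing the ADK event and the websocket sends reduced to comments, so only the flag handling is visible:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeEvent:
    # Stand-in for the ADK live event; only the fields used here.
    interrupted: bool = False
    turn_complete: bool = False
    content: Optional[str] = None


def handle_events(events):
    """Forward content chunks, flushing on interrupt without latching a flag.

    Unlike the earlier version, no is_interrupted flag survives past the
    interrupt event itself, so content from the model's next turn is
    forwarded immediately instead of being suppressed until turn_complete.
    """
    forwarded = []
    for event in events:
        if event.interrupted:
            # flush current audio here (websocket media + mark in the real code),
            # then fall through with state already reset -- nothing is latched
            continue
        if event.turn_complete:
            # send the turn-complete mark here
            continue
        if event.content is not None:
            forwarded.append(event.content)
    return forwarded
```

With this shape, an interrupt only triggers the flush; the chunk arriving right after it is forwarded again, which is what keeps the conversation from going silent.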

Yes, I understand the fix will work for most cases, but I am still worried about the case where the model response is very long and an interruption happens.

I was also trying to implement Silero VAD for turn detection.

I am sending the start event like this:

start_activity_event = LiveClientRealtimeInput(activity_start=ActivityStart())
live_request_queue.send_realtime(start_activity_event)
session.user_activity_started = True
print(f"[{stream_sid}][VAD]: Sent start_activity to agent")

but it is not working.
I am unable to find the correct method to send start and end events to the agent. Can you please guide me on how to send these events?
I am using the Google ADK to create live agents.
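One likely issue with the snippet above is that `send_realtime()` expects an audio `Blob`, not a `LiveClientRealtimeInput`; recent ADK versions appear to expose dedicated `send_activity_start()` / `send_activity_end()` methods on `LiveRequestQueue` for this (assumed method names, so verify against your google-adk version), and manual signals generally also require automatic detection to be disabled (`disabled=True` in `AutomaticActivityDetection`). The edge-detection side can be kept separate and tested in plain Python; the `ActivityEdgeDetector` below is a hypothetical helper, not part of the ADK:

```python
class ActivityEdgeDetector:
    """Turn per-frame Silero VAD speech probabilities into start/end edges.

    On a "start" edge the bot would signal the agent, presumably via
    live_request_queue.send_activity_start(), and on an "end" edge via
    live_request_queue.send_activity_end() -- assumed ADK method names;
    check your installed google-adk version before relying on them.
    """

    def __init__(self, threshold: float = 0.5):
        self.threshold = threshold
        self.in_speech = False

    def feed(self, speech_prob: float):
        """Return 'start', 'end', or None for one VAD frame."""
        if not self.in_speech and speech_prob >= self.threshold:
            self.in_speech = True
            return "start"  # -> send_activity_start() in the real bot
        if self.in_speech and speech_prob < self.threshold:
            self.in_speech = False
            return "end"    # -> send_activity_end() in the real bot
        return None
```

Feeding it one probability per audio frame yields an edge only on the transitions, so the agent is signaled once per user turn rather than on every frame.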

Can you check this out once?

There is an explanation there of how to call the start and end events.