Live API latency spikes

OverFitter · October 3, 2025, 7:14am

Using the Live API with gemini-live-2.5-flash-preview on a stream of audio/pcm;rate=8000 audio chunks with streaming responses, the latency sometimes spikes: time to first token rises to 7-15 seconds (measured from the end of the audio stream). Narrowing the problem down, most of the latency comes from transcription (server_content.input_transcription), which took up to 30 seconds during testing (measured from the beginning of the audio stream).

Here is the config we are using:
from google.genai import types

config = types.LiveConnectConfig(
    realtime_input_config=types.RealtimeInputConfig(
        automatic_activity_detection=types.AutomaticActivityDetection(
            start_of_speech_sensitivity=types.StartSensitivity.START_SENSITIVITY_HIGH,
            end_of_speech_sensitivity=types.EndSensitivity.END_SENSITIVITY_LOW,
            silence_duration_ms=int(silence_duration * 1000),
        ),
        turn_coverage=types.TurnCoverage.TURN_COVERAGE_UNSPECIFIED,
    ),
    response_modalities=["TEXT"],
    system_instruction=types.Content(
        parts=[types.Part.from_text(text=f"{preprompt}\n{system_instruction}")],
        role="user",
    ),
    media_resolution="MEDIA_RESOLUTION_MEDIUM",
    input_audio_transcription=dict(),
    speech_config=types.SpeechConfig(
        language_code=language_iso_code,
        voice_config=types.VoiceConfig(prebuilt_voice_config=types.PrebuiltVoiceConfig(voice_name="Puck")),
    ),
    context_window_compression=types.ContextWindowCompressionConfig(
        trigger_tokens=25600,
        sliding_window=types.SlidingWindow(target_tokens=12800),
    ),
)
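
For anyone trying to reproduce the two measurements described above (time to first token from the end of the audio stream, and time to transcription from its beginning), here is a minimal timing sketch. TurnTimer and the event names are hypothetical helpers, not part of the SDK; where the marks are placed in the send/receive loops is an assumption about the client code.

```python
import time
from typing import Callable, Dict, Optional


class TurnTimer:
    """Records the first timestamp seen for each named event in a turn."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._marks: Dict[str, float] = {}

    def mark(self, event: str) -> None:
        # Only the first occurrence counts, so "first_token" is never overwritten.
        if event not in self._marks:
            self._marks[event] = self._clock()

    def elapsed(self, start: str, end: str) -> Optional[float]:
        # Returns seconds between two events, or None if either is missing.
        if start not in self._marks or end not in self._marks:
            return None
        return self._marks[end] - self._marks[start]
```

A possible placement: call mark("audio_start") before sending the first chunk, mark("audio_end") after the last one, mark("first_transcription") when server_content.input_transcription first arrives, and mark("first_token") on the first model text. Then elapsed("audio_end", "first_token") and elapsed("audio_start", "first_transcription") correspond to the two numbers quoted in the post.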

Mahesh_Sutar · November 25, 2025, 8:46am

Hello, and welcome to the forum!

The random 15–30 s latency spikes are most likely caused by using END_SENSITIVITY_LOW on 8 kHz audio. The model interprets background line static as "whispering", which keeps the turn open until the hard server timeout (~30 s).

The Solution:

Set end-of-speech sensitivity to HIGH: end_of_speech_sensitivity=types.EndSensitivity.END_SENSITIVITY_HIGH. This forces the model to ignore the static.

Increase Silence Duration: Set silence_duration_ms=1000. Since sensitivity is high, this 1-second buffer ensures natural pauses aren’t cut off.

Remove media_resolution: This parameter is for video/images only and should be removed from audio-only configs to keep things clean.
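
The three changes above could be combined roughly as follows. This is only a sketch: build_live_config is a hypothetical helper, it assumes the SDK accepts a plain dict in place of types.LiveConnectConfig and string names for the enum values, and it leaves out the system instruction, speech config and context compression from the original post for brevity.

```python
from typing import Any, Dict


def build_live_config(silence_duration_s: float = 1.0) -> Dict[str, Any]:
    """Builds a Live API config dict with the suggested VAD settings."""
    return {
        "response_modalities": ["TEXT"],
        "realtime_input_config": {
            "automatic_activity_detection": {
                "start_of_speech_sensitivity": "START_SENSITIVITY_HIGH",
                # Changed from END_SENSITIVITY_LOW, per the suggestion above.
                "end_of_speech_sensitivity": "END_SENSITIVITY_HIGH",
                # 1-second buffer so natural pauses are not cut off.
                "silence_duration_ms": int(silence_duration_s * 1000),
            },
        },
        # Empty dict enables input transcription with default settings.
        "input_audio_transcription": {},
        # media_resolution is intentionally omitted for audio-only input.
    }
```

Whether this removes the spikes on a given line would still need to be confirmed by measuring latency before and after the change.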

Thanks
