@Logan_Kilpatrick @Mustan_lokhand
Hi team,
I’m seeing consistently high response latency with the Gemini Live API in a production-like voice-simulation setup.
Use case
We use Gemini as a simulated user/caller to test a target voice agent (the target agent speaks first).
The flow is: Twilio Media Stream → Server → Gemini Live WS → Server → UI.
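For context, Twilio Media Streams deliver 8 kHz, 8-bit μ-law mono audio base64-encoded in media events, so the server up-converts to 16-bit / 16 kHz mono before forwarding. A simplified sketch of that conversion (it uses the stdlib audioop module, which is deprecated and removed in Python 3.13, so treat it as illustrative only):

```python
import audioop  # stdlib; deprecated since 3.11, removed in 3.13 - illustrative only
import base64
import json


class TwilioToGeminiAudio:
    """Converts Twilio Media Stream frames (8 kHz mu-law) to 16-bit, 16 kHz, mono PCM."""

    def __init__(self) -> None:
        self._ratecv_state = None  # resampler state carried across frames

    def convert(self, twilio_message: str) -> bytes | None:
        event = json.loads(twilio_message)
        if event.get("event") != "media":
            return None  # ignore start/mark/stop events
        mulaw = base64.b64decode(event["media"]["payload"])
        pcm_8k = audioop.ulaw2lin(mulaw, 2)  # mu-law -> 16-bit linear PCM, still 8 kHz
        pcm_16k, self._ratecv_state = audioop.ratecv(
            pcm_8k, 2, 1, 8000, 16000, self._ratecv_state
        )
        return pcm_16k  # ready to send as audio/pcm;rate=16000
```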
Model
gemini-2.5-flash-native-audio-preview-12-2025
What we are already doing (per docs / best practices)
- Input audio sent as 16-bit PCM, 16 kHz, mono (audio/pcm;rate=16000)
- Output audio handled at 24 kHz
- Audio sent in 20 ms chunks (within the recommended 20–40 ms)
- Long-lived WebSocket session (no reconnect per turn)
- Ordered ingest / serialized dispatch (no parallel frame processing)
- Minimal client buffering (small startup buffer only)
- Local/manual VAD with explicit activityStart / activityEnd (see the sketch after this list)
- Manual silence threshold of around 510 ms
- We log:
  - activityEnd requested
  - activityEnd sent
  - queueDrainMs
  - geminiProcessingMs
  - first response audio timestamp
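Simplified sketch of how a single simulated-caller turn is framed with manual VAD. It is shown with the google-genai Python SDK for brevity (the raw Live WebSocket messages carry the same fields); pcm_chunks is a placeholder for the drained 20 ms frames:

```python
from google import genai
from google.genai import types

MODEL = "gemini-2.5-flash-native-audio-preview-12-2025"

client = genai.Client()  # API key read from the environment

config = types.LiveConnectConfig(
    response_modalities=["AUDIO"],
    # Automatic VAD is disabled because turns are driven by our own local VAD.
    realtime_input_config=types.RealtimeInputConfig(
        automatic_activity_detection=types.AutomaticActivityDetection(disabled=True)
    ),
)


async def send_turn(session, pcm_chunks):
    """Frame one caller turn: activityStart, 20 ms PCM chunks, activityEnd."""
    await session.send_realtime_input(activity_start=types.ActivityStart())
    for chunk in pcm_chunks:  # each chunk = 20 ms of 16-bit, 16 kHz, mono PCM (640 bytes)
        await session.send_realtime_input(
            audio=types.Blob(data=chunk, mime_type="audio/pcm;rate=16000")
        )
    await session.send_realtime_input(activity_end=types.ActivityEnd())


async def run_turn(pcm_chunks) -> bytes:
    """Send one turn and collect the model's 24 kHz PCM reply."""
    async with client.aio.live.connect(model=MODEL, config=config) as session:
        await send_turn(session, pcm_chunks)
        audio_out = bytearray()
        async for message in session.receive():
            sc = message.server_content
            if sc and sc.model_turn:
                for part in sc.model_turn.parts or []:
                    if part.inline_data:
                        audio_out.extend(part.inline_data.data)  # 24 kHz PCM response audio
            if sc and sc.turn_complete:
                break
        return bytes(audio_out)
```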
Observed behavior
- End-to-end perceived latency is often 5–6 s, sometimes higher on the first turn
- In many turns the local queue drain is now low (often ~100–300 ms), but geminiProcessingMs is often ~2.2–3.8 s and sometimes ~5.4 s (how these are measured is sketched below)
- Example first turn from logs:
  - queuedChunksAtActivityEnd: 83
  - queueDrainMs: 1654ms
  - geminiProcessingMs: 2245ms
- Another run showed geminiProcessingMs around 5423 ms even with a very low queue drain
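For clarity on what these numbers mean, the two metrics are computed roughly like this (a simplified sketch of the instrumentation; measure_turn is an illustrative helper and the SDK calls mirror the earlier sketch):

```python
import time

from google.genai import types


async def measure_turn(session, queued_chunks):
    """Returns (queue_drain_ms, gemini_processing_ms) for one simulated-caller turn."""
    t_end_requested = time.monotonic()  # local VAD decided the caller stopped speaking

    # queueDrainMs = time to flush the 20 ms frames still queued locally
    # before activityEnd can go out.
    for chunk in queued_chunks:
        await session.send_realtime_input(
            audio=types.Blob(data=chunk, mime_type="audio/pcm;rate=16000")
        )
    await session.send_realtime_input(activity_end=types.ActivityEnd())
    t_end_sent = time.monotonic()

    # geminiProcessingMs = activityEnd sent -> first response audio received.
    t_first_audio = None
    async for message in session.receive():
        sc = message.server_content
        if sc and sc.model_turn and any(p.inline_data for p in sc.model_turn.parts or []):
            t_first_audio = time.monotonic()
            break

    queue_drain_ms = (t_end_sent - t_end_requested) * 1000
    gemini_processing_ms = (t_first_audio - t_end_sent) * 1000 if t_first_audio else None
    return queue_drain_ms, gemini_processing_ms
```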
Questions
- Is this latency profile expected for this model in a speech-to-speech proxy architecture?
- Are there known server-side factors (region, context growth, session duration, model settings) that cause 5 s+ spikes?
- Is there any recommended tuning for manual VAD + telephony streams beyond what we already do?