We’re seeing noticeable end-to-end latency with Gemini Live (real-time voice), even when using gemini-live-2.5-flash-native-audio with thinking disabled. We are using the us-central1 region on Vertex AI. We’d like to share our setup and ask whether this matches others’ experience, or whether there are recommended changes.
Use case
- Real-time voice assistant (OB-GYN clinical assistant): the user speaks, and the model replies with native audio.
- Flow: WebSocket Live API → we send audio, receive audio + transcriptions; we also use function/tool calling during the conversation.
```js
const config = {
  systemInstruction: "<string>", // one system prompt; we use a single text systemInstruction
  tools: [geminiTools],          // function declarations for tool/function calling
  generationConfig: {
    maxOutputTokens: 4096,
    thinkingConfig: {
      thinkingBudget: 0 // thinking disabled
    },
    temperature: 0.5
  },
  responseModalities: [Modality.AUDIO],
  outputAudioTranscription: {},
  realtimeInputConfig: {
    activityHandling: ActivityHandling.NO_INTERRUPTION
  },
  contextWindowCompression: { slidingWindow: {} },
  sessionResumption: resumptionHandle ? { handle: resumptionHandle } : {}
};

ai.live.connect({
  model: 'gemini-live-2.5-flash-native-audio',
  config,
  callbacks: { ... }
});
```
Even on turns with no function calls, we see 10–15 seconds from the end of user speech to the first audio response. Is this level of latency expected for this model with tools attached, or are there settings that usually help reduce the delay?
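For context, here is roughly how we arrive at that number. This is a minimal sketch of a timing helper with hypothetical names (`TurnLatencyTracker` is ours, not part of the SDK): we call `markTurnEnd()` when the last user audio chunk is sent for a turn, and `markFirstResponse()` when the first model audio chunk arrives in the callback.

```typescript
// Hypothetical turn-latency tracker; records wall-clock time between the end
// of user speech and the first model audio chunk, one sample per turn.
class TurnLatencyTracker {
  private turnEndMs: number | null = null;
  private samples: number[] = [];

  // Call when the final user audio chunk for a turn has been sent.
  markTurnEnd(nowMs: number = Date.now()): void {
    this.turnEndMs = nowMs;
  }

  // Call when the first model audio chunk for the reply arrives.
  markFirstResponse(nowMs: number = Date.now()): void {
    if (this.turnEndMs !== null) {
      this.samples.push(nowMs - this.turnEndMs);
      this.turnEndMs = null; // ignore subsequent chunks of the same reply
    }
  }

  // Median response latency over all recorded turns, in milliseconds.
  medianMs(): number {
    const sorted = [...this.samples].sort((a, b) => a - b);
    return sorted.length ? sorted[Math.floor(sorted.length / 2)] : 0;
  }
}
```

With this, the 10–15 s figure is the median over real conversation turns, so it should not be explained by client-side audio buffering alone.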