Received 1007 invalid payload using Gemini Live API

Using the Gemini Live API, I'm sending text data as per the documentation:

await session.send_client_content(
    turns={"role": "user", "parts": [{"text": message}]},
    turn_complete=True,
)
However, occasionally (about 5 times out of 10) I get an invalid payload error when receiving data from Gemini:

async for response in session.receive():

I am catching the websockets ConnectionClosedError and logging it:

received 1007 (invalid frame payload data) Request contains an invalid argument.; then sent 1007 (invalid frame payload data) Request contains an invalid argument.
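For reference, here is a minimal self-contained sketch of this pattern (assuming the google-genai Python SDK, whose live transport raises ConnectionClosedError from the websockets package; the model name is a placeholder):

import asyncio
from google import genai
from google.genai import types
from websockets.exceptions import ConnectionClosedError

client = genai.Client()  # picks up GOOGLE_API_KEY from the environment

async def main():
    config = types.LiveConnectConfig(response_modalities=["TEXT"])
    try:
        async with client.aio.live.connect(
            model="gemini-live-2.5-flash-preview",  # placeholder model name
            config=config,
        ) as session:
            await session.send_client_content(
                turns={"role": "user", "parts": [{"text": "hello"}]},
                turn_complete=True,
            )
            async for response in session.receive():
                if response.text:
                    print(response.text)
    except ConnectionClosedError as e:
        print(f"Live session closed: {e}")  # produces the 1007 log above

asyncio.run(main())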

What am I doing wrong? The error also does not mention which argument is invalid.

Is anyone else facing a similar issue?

I am also facing the same issue. The conversation works for the first interaction, but it throws this error on the next turn.

This just started happening to me too; here's the relevant excerpt from my logs:

received event LiveServerMessage { setupComplete: {} }
received event LiveServerMessage { serverContent: { outputTranscription: {} } }
received event LiveServerMessage {
  serverContent: { outputTranscription: { finished: true } }
}
live onclose clean? true code 1007 reason Request contains an invalid argument.

Latest JS SDK, no recent changes to my connection params or what I send to the model.

Did you ever resolve this?

I still haven't found out why this is happening or how to resolve it.

For the model gemini-live-2.5-flash-preview-native-audio, I found this: if response_modalities contains TEXT, this error will occur.
If response_modalities contains only AUDIO, this error will not happen.

If you use the Python google-genai SDK, response_modalities lives in genai.types.LiveConnectConfig, which is passed as the config parameter of client.aio.live.connect.
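A minimal sketch of that config object (just the construction; it is then passed as config to client.aio.live.connect):

from google.genai import types

# AUDIO-only response_modalities avoids the 1007 close described above;
# pass this object as the config argument of client.aio.live.connect.
config = types.LiveConnectConfig(response_modalities=["AUDIO"])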

If you use a raw WebSocket, response_modalities is sent in the initial setup message:

{
  "setup": {
    "model": MODEL,
    "generation_config": {
      "response_modalities": ["AUDIO"]
    }
  }
}
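In Python that could look like the sketch below (the endpoint URL is the v1beta BidiGenerateContent WebSocket endpoint as documented at the time of writing; treat both it and the model name as assumptions to verify):

import asyncio
import json
import os

import websockets

# Live API WebSocket endpoint (v1beta BidiGenerateContent) -- verify against docs.
URL = (
    "wss://generativelanguage.googleapis.com/ws/"
    "google.ai.generativelanguage.v1beta.GenerativeService.BidiGenerateContent"
    f"?key={os.environ['GOOGLE_API_KEY']}"
)

async def main():
    async with websockets.connect(URL) as ws:
        setup = {
            "setup": {
                "model": "models/gemini-live-2.5-flash-preview-native-audio",
                "generation_config": {"response_modalities": ["AUDIO"]},
            }
        }
        await ws.send(json.dumps(setup))
        print(await ws.recv())  # the server should answer with setupComplete

asyncio.run(main())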

I found the correct way to get both text and audio output from gemini-2.5-flash-preview-native-audio-dialog.

✅ Working Solution

Don't use response_modalities=["AUDIO", "TEXT"] - this causes the 1007 error described above.

Instead, use output_audio_transcription:


config = types.LiveConnectConfig(
    response_modalities=["AUDIO"],  # Audio only here
    output_audio_transcription=types.AudioTranscriptionConfig()  # This enables text
)

Then handle both outputs in your receive loop:


async for response in session.receive():
    # Text transcription of the model's spoken output
    if response.server_content and response.server_content.output_transcription:
        text = response.server_content.output_transcription.text
        display_subtitles(text)  # Perfect for subtitles!

    # Audio data arrives as inline_data on the model turn's parts
    if response.server_content and response.server_content.model_turn:
        for part in response.server_content.model_turn.parts:
            if part.inline_data and part.inline_data.data:
                play_audio(part.inline_data.data)

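For anyone who wants the two fragments above as one runnable piece, here is a sketch (the model name comes from this thread; printing stands in for the display_subtitles/play_audio helpers):

import asyncio
from google import genai
from google.genai import types

client = genai.Client()  # reads GOOGLE_API_KEY from the environment

async def main():
    config = types.LiveConnectConfig(
        response_modalities=["AUDIO"],  # audio only
        output_audio_transcription=types.AudioTranscriptionConfig(),  # text channel
    )
    async with client.aio.live.connect(
        model="gemini-2.5-flash-preview-native-audio-dialog", config=config
    ) as session:
        await session.send_client_content(
            turns={"role": "user", "parts": [{"text": "Tell me a short joke."}]},
            turn_complete=True,
        )
        async for response in session.receive():
            sc = response.server_content
            if sc and sc.output_transcription and sc.output_transcription.text:
                print(sc.output_transcription.text, end="", flush=True)
            if response.data:  # concatenated inline audio bytes, if any
                pass  # hand the PCM bytes to your audio player here

asyncio.run(main())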
Hope this helps others facing the same issue!


Thanks a ton, this hack works. After two weeks of grinding on this issue, this finally fixes it.
I also asked the Google reps the same question, and they pointed to this ticket: Using gemini-2.5-flash-preview-native-audio-dialog · Issue #335 · google/adk-docs · GitHub, which also does not help with this issue.

I use ADK, so this is the fix for now. I used this:

# Set response modality
modality = "AUDIO"
run_config = RunConfig(
    response_modalities=[modality],
    output_audio_transcription=AudioTranscriptionConfig(),
)

# Create a LiveRequestQueue for this session
live_request_queue = LiveRequestQueue()

# Start agent session
live_events = runner.run_live(
    session=session,
    live_request_queue=live_request_queue,
    run_config=run_config,
)

return live_events, live_request_queue, runner, session

instead of this, and it resolved the issue:

# Set response modality
modality = "AUDIO" if is_audio else "TEXT"
run_config = RunConfig(response_modalities=[modality])

# Create a LiveRequestQueue for this session
live_request_queue = LiveRequestQueue()

# Start agent session
live_events = runner.run_live(
    session=session,
    live_request_queue=live_request_queue,
    run_config=run_config,
)

return live_events, live_request_queue, runner, session

As long as the else "TEXT" branch is not there, it does not throw the error.
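For reference, the imports that excerpt assumes (module paths as of current google-adk and google-genai; verify against your installed versions):

from google.adk.agents import LiveRequestQueue
from google.adk.agents.run_config import RunConfig
from google.genai.types import AudioTranscriptionConfig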

I’m using the JavaScript SDK (@google/genai) with the Gemini Live API (models/gemini-2.5-flash-exp).
After sending turnComplete, the session closes with:

Close event: {
  code: 1007,
  reason: 'Request contains an invalid argument.',
  wasClean: true
}

This matches similar reports from the Python SDK (Gemini Live API never sends a response).


Setup

  • SDK: @google/genai
  • Model: gemini-2.5-flash-exp
  • Environment: Node.js 18 / TypeScript
  • Issue: Socket closes immediately after turnComplete

🧾 Minimal Repro

import { GoogleGenAI, Modality } from "@google/genai";

const geminiClient = new GoogleGenAI({ apiKey: process.env.GOOGLE_API_KEY });

const session = await geminiClient.live.connect({
  model: "models/gemini-2.5-flash-exp",
  config: {
    responseModalities: [Modality.AUDIO],
    outputAudioTranscription: {}, // <-- Possibly invalid / empty type
    realtimeInputConfig: {
      automaticActivityDetection: { disabled: false, silenceDurationMs: 1200 },
    },
  },
});

session.sendRealtimeInput({
  audio: { mimeType: "audio/pcm;rate=16000", data: base64Audio },
});
session.sendRealtimeInput({ audioStreamEnd: true });
session.sendClientContent({ turnComplete: true }); // triggers 1007

Notes

  • The TypeScript definition for AudioTranscriptionConfig is empty in the SDK.
  • Removing outputAudioTranscription does not fix the issue.
  • Other users reported the same invalid argument error in the Python SDK when using Live API.

Question

Is this a schema issue in the JS SDK or related to the empty AudioTranscriptionConfig definition?
Any working example of a Live API session with turnComplete in JS would help.