Hey folks,
I’m trying to send about 40k tokens, far fewer than the 1M permitted for Gemini 2.0, and it seems to break with the following websocket exception:
E websockets.exceptions.ConnectionClosedError: received 1007 (invalid frame payload data) Request trace id: ffa37544583b21f9, [ORIGINAL ERROR] generic::invalid_argument: Input request contains (44599) tokens, whic; then sent 1007 (invalid frame payload data) Request trace id: ffa37544583b21f9, [ORIGINAL ERROR] generic::invalid_argument: Input request contains (44599) tokens, whic
The code itself works: if I take the same logic and supply less content, I get what I want.
The logic is fairly simple, and expects an audio output that we then stream and play as needed. Code below. Any thoughts as to why this is failing?
import numpy as np
import sounddevice as sd
from google import genai

config = genai.types.LiveConnectConfig(
    response_modalities=["AUDIO"],
    system_instruction=genai.types.Content(
        parts=[genai.types.Part(text=system_prompt)]
    ),
    generation_config=genai.types.GenerationConfig(
        temperature=self.settings.genai_model_temperature,
        max_output_tokens=8192,
    ),
    speech_config=genai.types.SpeechConfig(
        voice_config=genai.types.VoiceConfig(
            prebuilt_voice_config=genai.types.PrebuiltVoiceConfig(
                voice_name=VOICES[0]
            ),
        ),
    ),
)
async with self.client.aio.live.connect(
    model=self.settings.genai_model_name, config=config
) as session:
    await session.send(combined_prompt, end_of_turn=True)
    audio_data = []
    async for response in session.receive():
        if not response.server_content.turn_complete:
            for part in response.server_content.model_turn.parts:
                if part.inline_data and part.inline_data.data:
                    # Each part carries raw 16-bit PCM bytes.
                    audio_data.append(
                        np.frombuffer(part.inline_data.data, dtype="int16")
                    )
    # Play the collected audio after the receive loop finishes.
    with sd.OutputStream(samplerate=24000, channels=1, dtype="int16") as stream:
        stream.write(np.concatenate(audio_data))
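To narrow this down, a pre-flight check on the prompt size before opening the session might help. The following is only a minimal sketch: `check_token_budget`, `PromptTooLargeError`, and `ASSUMED_LIVE_LIMIT` are hypothetical names, and the real limit for the live endpoint is not visible in the truncated error above, so the value has to be found by experiment. The counter is passed in as a function so the helper can be tried without a network call.

```python
from typing import Callable


class PromptTooLargeError(ValueError):
    """Raised when a prompt exceeds the caller-supplied token budget."""


def check_token_budget(
    prompt: str,
    count_tokens: Callable[[str], int],
    limit: int,
) -> int:
    """Return the prompt's token count, or raise if it exceeds `limit`."""
    total = count_tokens(prompt)
    if total > limit:
        raise PromptTooLargeError(
            f"prompt has {total} tokens, over the assumed limit of {limit}"
        )
    return total


# Hypothetical wiring against the google-genai client (not run here;
# ASSUMED_LIVE_LIMIT is a placeholder, not a documented value):
#   counter = lambda text: client.models.count_tokens(
#       model=model_name, contents=text
#   ).total_tokens
#   check_token_budget(combined_prompt, counter, limit=ASSUMED_LIVE_LIMIT)
```

Lowering the limit until the call stops failing would at least show roughly where the live session starts rejecting input.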
The goal is to improve the audio for CustomPod, since the audio that Gemini 2.0 produces is incredible.