Hello! I’m currently using gemini-live-2.5-flash-preview to power my website Homeway. I’m using dotnet, so I wrote my own WebSocket impl because I couldn’t find one in the official SDK. I’m going text → text chat completions, with tools, search grounding, etc. It’s all working well, and I have been very impressed with the latency.
I got an email saying that gemini-live-2.5-flash-preview was being replaced by `gemini-2.5-flash-native-audio-preview-09-2025`. But when I swap the model strings, my WebSocket is closed after I send the config object, with the error:
Cannot extract voices from a non-audio request
My config object is set up with the output to only be TEXT, and I don’t create any of the voice-related generation config subobjects.
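For reference, here is a rough sketch of the kind of text-only setup message I mean. It is written in Python for brevity rather than dotnet, and it is illustrative, not my exact payload: `build_setup_message` is a hypothetical helper, and the field names (`setup`, `generationConfig`, `responseModalities`) and the `models/` prefix are assumptions based on my reading of the Live API docs.

```python
import json


def build_setup_message(model: str) -> str:
    """Build a text-only setup message as a JSON string (hypothetical helper)."""
    message = {
        "setup": {
            # Assumption: model names are sent with a "models/" prefix.
            "model": f"models/{model}",
            "generationConfig": {
                # Only TEXT output is requested; no speechConfig or other
                # voice-related sub-objects are included.
                "responseModalities": ["TEXT"],
            },
        }
    }
    return json.dumps(message)


# Swapping the model string is the only change between the two setups.
print(build_setup_message("gemini-2.5-flash-native-audio-preview-09-2025"))
```

With the old model string this shape of message is accepted; with the new one the socket closes right after it is sent.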
The website says the audio-preview model supports text as input and output, so it seems like it should work. Do you have any idea what I need to do to fix this? Do I need to set up AUDIO as a possible output but never use it?
Thanks!