That’s a great workaround! I assume that uses the audio billing though, which is more expensive? I also played with the transcript for incoming audio a little bit and it had quite a lot of accuracy issues. I’m not sure if the same thing would happen for output.
There are quite a few issues with this workaround:
- the model generates AUDIO tokens, which are considerably more expensive than TEXT tokens
- the `turn_complete` event arrives only after all AUDIO tokens are generated; for long sentences that can take 10 seconds or even longer. You could use the `generation_complete` event instead, but that only shortens the response time for the first user utterance, as the model is not ready to process new input until `turn_complete` is emitted
- the model sometimes simply fails to generate ANY output, neither audio chunks nor output transcription, and just returns `turn_complete`; this happens for me quite often when I restore context and use tools to return extracts from documents (agentic RAG pipeline)
Bottom line: this all looks like one dirty hack rather than the clean API that the gemini-live-2.5-flash model used to have.
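To make the event-gating issue above concrete, here is a minimal sketch of the receive loop logic. The `ServerContent` dataclass is a hypothetical stand-in for the server messages the Live API actually delivers (in the real google-genai SDK these arrive via `session.receive()`); only the control flow is the point here:

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical stand-in for the Live API's server_content payload;
# real messages come from session.receive() in the google-genai SDK.
@dataclass
class ServerContent:
    transcription_text: Optional[str] = None  # output_transcription text chunk
    generation_complete: bool = False
    turn_complete: bool = False

def collect_text(events: List[ServerContent]) -> Tuple[str, bool]:
    """Assemble the model's text from output-transcription chunks.

    Returns (text, got_any_output). The second flag captures the
    failure mode described above, where the model returns only
    turn_complete with no audio chunks and no transcription at all.
    """
    parts = []
    for ev in events:
        if ev.transcription_text:
            parts.append(ev.transcription_text)
        if ev.turn_complete:
            # You cannot send new input before this event, which is
            # why generation_complete alone doesn't help past the
            # first utterance.
            break
    return "".join(parts), bool(parts)
```

The flag makes the "empty turn" case detectable so you can retry or surface an error instead of silently returning nothing.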
That’s a great breakdown, thanks for sharing. So I guess we just keep using it and hope they don’t pull the model out from under our feet? Hahaha. I’m still holding out hope we will get some official response from Google about what to do here.
We are past the official deadline - and they pulled it from “models” page - so I wouldn’t hold my breath…
Yeah this is an unfortunate turn of events… One of the projects at my work relies on Gemini Live Text Out so we need a Google fix on this ASAP..
@Lalit_Kumar is there still no update on this?
Yup, it looks like it was just killed. @Lalit_Kumar @Liam_Carter can anyone help?
I’m going to grab a $35 support plan on Google Cloud and see if they can do anything to help.
You’re amazing for that thank you!! Please let us know what they say.
The website is also still showing the lite version of the model as available, but it’s not working for me either.
Isn’t lite only for the mainline api, not live?
Yes, it’s not ideal, and I would also prefer to get plain text output directly. Hopefully they’ll find a proper solution soon.
That said, it hasn’t worked too badly for me so far. I instructed the LLM to output JSON, and up to now the responses have been valid. The real-time speed was acceptable as well.
Since the other models were just turned off, there isn’t much of an alternative at the moment, so this workaround will have to do. I’m not exactly happy about it, but for my purposes it works well enough until they provide a proper fix.
+1 on this issue… 2 days after the deprecation, I’m surprised more people aren’t complaining
Yeah, I added a workaround of using the REST version instead of real-time, but it’s about 3x slower, especially when handling function calls. I can’t take the hit of using audio tokens for output, so I need to figure out another way. If they don’t fix it, I’m thinking I will have to jump over to OpenAI.
It really seems like quite the miss. Even the updated model page says that the native audio models support text in and text out.
For anyone wondering, when the model stopped working, this is the message you got back.
```
models/gemini-live-2.5-flash-preview is not found for API version v1beta, or is not supported for bidiGenerateContent
```
I’m mostly posting that so hopefully anyone who Google-searches that error will find this thread.
+1 here. gemini-live-2.5-flash-preview was a reliable model for getting text output with streaming audio input. The audio generated by the newer models is just not good enough: it’s inconsistent, and the pronunciation for non-English languages is sub-par.
Using text output with Chirp 3 text-to-speech is working well in our business use cases, but the instability of these APIs and of model availability is making us reconsider Gemini.
yeah so my company just jumped ship to OpenAI’s Realtime API, they support audio in, text out with 4o Realtime. Not ideal but it works. Wasn’t expecting this bad of a support response from Google but it is what it is I suppose
I advise anyone else who’s facing this to jump ship even if it’s paid, cause Google won’t resolve this until Gemini 3 Live drops, and even then I’m skeptical that it’ll work like 2.5 Live.
I have a support thread running with the Google Cloud team and they are going to escalate to engineering to ask.
I added a REST API implementation of the same model and my latency jumped from 1.5 seconds on average to 7s.
I also learned that the Live API is the only one that supports the use of tools, like Google Search grounding, and custom functions. That’s critical for my use case, since I’m building a Google Home style AI assistant for Home Assistant. So I need it to know current information from search, but also control the user’s home with my custom function calls.
So I’m hopeful my support case will get somewhere and I can keep using Gemini. I switched from OpenAI about 6 months ago bc flash 2.5 cost less and performed way better than the current realtime OpenAI, which I think is still 4o based.
Honestly I’m only using OpenAI now because i checked and there’s a new model called GPT-Realtime which is supposedly based on 5? It’s pretty solid tbh but the migration is a bit of a headache cause the systems are kinda different and function call arrangements are set up differently.
Overall worth it tho
Luckily, when I ported to Gemini, I made an abstraction for the function call logic, so for me, switching that back should be quite straightforward. I’m going to prototype switching today.
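An abstraction like the one described above can be sketched roughly as follows. The type names and the two adapter functions are hypothetical illustrations (not from either SDK); the idea is just that each provider adapter normalizes its own wire format into one shape before dispatch:

```python
import json
from dataclasses import dataclass
from typing import Any, Callable, Dict

@dataclass
class ToolCall:
    """Provider-neutral function call: a name plus parsed arguments."""
    name: str
    arguments: Dict[str, Any]

class ToolRegistry:
    """Maps tool names to handlers; dispatch never sees provider details."""

    def __init__(self) -> None:
        self._tools: Dict[str, Callable[..., Any]] = {}

    def register(self, name: str, fn: Callable[..., Any]) -> None:
        self._tools[name] = fn

    def dispatch(self, call: ToolCall) -> Any:
        return self._tools[call.name](**call.arguments)

def from_openai(raw: dict) -> ToolCall:
    # OpenAI-style calls deliver arguments as a JSON-encoded string.
    return ToolCall(raw["name"], json.loads(raw["arguments"]))

def from_gemini(raw: dict) -> ToolCall:
    # Gemini function calls deliver arguments as an already-parsed mapping.
    return ToolCall(raw["name"], dict(raw["args"]))
```

Switching providers then only means swapping which adapter feeds the registry; the tool handlers themselves stay untouched.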
The Google Cloud support person I have is very helpful, and they sent a summary of the issue to the engineering team, so I’m hoping we might get something concrete back soon.