The Flash models are the only ones that work for my use case, which is running software-development exams. Here it is important that the model understands syntax and that it corrects the student.
The Native Audio models will hallucinate code that the student did not write. They don't understand coding syntax that well and are too friendly/sycophantic.
You can see the difference in these two short videos:
These models are still available for use; you can refer to the Gemini API documentation for details about them.
If you go through the section of the Gemini documentation mentioned below, you should find more information:
I’m completely confused too… While the older ‘2.5-flash-live-preview’ model works fine, the new ‘gemini-2.5-flash-native-audio-preview-09-2025’ is a no-go for my use case because it simply cannot reliably understand what I’m saying over the phone. It’s a real disappointment, and now I have to seriously consider alternative platforms.
I have not tested function calling with the native models.
For me it's not the audio understanding that is the problem; it's the way the model behaves. The native models are too sycophantic. They just agree with everything the user says, at least from my testing.
Sycophantic, I believe, is what you meant to say. However, I am unclear on what you mean by this in relation to Gemini; could you expound on it further? Please and thank you in advance. P.S. Not saying this is or is not happening, just needing a better understanding.
Yes, I can, at least for my use case. I have Gemini act as an examiner for a software-development exam. The student has to write code that solves a small exercise while sharing his/her screen. In the exam it is important that the student's code is correct. The Flash models will not let a student continue if they see an error in the code.
The native models do not do this. They just accept pretty much anything the student says, even hallucinating code the student has not written. You can see that in the second video I linked to.
In the prompt I write things like the following (see the sketch after the list for how this ends up in the session config):
- Examine whether the student understands the code he/she is writing.
- Make sure that the syntax is correct.
- Remember to check the output of the code!
- And remember most importantly! You are a strict examiner running an exam. Your goal is to evaluate the student's competencies thoroughly. Don't take the student's word for anything! Make sure the syntax is correct and that the student understands the code!
- Spend time on making sure the syntax is correct!!
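
For context, this is roughly how such an instruction is wired into a Live API session. This is only a minimal sketch, assuming the google-genai Python SDK; the prompt text is paraphrased from the list above, and `run_exam_session` is just an illustrative name, not my actual code.

```python
from google import genai
from google.genai import types

client = genai.Client()  # picks up GEMINI_API_KEY from the environment

EXAMINER_PROMPT = """You are a strict examiner running a software-development exam.
- Examine whether the student understands the code he/she is writing.
- Make sure the syntax is correct before letting the student continue.
- Remember to check the output of the code.
- Do not take the student's word for anything; evaluate their competencies thoroughly."""

config = types.LiveConnectConfig(
    response_modalities=["AUDIO"],
    system_instruction=types.Content(parts=[types.Part(text=EXAMINER_PROMPT)]),
)

async def run_exam_session():
    # "gemini-2.5-flash-live-preview" is the half-cascade Live model discussed above.
    async with client.aio.live.connect(
        model="gemini-2.5-flash-live-preview", config=config
    ) as session:
        # Stream the student's shared screen and microphone here and play back
        # the examiner's audio responses; omitted for brevity.
        ...
```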
But the native models do not really follow these instructions. They are just friendly helpers that say "That looks good" even though there are multiple clear syntax errors in the code. That's what I mean when I say sycophantic: they agree too much with the student. This is a big problem in an exam situation.
That makes perfect sense! Thank you for elaborating and sharing your specific use case.
That's a fantastic real-world test. Your discovery that the native model's tendency to be a "friendly helper" makes it sycophantic is highly instructive. It highlights a critical challenge, and I'm optimistic that the developers will use this kind of feedback to make Gemini amazing down the road! I absolutely agree that this kind of precise, hands-on feedback is invaluable for the developers.
From my tests comparing the half-cascade model (Gemini 2.5 Flash Live) vs. the new native models for the AUDIO modality:
- The new model asks two questions in the same utterance much more frequently (which usually overwhelms the user and feels unnatural).
- The new model does not send SessionResumptionUpdate as frequently as the half-cascade: it sends one at random roughly every 30-40 seconds, while the half-cascade sends one each time the bot speaks. This leads to losing context on reconnects (a sketch of carrying the handle across reconnects follows below).
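
Here is a minimal sketch of carrying the resumption handle across reconnects, assuming the google-genai Python SDK; `connect_once` is an illustrative helper name, and the field names (`session_resumption_update`, `new_handle`, `resumable`) follow my reading of the Live API session-resumption docs.

```python
from google import genai
from google.genai import types

client = genai.Client()

async def connect_once(model: str, previous_handle: str | None = None) -> str | None:
    """Open one Live connection, resuming from previous_handle if we have one,
    and return the newest resumption handle seen before the connection ends."""
    config = types.LiveConnectConfig(
        response_modalities=["AUDIO"],
        # Passing the last handle we received asks the server to restore the
        # prior conversation context after a dropped connection.
        session_resumption=types.SessionResumptionConfig(handle=previous_handle),
    )
    latest_handle = previous_handle
    async with client.aio.live.connect(model=model, config=config) as session:
        # In a real client you would keep receiving across turns; this loop only
        # illustrates where the resumption updates show up.
        async for message in session.receive():
            update = message.session_resumption_update
            if update and update.resumable and update.new_handle:
                # The half-cascade model sends these after each bot turn; the
                # native-audio model reportedly only every 30-40 s, so a reconnect
                # can lose more of the recent context.
                latest_handle = update.new_handle
            # ... handle the audio/text parts of the message here ...
    return latest_handle  # feed into the next connect_once() call on reconnect
```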