We have been running Gemini 2.5 native audio dialog in production, and over the past week we noticed that responses randomly started being cut off or interrupted (the model does not generate the complete audio response).
The returned output transcription contains a full sentence of, say, 10 words, but the returned audio speaks only the first 1.5 - 2.5 words, cuts off, and then stays silent for a few moments.
On which platform:
on AI Studio and the Live API
When it happens:
responseTokenCount is a multiple of 50:
We noticed that if message.usageMetadata.responseTokenCount is a multiple of 50 (responseTokenCount % 50 === 0), the audio generated by the assistant for that specific turn will not be complete (a detection sketch is included at the end of this post)
at what time:
we noticed that it can happen at the beginning of the first response, in the middle of a conversation, or when the model gets interrupted often
We tried:
muting incoming audio:
in 1 out of 5 attempts the initial greeting was cut off after the assistant said 2 of the 10 words it needed to say
changing temperature:
we varied the temperature between 0.6 and 1 (no difference in occurrence)
changing voices
different combinations:
we tried different combinations of enabling and disabling affective dialog, proactivity, speech sensitivity, and VAD
Assumptions:
it seems to happen more often during peak hours
this started happening around June 17
Closing thoughts:
who should we contact regarding this, since it badly affects our product?
we could provide an in-depth analysis and a log of the messages (events) sent to us by the Live API while we were testing this
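Below is roughly how we flag suspect turns on our side. This is a minimal sketch using the @google/genai JS SDK; the model name, the outputAudioTranscription config flag, and the exact message shape are our assumptions based on what the Live API sends us, not an official recipe. The only thing the report depends on is the usageMetadata.responseTokenCount check.

```ts
import { GoogleGenAI, Modality } from "@google/genai";

// Model name here is illustrative; we see the same behaviour in AI Studio and via the Live API.
const MODEL = "gemini-2.5-flash-preview-native-audio-dialog";

async function monitorLiveSession(apiKey: string) {
  const ai = new GoogleGenAI({ apiKey });

  const session = await ai.live.connect({
    model: MODEL,
    config: {
      responseModalities: [Modality.AUDIO],
      outputAudioTranscription: {}, // assumed config flag; we also read the output transcription
    },
    callbacks: {
      onmessage: (message) => {
        // usageMetadata shows up on some server messages; whenever the turn's
        // responseTokenCount lands on an exact multiple of 50, the audio for
        // that turn is truncated after the first couple of words.
        const count = message.usageMetadata?.responseTokenCount;
        if (typeof count === "number" && count > 0 && count % 50 === 0) {
          console.warn(
            `Suspect turn: responseTokenCount=${count} (multiple of 50), audio likely cut off`
          );
        }
      },
      onerror: (e) => console.error("Live API error", e),
      onclose: () => console.log("Live API session closed"),
    },
  });

  return session;
}
```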
For me it feels quite random, because it can happen on the 2nd answer, the 8th, or whenever, but one thing you mentioned seems interesting: that it happens more often when interrupted. For me the frequency actually seemed to change based on the topic, but maybe that's just coincidence.
Hey, it’s interesting that you mentioned June 17th: that’s when Google released a bunch of Gemini 2.5 models to public GA, including this model (or a variant of it) on Vertex AI. We also noticed that the model is now worse than it used to be and is not at all consistent in its performance. I hope this gets addressed soon.
@chunduriv @Logan_Kilpatrick hasn’t anyone at Google noticed that native audio is causing errors? How come previews are not versioned with dates, just released? What is going on with Google’s standards?
Thanks for the thread - I see the same problems with the model stopping. It seems to happen on both input and output. When it asks me a question, I respond and get no follow-up. After a few “hello?”-type nudges, it finally follows up on the response I gave, so it wasn’t actually missed, just frozen.
On the output side too, it sometimes just stops midway. With some nudging like “continue”, it may finally continue right where it left off, or sometimes it restarts.
It’s generally pretty random where the issues happen, but almost every conversation I try is affected at some point, so basically it’s not working. I have no stopping issues with gemini-live-2.5-flash-preview, so I don’t think it’s an integration issue (I am using the JS SDK from the browser with an ephemeral token; a sketch of my setup follows below), and I’m continuing with that model for now. But the audio is so much better with native audio, so I’m looking forward to it becoming usable. Note: my teammate is really pushing me to switch to the OpenAI Realtime API, which I’m currently resisting but may not be able to for much longer.
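For reference, my browser-side wiring is roughly the sketch below. The ephemeral-token handoff (minting it on my backend and passing it in place of an API key with the v1alpha API version) and the serverContent part handling are assumptions about my own setup, not an official recipe.

```ts
import { GoogleGenAI, Modality } from "@google/genai";

// Browser side. `ephemeralToken` is assumed to be minted by my backend and handed to the page.
async function startLiveSession(ephemeralToken: string) {
  const ai = new GoogleGenAI({
    apiKey: ephemeralToken, // assumption: the ephemeral token stands in for the API key
    httpOptions: { apiVersion: "v1alpha" },
  });

  return ai.live.connect({
    // The half-cascade model that does NOT exhibit the stopping issue for me:
    model: "gemini-live-2.5-flash-preview",
    config: { responseModalities: [Modality.AUDIO] },
    callbacks: {
      onmessage: (msg) => {
        // Audio comes back as base64 PCM chunks inside serverContent parts;
        // here I just log the chunk sizes instead of feeding an AudioContext.
        const parts = msg.serverContent?.modelTurn?.parts ?? [];
        for (const part of parts) {
          if (part.inlineData?.data) {
            console.log("audio chunk, base64 length:", part.inlineData.data.length);
          }
        }
      },
      onerror: (e) => console.error("live session error", e),
      onclose: () => console.log("live session closed"),
    },
  });
}
```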
I am having issues with this as well. It’s nice to know that at least (or so I hope) it’s specific to just the Flash model and that somehow this is some cost-saving measure. For me, the issue is that sometimes Gemini switches into transcription mode, and for the rest of the conversation it only works from text transcripts; I can’t get it to analyze raw audio. The other half of the time it does analyze the raw audio and stays in raw audio mode, which is great because analyzing raw audio is exactly what I need. This happens in Google AI Studio too with the Pro model, so I’m assuming it’s not actually just the Flash model. It’s really frustrating because I just want one consistent behavior; it switches so much.
I’ve been following the ongoing discussions about Gemini 2.5 Native Audio cutting off when responseTokenCount lands on a multiple of 50 tokens.
Just wanted to ask:
Has there been any official fix, patch, or recommended workaround for this issue?
I’m currently experiencing the same behavior: audio plays for just a few words and then cuts to silence when the token count hits a multiple of 50 — even though the model continues streaming text.
@chunduriv @Logan_Kilpatrick can we get any updates? 45 days and Google still hasn’t fixed this? How much longer is this model going to be unusable? Any info?
Logan responded on the GitHub issue in the genai Python SDK repo that “the fix will come in a new model release”, so I would just give up on this model for now.
Hi @Lalit_Kumar, could you please investigate this issue with your team? It has been persisting for the past 2 months (since the last model release), with no updates or responses regarding a fix since then. Thanks
We sent him multiple messages on X and multiple emails, with no reply. I don’t understand Google breaking the model on June 17th and then not responding or notifying anyone about it. @Logan_Kilpatrick
With the release of gpt-realtime, I thought I’d give Gemini native audio another try before potentially switching over to OpenAI, and it feels even worse than before. The audio quality itself, with a direct browser WebSocket connection + ephemeral token, is choppy, which I don’t remember being an issue before, and it still has the stopping issues as well as never calling functions. From the other comments it sounds like we shouldn’t have much hope for this model. I’m also a bit concerned whether the WebSocket connection itself is sustainable, since even with a few web workers I’m worried it can’t achieve stable audio. Here’s hoping for a future model that works better and can also be accessed via WebRTC; I’ll give Gemini another try then.