Description:
We have been running Gemini 2.5 native dialog in production, and over the past week we noticed that responses randomly started being cut off or interrupted (the audio response is not generated completely).
The returned output transcription contains a sentence with 10 words, but the returned audio only says 1.5 to 2.5 of those words before it is cut off, and then it stays silent for a few moments.
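For anyone who wants to detect such turns automatically, here is a rough sketch of one possible heuristic: compare the word count of the output transcription with the duration of the audio actually received. Everything in it is an assumption for illustration and not something from our production code: the audio format (16-bit mono PCM at 24 kHz), the function names, and the speaking-rate threshold are all hypothetical and should be adjusted to your setup.

```typescript
// Assumed output format: 16-bit mono PCM at 24 kHz. Adjust if yours differs.
const SAMPLE_RATE_HZ = 24000;
const BYTES_PER_SAMPLE = 2;

// Duration in seconds of the raw PCM audio received for one turn.
function audioSeconds(totalPcmBytes: number): number {
  return totalPcmBytes / (SAMPLE_RATE_HZ * BYTES_PER_SAMPLE);
}

// Number of whitespace-separated words in the output transcription.
function countWords(transcript: string): number {
  return transcript.trim().split(/\s+/).filter((w) => w.length > 0).length;
}

// Hypothetical heuristic: flag the turn when the audio is too short to have
// spoken the whole transcript, even at a fast pace (maxWordsPerSecond).
function looksTruncated(
  transcript: string,
  totalPcmBytes: number,
  maxWordsPerSecond: number = 5
): boolean {
  const words = countWords(transcript);
  if (words === 0) return false; // nothing to compare against
  return audioSeconds(totalPcmBytes) * maxWordsPerSecond < words;
}
```

With these assumptions, a 10-word transcription that arrives with less than two seconds of audio would be flagged, which matches the kind of turn described above.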
On which platform:
- on AI Studio and the Live API
When it happens:
- responseTokenCount is a multiple of 50:
- We noticed that if message.usageMetadata.responseTokenCount is a multiple of 50 (responseTokenCount % 50 === 0), the audio generated by the assistant for that specific turn is not going to be complete.
- at what time:
- We noticed that it can happen at the beginning of the first response, in the middle of the conversation, or when the model gets interrupted often.
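The responseTokenCount condition can be checked while logging incoming messages. Below is a minimal sketch, assuming only the field path mentioned above (message.usageMetadata.responseTokenCount); the interface and the helper names are hypothetical and cover just the part of the message that matters here.

```typescript
// Minimal shape of the field mentioned in the post; real Live API
// messages carry more fields, which are omitted here.
interface UsageMessage {
  usageMetadata?: { responseTokenCount?: number };
}

// True when the turn matches the reported pattern: a positive
// responseTokenCount that is an exact multiple of 50.
function isSuspectTurn(message: UsageMessage): boolean {
  const count = message.usageMetadata?.responseTokenCount;
  return typeof count === "number" && count > 0 && count % 50 === 0;
}

// Hypothetical helper: share of logged turns (those that report a
// responseTokenCount) that match the pattern.
function suspectRatio(messages: UsageMessage[]): number {
  const withUsage = messages.filter(
    (m) => typeof m.usageMetadata?.responseTokenCount === "number"
  );
  if (withUsage.length === 0) return 0;
  return withUsage.filter(isSuspectTurn).length / withUsage.length;
}
```

Running suspectRatio over a session log and comparing it with the share of turns whose audio was audibly cut off is one way to check whether the correlation holds for other users too.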
We tried:
- muting incoming audio:
- in 1 out of 5 attempts the initial greeting was cut off after the assistant said 2 of the 10 words it needed to say
- changing temperature:
- we varied the temperature between 0.6 and 1 (no difference in occurrence)
- changing voices
- different combinations:
- different combinations of enabling and disabling affective dialog, proactivity, speech sensitivity, and VAD
Assumptions:
- during peak hours it happens more often
- this started happening around June 17
Closing thoughts:
- Who should we contact regarding this, since it affects our product badly?
- We can provide an in-depth analysis and a log of the messages (events) sent to us by the Live API while we were testing this.
3 Likes
Interesting, I have the same problem, so it’s not just me.
https://discuss.ai.google.dev/t/2-5-flash-audio-native-output-broken-in-de/89973
For me it feels quite random, because it can happen in the 2nd answer, in the 8th, or whenever. One thing you mentioned is interesting, though: that it happens more often when the model is interrupted. For me the frequency actually changed based on the topic, but maybe that is just a coincidence.
2 Likes
Hey, it’s interesting that you mentioned June 17th: Google released a bunch of Gemini 2.5 models to public GA then, including this model or a variant of it on Vertex AI. We also noticed that the model is now worse than it used to be and is not consistent at all in its performance. I hope this gets addressed soon.
2 Likes
@chunduriv @Logan_Kilpatrick hasn’t anyone at Google noticed that native audio is causing errors? How come previews are not versioned with dates, just released? What is going on with Google’s standards?
Hi @aljazdolenc,
Thank you for tagging me here and bringing this to our attention. We’ve observed this on our end and escalated it to the internal team.
2 Likes
Thanks for the thread - I see the same problems with the model stopping. It seems to happen both on input and output. When it asks me a question, I respond and get no follow up. After a few “hello?” type of nudges, finally it will follow up on the response I gave so it wasn’t actually missed, just frozen.
On the output side as well, sometimes it just stops midway. With some nudging like “continue”, it may finally continue right where it left off, or sometimes it restarts.
It’s generally pretty random where the issues happen, but they affect almost all conversations I try at some point, so basically it’s not working. I have no issues with stopping on gemini-live-2.5-flash-preview, so I don’t think it is an integration issue (I am using the JS SDK from the browser with an ephemeral token). I am continuing with that model for now, but the audio is so much better with native audio that I am looking forward to it becoming usable. Note: my teammate is really pushing me to switch to the OpenAI Realtime API, which I’m currently resisting but may not be able to for much longer.
1 Like