Hi,
I’m building a real-time voice stream application using the Gemini Live API and the Google Gen AI Go SDK.
With both the gemini-live-2.5-flash-preview and gemini-2.5-flash-preview-native-audio-dialog models, token generation suddenly stops, leaving sentences unfinished.
It happens at random, but it still makes the application unreliable.
I thought it was a problem in my code, but then I saw the same thing happening in AI Studio.
I tried to dig deeper and did some debugging, suspecting the problem was related to Voice Activity Detection (VAD). I ran the tests using headphones, of course, to rule out the possibility that the model was picking up its own audio output as user speech.
I monitored the `interrupted` parameter (`response.server_content.interrupted` in Python). I found that when the model suddenly stops generating tokens, `interrupted` is always `false`; it is only set to `true` when I intentionally interrupt the model. Also, the audio transcription matches the audio exactly: it stops precisely where the audio left off.
I don’t know if it has an impact, but the language used in my application is Italian.
Do you have any advice for me, or is this a known issue due to the Gemini Live API’s Preview status?
+1, facing the same issue. I often see Gemini Live abruptly stop mid-sentence.
We noticed that it was cut off whenever the token count was a multiple of 50. Reported a month ago and still no fix.
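If the multiple-of-50 observation above holds, a trivial log-side heuristic can flag suspicious turns. This is only a sketch; where the token count comes from (e.g. usage metadata in your client logs) is an assumption, not something the thread confirms.

```python
def is_suspicious_cutoff(output_token_count: int) -> bool:
    """Flag a turn whose output token count is a non-zero multiple of 50,
    matching the pattern reported above. Purely a logging heuristic."""
    return output_token_count > 0 and output_token_count % 50 == 0

# Example: tally which observed truncated turns match the pattern.
observed_counts = [150, 97, 200, 50, 133]
flagged = [n for n in observed_counts if is_suspicious_cutoff(n)]
print(flagged)  # [150, 200, 50]
```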
Hello,
Could you please share your code with me so that I can try to reproduce your issue?
Hello Lalit,
thanks for your feedback.
As other users stated in the post reported by @aljazdolenc, the problem also occurs in AI Studio. I too initially thought the problem was in my code, but then I saw it happening in AI Studio as well, just as others have reported. So I think you could run some tests directly in AI Studio.
I’d like to point out that the language of my application is Italian. Perhaps the preview state of the Live models is more stable in English than in other languages.
In any case, the application I’m developing is written in Go and is a bit complex, currently over 3,000 lines of code, so I’m not sure what to share. I could share one of the main web handlers and the JavaScript part for the client browser, but that would still make for a very long post. Perhaps it would be easier if I shared the Git repository.
Anyway, the web handler is heavily based on the example Google provides for the Go SDK, and the client’s JavaScript code is a rework of the example code that ships with it. I initially suspected the token-generation stops were caused by the deprecated browser audio-streaming code in Google’s JavaScript example, so I rewrote it to use Web Workers, but that didn’t solve the problem. That’s when I decided to use AI Studio to check whether the problem was actually in my code.
I look forward to your feedback.
Thanks again
Ciao
Hello,
We have escalated your issue to our internal team and will get back to you as soon as possible.
I’m having a similar issue when trying to have it speak French first and then English in the same response.
Sadly, it’s pretty much unusable for me right now. I hope there will be a fix soon.
Hi,
any update on this issue?
Hello,
We have released gemini-2.5-flash-native-audio-preview-09-2025, an improved version of the model. Could you please try this and check if it resolves your issue? Any additional feedback or insights from your side would be greatly appreciated.
Thank you for your patience.
I have checked, and we still get cut-offs. It now tries to recover, but the issue is still very noticeable. It’s the same problem; it “just” tries to force through. This still makes it unusable.
Hello,
These insights are very valuable for driving further improvements; thank you for taking the time to share them. If you have any additional feedback, please feel free to let us know. We truly appreciate it and will use it to continue improving.
Hi @Lalit_Kumar,
It’s going much better: I’ve had entire sessions without a single stop. But it still happens occasionally.
Also, it doesn’t always recover, and the recovery can overlap with Voice Activity Detection (VAD).
My use case is an assistant that provides customer service. It can happen that the assistant stops, the user asks whether the assistant is still online, and in the meantime the assistant’s voice comes back and completes what it was saying. By then the user’s “are you still online?” question has arrived, so the assistant replies that it is still online, and so on. The whole conversation becomes a mess. Maybe the cure is worse than the disease.
A sure source of stops is when the model has difficulty understanding what the user is saying. For example, in our application, the assistant might need to ask for the user’s first and last name and phone number. Once the user has provided them, the assistant must repeat them and ask for confirmation if they are correct.
In my tests I used my old landline number as the phone number, which was `....00....` (the international prefix is omitted). The double zero `00` in the middle causes the model serious problems. When the assistant asks for confirmation, at least 70% of the time it repeats the number with an extra zero, a third zero, like `.....00`, and then stops. The user says “I can’t hear you anymore”, the assistant starts repeating the number with the same error, the third zero, and then stops again. This loop continues until the session is forcibly terminated. What surprised me is that, unlike the other problems, which are random (AI being a statistical machine), this behavior is almost systematic.
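One client-side mitigation for this read-back loop would be to compare the digit sequences in the input and output transcriptions before the confirmation step. This is only a sketch of the idea: `digits_match` is a hypothetical helper, not part of any SDK, and the sample numbers are invented (not the redacted number above).

```python
import re

def digits_match(user_input: str, assistant_readback: str) -> bool:
    """Compare only the digit sequences, ignoring spacing and punctuation,
    so differently grouped read-backs compare equal but an inserted
    extra zero fails the check."""
    def _digits(s: str) -> str:
        return re.sub(r"\D", "", s)
    return _digits(user_input) == _digits(assistant_readback)

# The failure mode described above: a third zero inserted in the read-back.
print(digits_match("347 00 1234", "3 4 7 0 0 1 2 3 4"))    # True
print(digits_match("347 00 1234", "3 4 7 0 0 0 1 2 3 4"))  # False: extra zero
```

If the check fails, the application could discard the read-back and re-prompt, instead of letting the session fall into the repeat-and-stop loop.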
Please note that our application is in Italian.
If you need it, I have the transcriptions and audio recordings of the sessions. They’re obviously in Italian. There are no privacy or GDPR concerns. We’re still in the testing phase.
The positive aspects of this update are:
- The “token generation suddenly stops” issue is less frequent.
- The quality of voice tone and speech has improved dramatically. It’s much less robotic now, and it has much more imagination in finding words, which makes it very similar to a human assistant. It’s a huge step forward.
- Function calls now work as expected, which is also a very welcome improvement.
Thank you for your valuable work.
Ciao