Hi,
I’m building a real-time voice stream application using the Gemini Live API and the Google Gen AI Go SDK.
With both the gemini-live-2.5-flash-preview and gemini-2.5-flash-preview-native-audio-dialog models, token generation suddenly stops, leaving sentences unfinished.
The issue occurs randomly, but it still makes the application unreliable.
I thought it was a problem in my code, but then I saw the same thing happening in AI Studio.
I dug deeper and did some debugging, thinking the problem was related to Voice Activity Detection (VAD). Naturally, I ran the tests with headphones to rule out any possibility that the model was picking up its own audio output and treating it as user speech for VAD.
I monitored the `interrupted` parameter (`response.server_content.interrupted` in Python). I found that when the model suddenly stops generating tokens, `interrupted` is always `false`; it is only set to `true` when I intentionally interrupt the model. Also, the output transcription matches the audio exactly: it stops at the same point the audio does.
I don’t know if it has an impact, but the language used in my application is Italian.
Do you have any advice for me, or is this a known issue due to the Gemini Live API’s Preview status?
+1, facing the same issue. I often see Gemini Live abruptly stop mid-sentence.
We noticed that it gets cut off whenever the token count is a multiple of 50. That was a month ago and there's still no fix.
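For anyone who wants to check the same thing on their own stream: the test itself is trivial, and the only setup-dependent part is where the running output-token count comes from (I'm assuming you can read it from the usage metadata on the server messages), so treat this as a sketch:

```go
// Hypothetical check: when generation stops mid-sentence, see whether the
// cumulative output-token count is an exact multiple of 50. Where
// tokensSoFar comes from (e.g. usage metadata on each server message)
// depends on your SDK version.
package main

import "fmt"

func isFiftyMultipleCutoff(tokensSoFar int32) bool {
	return tokensSoFar > 0 && tokensSoFar%50 == 0
}

func main() {
	for _, n := range []int32{37, 50, 149, 150} {
		fmt.Printf("tokens=%d  suspicious=%v\n", n, isFiftyMultipleCutoff(n))
	}
}
```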
Hello,
Could you please share your code with me so that I can try to reproduce your issue?
Hello Lalit,
thanks for your feedback.
As other users stated in the post @aljazdolenc mentioned, the problem also occurs in AI Studio. I initially thought there was a problem with my code, but then I saw the same behavior in AI Studio myself, as others have reported. So I think you could run some tests directly in AI Studio.
I'd like to point out that the language of my application is Italian. Perhaps the Live models in preview are more stable in English than in other languages.
In any case, the application I'm developing is written in Go and is fairly complex, currently over 3,000 lines of code, so I'm not sure what to share. I could post one of the main web handlers and the client-side JavaScript for the browser, but that would still make for a very long post. Perhaps it will be easier if I share the Git repository.
Anyway, the web handler is heavily based on the example Google provides with the Go SDK, and the client's JavaScript is a rework of the browser example code that ships with it. I initially thought the token generation might be stopping because of the deprecated browser audio-streaming code in that JavaScript example, so I modified it to use Web Workers, but that didn't solve the problem. That's when I decided to use AI Studio to see whether the problem was actually in my code or not.
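To give you an idea of the structure without dumping the whole repository, here is a heavily stripped-down sketch of the handler's model-to-browser side. It is only a sketch: I'm using gorilla/websocket here purely for illustration, the genai method and field names are assumptions tied to the SDK version, and the browser-to-model audio forwarding is omitted entirely:

```go
// Heavily stripped-down sketch of the web handler's model-to-browser side.
// gorilla/websocket is used here only for illustration, and the genai names
// (Live.Connect, Receive, ServerContent, ModelTurn, InlineData) are
// assumptions based on my SDK version; the browser-to-model audio
// forwarding is omitted.
package main

import (
	"context"
	"log"
	"net/http"
	"os"

	"github.com/gorilla/websocket"
	"google.golang.org/genai"
)

var upgrader = websocket.Upgrader{
	// The real handler checks the origin; this sketch allows anything.
	CheckOrigin: func(r *http.Request) bool { return true },
}

func liveHandler(client *genai.Client) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		ws, err := upgrader.Upgrade(w, r, nil)
		if err != nil {
			return
		}
		defer ws.Close()

		session, err := client.Live.Connect(r.Context(), "gemini-live-2.5-flash-preview", &genai.LiveConnectConfig{})
		if err != nil {
			log.Printf("live connect: %v", err)
			return
		}
		defer session.Close()

		// Relay the model's audio chunks to the browser and log the flags,
		// so the mid-sentence cut-off can be correlated with Interrupted
		// and TurnComplete.
		for {
			msg, err := session.Receive()
			if err != nil {
				log.Printf("receive: %v", err)
				return
			}
			sc := msg.ServerContent
			if sc == nil {
				continue
			}
			log.Printf("interrupted=%v turnComplete=%v", sc.Interrupted, sc.TurnComplete)
			if sc.ModelTurn == nil {
				continue
			}
			for _, part := range sc.ModelTurn.Parts {
				if part.InlineData == nil {
					continue
				}
				if err := ws.WriteMessage(websocket.BinaryMessage, part.InlineData.Data); err != nil {
					return
				}
			}
		}
	}
}

func main() {
	ctx := context.Background()
	client, err := genai.NewClient(ctx, &genai.ClientConfig{
		APIKey:  os.Getenv("GEMINI_API_KEY"),
		Backend: genai.BackendGeminiAPI,
	})
	if err != nil {
		log.Fatal(err)
	}
	http.HandleFunc("/live", liveHandler(client))
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```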
I look forward to your feedback.
Thanks again
Ciao
Hello,
We have escalated your issue to our internal team and will get back to you at the earliest.
I'm having a similar issue when trying to have it speak French first and then English in the same response.
Sadly, it's pretty much unusable for me right now. I hope there will be a fix soon.