Hi,
I’m building a real-time voice stream application using the Gemini Live API and the Google Gen AI Go SDK.
With both the gemini-live-2.5-flash-preview and gemini-2.5-flash-preview-native-audio-dialog models, token generation suddenly stops, leaving sentences unfinished.
It happens at random, but it still makes the application unreliable.
I thought it was a problem in my code, but then I saw the same thing happening in AI Studio.
I tried to dig deeper and did some debugging, suspecting the problem was related to Voice Activity Detection (VAD). I ran the tests using headphones, of course, to rule out the possibility that the model was picking up its own audio output as user speech.
I monitored the `interrupted` parameter (`response.server_content.interrupted` in Python). I found that when the model suddenly stops generating tokens, `interrupted` is always `false`; it is only set to `true` when I intentionally interrupt the model. Also, the audio transcription matches the audio exactly: it stops precisely where the audio left off.
I don’t know if it has an impact, but the language used in my application is Italian.
Do you have any advice for me, or is this a known issue due to the Gemini Live API’s Preview status?
+1, facing the same issue. I often see Gemini Live abruptly stop mid-sentence.
We noticed that it was cut off whenever the token count was a multiple of 50. Reported a month ago and still no fix.
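If the multiple-of-50 observation above holds, a trivial log-side heuristic can flag suspicious turns. This is only a sketch; where the token count comes from (e.g. usage metadata in your client logs) is an assumption, not something the thread confirms.

```python
def is_suspicious_cutoff(output_token_count: int) -> bool:
    """Flag a turn whose output token count is a non-zero multiple of 50,
    matching the pattern reported above. Purely a logging heuristic."""
    return output_token_count > 0 and output_token_count % 50 == 0

# Example: tally which observed truncated turns match the pattern.
observed_counts = [150, 97, 200, 50, 133]
flagged = [n for n in observed_counts if is_suspicious_cutoff(n)]
print(flagged)  # [150, 200, 50]
```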
Hello,
Could you please share your code with me so that I can try to reproduce your issue?
Hello Lalit,
thanks for your feedback.
As other users stated in the post reported by @aljazdolenc, the problem also occurs in AI Studio. I too initially thought the problem was in my code, but then I saw it happening in AI Studio as well, just as others have reported. So I think you could run some tests directly in AI Studio.
I’d like to point out that the language of my application is Italian. Perhaps the preview state of the Live models is more stable in English than in other languages.
In any case, the application I’m developing is written in Go and is a bit complex, currently over 3,000 lines of code, so I’m not sure what to share. I could share one of the main web handlers and the JavaScript part for the client browser, but that would still make for a very long post. Perhaps it would be easier if I shared the Git repository.
Anyway, the web handler is heavily based on the example Google provides for the Go SDK, and the client’s JavaScript code is a rework of the example code that ships with it. I initially suspected the token-generation stops were caused by the deprecated browser audio-streaming code in Google’s JavaScript example, so I rewrote it to use Web Workers, but that didn’t solve the problem. That’s when I decided to use AI Studio to check whether the problem was actually in my code.
I look forward to your feedback.
Thanks again
Ciao
Hello,
We have escalated your issue to our internal team and will get back to you as soon as possible.
I’m having a similar issue when trying to have it speak French first and then English in the same response.
Sadly, it’s pretty much unusable for me right now. I hope there will be a fix soon.
Hi,
any update on this issue?
Hello,
We have released gemini-2.5-flash-native-audio-preview-09-2025, an improved version of the model. Could you please try this and check if it resolves your issue? Any additional feedback or insights from your side would be greatly appreciated.
Thank you for your patience.
I have checked, and we still get cut-offs. It now tries to recover, but the issue is still very noticeable. It’s the same problem; it “just” tries to force through. This still makes it unusable.
Hello,
These insights are very valuable for driving further improvements; thank you for taking the time to share them. If you have any additional feedback, please feel free to let us know. We truly appreciate it and will use it to continue improving.
Hi @Lalit_Kumar,
It’s going much better: I’ve had entire sessions without a single stop. But it still happens occasionally.
Also, it doesn’t always recover, and the recovery can overlap with Voice Activity Detection (VAD).
My use case is an assistant that provides customer service. It can happen that the assistant stops, the user asks whether the assistant is still online, and in the meantime the assistant’s voice comes back and completes what it was saying. By then the user’s “are you still online?” question has arrived, so the assistant replies that it is still online, and so on. The whole conversation becomes a mess. Maybe the cure is worse than the disease.
A sure source of stops is when the model has difficulty understanding what the user is saying. For example, in our application, the assistant might need to ask for the user’s first and last name and phone number. Once the user has provided them, the assistant must repeat them and ask for confirmation if they are correct.
In my tests I used my old landline number as the phone number, which was `....00....` (the international prefix is omitted). The double zero `00` in the middle causes the model serious problems. When the assistant asks for confirmation, at least 70% of the time it repeats the number with an extra zero, a third zero, like `.....00`, and then stops. The user says “I can’t hear you anymore”, the assistant starts repeating the number with the same error, the third zero, and then stops again. This loop continues until the session is forcibly terminated. What surprised me is that, unlike the other problems, which are random (AI being a statistical machine), this behavior is almost systematic.
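One client-side mitigation for this read-back loop would be to compare the digit sequences in the input and output transcriptions before the confirmation step. This is only a sketch of the idea: `digits_match` is a hypothetical helper, not part of any SDK, and the sample numbers are invented (not the redacted number above).

```python
import re

def digits_match(user_input: str, assistant_readback: str) -> bool:
    """Compare only the digit sequences, ignoring spacing and punctuation,
    so differently grouped read-backs compare equal but an inserted
    extra zero fails the check."""
    def _digits(s: str) -> str:
        return re.sub(r"\D", "", s)
    return _digits(user_input) == _digits(assistant_readback)

# The failure mode described above: a third zero inserted in the read-back.
print(digits_match("347 00 1234", "3 4 7 0 0 1 2 3 4"))    # True
print(digits_match("347 00 1234", "3 4 7 0 0 0 1 2 3 4"))  # False: extra zero
```

If the check fails, the application could discard the read-back and re-prompt, instead of letting the session fall into the repeat-and-stop loop.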
Please note that our application is in Italian.
If you need it, I have the transcriptions and audio recordings of the sessions. They’re obviously in Italian. There are no privacy or GDPR concerns. We’re still in the testing phase.
The positive aspects of this update are:
- The “token generation suddenly stops” issue is less frequent.
- The quality of voice tone and speech has improved dramatically. It’s much less robotic now, and it has much more imagination in finding words, which makes it very similar to a human assistant. It’s a huge step forward.
- Function calls now work as expected, which is also a very welcome improvement.
Thank you for your valuable work.
Ciao