Gemini 3.1 Flash Live exhibits a serious set of issues on TTS requests, and they get progressively worse the longer the audio runs. The degradation becomes noticeable around the 1 minute mark, often earlier, and worsens quickly from that point onwards: the volume drops, the audio and voice quality deteriorate, and the voice itself gradually changes. These effects combine to terrible effect. It does not seem to be an isolated issue either, as I have seen other reports of it online. We have seen it consistently on every TTS request since launch.
I’m really surprised Google has released the model in this state. I also noticed the temperature setting is bugged when set below 0.6: no TTS audio is ever returned.
Is Google aware of these issues? Any ETA on a fix being released?
I have attached an example of the issue below.
I can reproduce a closely related issue with models/gemini-3.1-flash-live-preview using native audio TTS through @google/genai.
In our case, the strongest trigger appears to be explicitly setting temperature: 0 in the Live API config. The failure is visible before playback: the raw PCM output either becomes oversized/runaway or the session closes with 1011 Resource exhausted.
Evidence from local A/B runs with the same Belarusian TTS prompt/text:
- Product-style path with explicit temperature: 0:
  · 1/3 oversized failures in one baseline run.
  · Failing sample: 8,911,204 raw PCM bytes, about 185.65s at 24 kHz.
  · Close code: 1011 Resource exhausted.
- Manual baseline with explicit temperature: 0:
  · 5 attempts: 3 OK, 1 no-audio, 1 oversized/error.
  · Oversized sample: 10,930,564 raw PCM bytes, about 227.7s at 24 kHz.
  · Close code: 1011 Resource exhausted.
- Literal AI Studio-style control without explicit temperature:
  · 5/5 clean.
- Temperature isolation using literal AI Studio-style session shape:
  · Without explicit temperature: 3/3 clean.
  · With temperature: 0: 1/3 oversized, 12,616,804 raw PCM bytes, about 262.85s at 24 kHz, close 1011 Resource exhausted.
- After removing explicit temperature from our production TTS path:
  · 20/20 product-path attempts clean.
  · 0 oversized/no-audio/1011 failures.
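For anyone checking the durations quoted above, here is a minimal sketch of the byte-to-seconds arithmetic. It assumes 16-bit mono PCM; only the 24 kHz sample rate is stated in the figures above, but the reported durations are consistent with 2 bytes per sample and one channel. The function name `pcmDurationSeconds` is made up for this example.

```typescript
// Convert a raw PCM byte count to a playback duration in seconds.
// Assumption: 16-bit (2-byte) mono samples; only the 24 kHz rate is given above.
function pcmDurationSeconds(
  byteLength: number,
  sampleRateHz: number = 24_000,
  bytesPerSample: number = 2,
  channels: number = 1,
): number {
  return byteLength / (sampleRateHz * bytesPerSample * channels);
}

// Byte counts of the three oversized samples reported above.
const reportedSizes: number[] = [8_911_204, 10_930_564, 12_616_804];
for (const bytes of reportedSizes) {
  console.log(`${bytes} bytes = ${pcmDurationSeconds(bytes).toFixed(2)} s`);
}
```

Under that assumption the three samples come out at 185.65 s, 227.72 s and 262.85 s, which matches the figures in the list.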
This looks like a Gemini Live native-audio TTS backend issue with low/zero explicit temperature rather than a playback issue, because the raw PCM is already pathological before speaker playback starts.
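Since the problem shows up in the raw PCM, a runaway response can in principle be caught on the client before anything is played. The following is a rough sketch of such a guard, not part of @google/genai: the class `PcmBudgetGuard` and its 60 second budget are hypothetical, and it again assumes 16-bit mono audio at 24 kHz. In a real session the chunks would be the decoded audio payloads received from the Live API.

```typescript
// Hypothetical helper (not an SDK API): tracks received PCM bytes against a budget.
class PcmBudgetGuard {
  private receivedBytes = 0;
  private readonly maxBytes: number;

  // Assumption: 16-bit mono PCM at 24 kHz unless told otherwise.
  constructor(maxSeconds: number, sampleRateHz = 24_000, bytesPerSample = 2, channels = 1) {
    this.maxBytes = Math.floor(maxSeconds * sampleRateHz * bytesPerSample * channels);
  }

  // Record one audio chunk; returns false once the total exceeds the budget.
  accept(chunk: Uint8Array): boolean {
    this.receivedBytes += chunk.byteLength;
    return this.receivedBytes <= this.maxBytes;
  }

  get totalBytes(): number {
    return this.receivedBytes;
  }
}

// Simulate a runaway stream: feed one-second chunks until the guard trips.
const guard = new PcmBudgetGuard(60); // allow at most 60 s of audio for this request
const oneSecond = new Uint8Array(48_000);
let withinBudget = true;
let chunks = 0;
while (withinBudget) {
  withinBudget = guard.accept(oneSecond);
  chunks++;
}
console.log(`budget exceeded after ${chunks} chunks (${guard.totalBytes} bytes)`);
```

The budget would need to be sized per request, for example from the length of the input text; a fixed 60 seconds is only a placeholder.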
Can Google confirm whether temperature: 0 is supported for Gemini 3.1 Flash Live native-audio output? If it is unsupported or unstable, it would help if the API/SDK rejected it, clamped it, or documented the safe range.
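Until that is clarified, the workaround that held up in our runs was simply not sending a temperature at all. A minimal sketch of doing that defensively on the client follows. The `TtsLiveConfig` interface and `withoutTemperature` function are hypothetical and model only a placeholder `responseModalities` field plus the optional temperature; this is not the SDK's actual config type.

```typescript
// Hypothetical config shape for illustration only; not the real SDK type.
interface TtsLiveConfig {
  responseModalities: string[];
  temperature?: number;
}

// Return a copy of the config with the temperature key removed,
// so the backend default is used instead of an explicit value.
function withoutTemperature(config: TtsLiveConfig): TtsLiveConfig {
  const copy: TtsLiveConfig = { ...config };
  delete copy.temperature;
  return copy;
}

const requested: TtsLiveConfig = { responseModalities: ["AUDIO"], temperature: 0 };
const safeConfig = withoutTemperature(requested);
console.log(JSON.stringify(safeConfig));
```

Stripping the key, rather than clamping to some minimum, avoids guessing at a safe range that has not been documented.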