Gemini-3.1-flash-tts-preview: streamGenerateContent truncates audio + finishReason: OTHER past ~60s, while generateContent (non-streaming) works

Hendrik2 · June 2, 2026, 11:11pm

Summary: On gemini-3.1-flash-tts-preview, the SSE streaming endpoint (:streamGenerateContent?alt=sse) intermittently returns partial audio + finishReason: OTHER (HTTP 200) once the generation exceeds ~60s of audio. The exact same prompt through non-streaming :generateContent returns the full audio with finishReason: STOP every time. This bills AUDIO tokens for unusable output, with no error surfaced to the client.

Repro (raw REST, single-speaker, fr-FR voice “Leda”). Same request body, only the endpoint differs:

Text length	Approx. audio	:streamGenerateContent (3 trials)	:generateContent
50 words	~20s	STOP / STOP / STOP	STOP
100 words	~40s	STOP / STOP / STOP	STOP
150 words	~57s	STOP / STOP / STOP	STOP
200 words	~70s	OTHER / STOP / STOP	STOP
250 words	~89s	OTHER / STOP / STOP	STOP
300 words	~106s	OTHER / OTHER / OTHER	STOP
350+ words	~125s	OTHER / OTHER / OTHER	STOP (full ~136s)

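Streaming starts truncating once the audio passes ~60-70s (1 of 3 trials at ~70s and at ~89s) and fails on every trial from ~106s upward; non-streaming has no such cliff. Each failure arrives as one or a few PCM chunks followed by finishReason: OTHER, still with HTTP 200.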
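Because the failure comes back as HTTP 200, the client has to inspect the stream itself to notice it. Below is a minimal sketch of such a check, not the exact harness used for the table above. It assumes each SSE `data:` line carries a JSON object with `candidates[].finishReason` and base64 audio in `candidates[].content.parts[].inlineData.data`, and that the audio is 24 kHz, 16-bit, mono PCM; the function names are hypothetical.

```python
import base64
import json
from typing import Iterable, Optional, Tuple

# Assumption (not stated in this post): output is 24 kHz, 16-bit, mono PCM.
SAMPLE_RATE_HZ = 24_000
BYTES_PER_SAMPLE = 2


def summarize_tts_stream(sse_lines: Iterable[str]) -> Tuple[float, Optional[str]]:
    """Return (audio_seconds, last_finish_reason) for the lines of an SSE body."""
    pcm_bytes = 0
    finish_reason = None
    for line in sse_lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and non-data SSE fields
        event = json.loads(line[len("data:"):])
        for candidate in event.get("candidates", []):
            # Keep the most recent finishReason seen in the stream.
            finish_reason = candidate.get("finishReason", finish_reason)
            for part in candidate.get("content", {}).get("parts", []):
                data = part.get("inlineData", {}).get("data")
                if data:
                    pcm_bytes += len(base64.b64decode(data))
    seconds = pcm_bytes / (SAMPLE_RATE_HZ * BYTES_PER_SAMPLE)
    return seconds, finish_reason


def is_truncated(finish_reason: Optional[str]) -> bool:
    """Treat anything other than STOP (including a missing reason) as a failure."""
    return finish_reason != "STOP"
```

Comparing the returned duration against the duration expected for the text length gives a second signal alongside the finishReason check.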

The confusing part: the Gemini API TTS docs state “TTS does not support streaming” under Limitations, yet :streamGenerateContent accepts the request, returns 200, and bills AUDIO tokens, just with truncated output. What is the supported production path for long-form streaming TTS?

Environment: model gemini-3.1-flash-tts-preview; reproduced on both Vertex AI (generateContentStream) and the Gemini Developer API (streamGenerateContent?alt=sse); single-speaker; reproduces both with temperature omitted and with temperature 0.6.

Impact: production museum audio-guide product with long-form narration. We cannot ship the streaming path. Non-streaming works, but a single ~136s generation takes ~77s of wall time, which is too slow for interactive playback. So today neither path is viable for narration longer than ~1 min.
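One possible stopgap, untested in production and not a confirmed recommendation, would be to split the narration at sentence boundaries into segments that stay within the range that returned STOP in every trial above, then request the segments one by one and play each as it arrives. The sketch below covers only the splitting step; the 150-word limit is taken from the table (fr-FR, voice "Leda") and may not hold for other voices or languages, and `split_narration` is a hypothetical helper name. Prosody across segment boundaries may also differ from a single long generation.

```python
import re
from typing import List

# Largest text size that returned STOP on all trials in the table above.
MAX_WORDS = 150


def split_narration(text: str, max_words: int = MAX_WORDS) -> List[str]:
    """Group whole sentences into segments of at most max_words words.

    A single sentence longer than max_words is kept intact as its own segment.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    segments: List[str] = []
    current: List[str] = []
    count = 0
    for sentence in sentences:
        if not sentence:
            continue  # empty input yields no segments
        n = len(sentence.split())
        if current and count + n > max_words:
            # Adding this sentence would exceed the limit: close the segment.
            segments.append(" ".join(current))
            current, count = [], 0
        current.append(sentence)
        count += n
    if current:
        segments.append(" ".join(current))
    return segments
```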

Related reports:

Could the Gemini API / TTS team confirm whether streaming TTS is supported, and route the truncation issue to the right owners? Happy to share full request/response payloads and responseIds.

philschmid · June 23, 2026, 11:24am

Thanks for catching this. We were able to reproduce it and will work on a fix.
