Issues with gemini-tts-2.5-pro in AI Studio (blank audio, voice drift, pacing changes)

Abid_Raza · March 16, 2026, 2:28pm

Hello,

I am currently evaluating Gemini TTS for generating long-form narration and I am planning to use Google Cloud for production if the results are stable.

Before subscribing to the Cloud API, I have been testing gemini-tts-2.5-pro in AI Studio for the past few days. However, I am consistently encountering several issues that make the generated audio unreliable.

The main problems I am experiencing are:

1. Long blank/noise segments
   The model often speaks normally for the first few sentences but then produces around 10 minutes of blank noise. This happens with multiple voices, but most frequently with the Fenrir voice (the one I chose for my use case).

2. Voice quality degradation over time
   For audio around 4–5 minutes long, the voice sounds very natural at the beginning. As the audio progresses, however, it gradually becomes metallic or robotic and sometimes develops an echo-like effect. It also sometimes includes background noise.

3. Automatic pacing changes
   I try to generate narration at a slow-to-medium speaking pace. The audio usually starts at the correct speed, but as the narration continues, the speaking speed gradually increases without any instruction to do so.

Because of these issues, I have not been able to generate a complete narration without problems.
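
For anyone who wants to screen generated files for the first issue automatically, here is a minimal sketch. It assumes the audio has been saved as a mono 16-bit PCM WAV file, and the function names, window size, and RMS threshold are illustrative assumptions rather than anything provided by AI Studio or the API. It only flags long low-energy stretches; hiss or static would need a different check.

```python
import array
import math
import sys
import wave


def quiet_spans(samples, rate, window_s=0.5, rms_threshold=200.0, min_span_s=5.0):
    """Return (start_s, end_s) spans where mono 16-bit samples stay below rms_threshold.

    All thresholds are assumed values; tune them against known-good audio.
    """
    window = max(1, int(window_s * rate))
    spans, start = [], None
    for i in range(0, len(samples), window):
        block = samples[i:i + window]
        rms = math.sqrt(sum(s * s for s in block) / len(block))
        if rms < rms_threshold:
            if start is None:
                start = i  # a quiet stretch begins here
        elif start is not None:
            spans.append((start, i))  # the quiet stretch ended
            start = None
    if start is not None:
        spans.append((start, len(samples)))
    # Keep only stretches that last at least min_span_s seconds.
    return [(a / rate, b / rate) for a, b in spans if (b - a) / rate >= min_span_s]


def quiet_spans_in_wav(path, **kwargs):
    """Load a mono 16-bit PCM WAV file and report its long quiet spans."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2 or wav.getnchannels() != 1:
            raise ValueError("this sketch expects mono 16-bit PCM")
        rate = wav.getframerate()
        samples = array.array("h", wav.readframes(wav.getnframes()))
    if sys.byteorder == "big":
        samples.byteswap()  # WAV data is little-endian
    return quiet_spans(samples, rate, **kwargs)
```

Calling `quiet_spans_in_wav("narration.wav")` (a hypothetical file name) returns a list of start and end times in seconds, so a non-empty result marks a file worth regenerating.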

Transcript length: I tried input texts ranging from 50 to 1,000 words. So far I have not been able to generate audio for a script of more than 500 words, and for a 500-word script the audio shows the second and third issues (voice degradation and pacing changes).
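
One workaround that could be tried for the length limit is to split the transcript into smaller pieces at sentence boundaries and generate each piece separately. The sketch below is only an illustration of that idea: `split_transcript` and the 300-word budget are assumptions, not a documented limit or a recommended setting, and it makes no TTS calls itself.

```python
import re


def split_transcript(text, max_words=300):
    """Split text into chunks of roughly max_words words, breaking at sentence ends.

    max_words is an assumed budget. A single sentence longer than the budget
    is kept whole, so that chunk can exceed max_words.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current, count = [], [], 0
    for sentence in sentences:
        n = len(sentence.split())
        if n == 0:
            continue  # skip empty fragments (e.g. empty input)
        if current and count + n > max_words:
            chunks.append(" ".join(current))  # close the current chunk
            current, count = [], 0
        current.append(sentence)
        count += n
    if current:
        chunks.append(" ".join(current))
    return chunks
```

Each chunk could then be sent as its own request and the resulting audio files joined afterwards. Whether voice and pacing stay consistent across chunks is something that would need to be tested.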

Temperature: I tried the default temperature and values as low as 0.5, but all of the issues persist.

Instruction prompt: I tried everything from a simple single-line instruction to a complex Director’s notes prompt, and also no instruction prompt at all (just the transcript). The issues persist in every case.

These issues were less consistent before 15th March (although I had only tried short scripts until then, and they were fine). After 15th March they suddenly became worse: if I generate 10 audio files, only one is acceptable, and even that one involves compromises. I try to keep the script short so that the audio doesn’t exceed 5 minutes.

My main questions are:

Are these issues specific to AI Studio, or should I expect similar behavior when using the Gemini TTS model through the Google Cloud API (Vertex AI / Gemini API)?

When can we expect a stable solution?

If anyone has experience generating longer narration audio with gemini-tts-2.5-pro, I would appreciate any guidance or best practices that help avoid these problems.

Thank you.

