Persistent Noise in TTS Audio Generation

I’m experiencing a persistent issue with the TTS (Text-to-Speech) functionality of the Gemini API. My TTS generation, which previously worked perfectly, now consistently produces a low-level hissing or static noise that is present in all generated audio.

This noise is particularly noticeable when the voice is speaking and is almost completely absent during silent pauses. This suggests the issue is an artifact of the voice generation itself, rather than simple background noise.

I’ve been working with this API for a while, and the problem appeared suddenly around September 12-13, 2025.

I had the same problem on September 18th.
Were you able to fix it?

Well, I don't know how to fix it because I can't write code, but I'm using Auphonic to clean up the audio. It takes more time, though.

I have had this hissing for quite a while. I posted about it on X and tried tagging. I found that the longer the generation lasts, the more hissing gets in. It seems like a model issue, as it persists across all voices.
I have found no way to solve it…

Same thing here. This issue is really annoying. I used to create high-quality content, but now the problem persists and the quality is really poor. I hope they fix it soon.

Hi guys @Aizen_Sosuke @Piotr_Jarecki @Rafael_Silva @Jhonny_Chambers
I would like to reproduce this on my end. Could you please confirm whether this is happening with both gemini-2.5-flash-preview-tts and gemini-2.5-pro-preview-tts? Additionally, does this occur with all audio lengths, and if so, please provide the token count at which the hissing sound starts. Any additional details you can provide about the conditions under which this occurs would be very helpful.
Thank you

@Pannaga_J yes, both gemini-2.5-flash-preview-tts and gemini-2.5-pro-preview-tts seem to have the same issue. I create short audio clips, and the hissing has been there consistently since Sept 15. Thanks!

Thanks for flagging this. We have informed the relevant team about it.

Hello guys, any recent updates on this issue?

Has anybody (user / developer) been able to find a solution?

Thanks!

Same problem for me. It is impossible to use because of the noise. Has anybody solved this?

Nothing so far. Unfortunately.

The issue is still there. The longer the audio is, the more robotic the voice gets, with static noise. The voice loses its consistency. Try generating something that is above 1k words.

Still the same, maybe that’s why the model is still a preview? Not final yet?

It's 2026 and the problem is still there. The hissing, slightly metallic sound can be heard from about 60% of the way into the audio. Sometimes if you refresh the page and re-create the audio, the subsequent audio does not have the hissing sound. Sometimes it persists and you have to close and reopen the browser. Even then it works only sometimes.
If you paste a long block of text, say 2300 words (14 to 15 minutes), it gives you only 10:55 min of audio, so you have to paste the remaining block of text again to create the remaining 4 to 5 minutes of audio. The hissing can be heard in both clips, the long one and the short one.

I’m surprised Google still hasn’t solved this issue.

Yes, there is a similar conversation in the thread "Metallic sounds using gemini-2.5-flash-preview-tts". I've given up on the model at this point…

Duration for the output audio: approximately 655 seconds. If the input text results in audio exceeding 655 seconds, the audio is truncated.
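
For anyone hitting the 655-second truncation, here is a minimal sketch of splitting a long script into chunks before generating audio, so that each chunk should stay under the limit. It does nothing about the hissing itself. The speaking rate of 160 words per minute is an assumption derived from the figure reported above (2300 words for roughly 14 to 15 minutes), and the 90% safety margin and the function names are hypothetical choices, not anything from the API.

```python
import re

MAX_SECONDS = 655        # truncation limit quoted above
WORDS_PER_MINUTE = 160   # assumption: estimated from 2300 words ~ 14-15 minutes
SAFETY_PERCENT = 90      # assumption: headroom so a chunk stays under the limit


def max_words_per_chunk(max_seconds=MAX_SECONDS, wpm=WORDS_PER_MINUTE,
                        safety_percent=SAFETY_PERCENT):
    """Estimate how many words fit in one request (integer arithmetic)."""
    return max_seconds * wpm * safety_percent // (60 * 100)


def split_text(text, max_words):
    """Split text at sentence boundaries into chunks of at most max_words.

    A single sentence longer than max_words is kept whole as its own chunk.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current, count = [], [], 0
    for sentence in sentences:
        if not sentence:
            continue
        n = len(sentence.split())
        # Start a new chunk when adding this sentence would exceed the budget.
        if current and count + n > max_words:
            chunks.append(" ".join(current))
            current, count = [], 0
        current.append(sentence)
        count += n
    if current:
        chunks.append(" ".join(current))
    return chunks


if __name__ == "__main__":
    script = " ".join(["This is a short sample sentence with ten words total."] * 230)
    for i, chunk in enumerate(split_text(script, max_words_per_chunk()), start=1):
        print(f"chunk {i}: {len(chunk.split())} words")
```

Each chunk would then be sent as a separate TTS request and the resulting clips joined afterwards. If your voice or style prompt speaks more slowly, lower WORDS_PER_MINUTE accordingly.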