Audio Output with Gemini 2.5 Pro Preview TTS is TOTALLY random

As the title says, I get zero consistency in the audio output. I’m using it through the API, but that doesn’t matter: the exact same thing happens in AI Studio.

It doesn’t matter whether the audio is long or short, the inconsistency is everywhere. The outputs sound like totally different people.

Several times I got a female voice, even though a male voice was set.

This is a beyond-crazy level of inconsistency. It is impossible to break the text into pieces and generate it in parts, because you will hear a different voice, character, and speed each time. Literally everything is inconsistent.

For a lot of projects, this tool is currently unusable. I hope things will change in the future.

Hi @aroshidze
I am trying to replicate the issue. Could you please provide the following details: was the setup single-speaker or multi-speaker, and which specific voice and language were selected? Are there any other steps required to reproduce it?
It would also be helpful if you could share two of the inconsistent audio files you generated.

Hi @Pannaga_J

I was using single-speaker mode with English. I tried different voices, but the last one was Fenrir.

I can’t share the audio files because they were only generated temporarily inside an app I was developing.

Believe me, you don’t even need to hear them, the difference was extreme. As I said in my post, I once got a female voice instead of the male one.

I tried all possible audio lengths, starting from a few seconds. ALL of them were inconsistent. The only solution I found was to generate one gigantic audio and split it manually, because generating in parts just produces different voices, even though the settings were exactly the same.
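
For anyone who wants to try the same workaround, here is a rough sketch of the "generate once, split afterwards" step in Python, using only the standard library. It is not what my app does exactly. The function names (split_pcm, pcm_to_wav_bytes) are made up for illustration, and it assumes the audio comes back as raw 16-bit mono PCM at 24 kHz; adjust the parameters if your output is different. You still have to pick the cut points yourself.

```python
import io
import wave


def split_pcm(pcm, cut_seconds, sample_rate=24000, sample_width=2, channels=1):
    """Split raw PCM bytes at the given time offsets (in seconds).

    Assumption: 16-bit mono PCM at 24 kHz unless other values are passed in.
    Returns a list of byte strings, one per segment.
    """
    frame_size = sample_width * channels
    total_frames = len(pcm) // frame_size
    # Convert times to frame indices and clamp them to the audio length,
    # so every cut lands on a whole-frame boundary.
    cuts = sorted(
        min(max(int(t * sample_rate), 0), total_frames) for t in cut_seconds
    )
    bounds = [0] + cuts + [total_frames]
    return [
        pcm[start * frame_size:end * frame_size]
        for start, end in zip(bounds, bounds[1:])
    ]


def pcm_to_wav_bytes(pcm, sample_rate=24000, sample_width=2, channels=1):
    """Wrap raw PCM bytes in a WAV container and return the file contents."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wf:
        wf.setnchannels(channels)
        wf.setsampwidth(sample_width)
        wf.setframerate(sample_rate)
        wf.writeframes(pcm)
    return buf.getvalue()


if __name__ == "__main__":
    # Stand-in for one long generated clip: 3 seconds of silence.
    long_clip = bytes(24000 * 2 * 3)
    for i, part in enumerate(split_pcm(long_clip, [1.0, 2.5])):
        with open(f"part_{i}.wav", "wb") as f:
            f.write(pcm_to_wav_bytes(part))
```

Since every segment is cut from the same generation, the voice stays the same across all parts, which is the whole point of doing it this way.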