I tried gemini-2.5-flash-tts-preview and it sounds great in the languages I need. But the latency is too high for voice agent use cases.
I found that Chirp 3 has the same voices, but the way they pronounce words and the expression they put into them are completely different.
Is there a way to prompt the voice in Chirp 3 to sound similar to Flash TTS? Only the pronunciation and the emphasis on words in a sentence seem to be a problem.
Hey @Shashwat_Aditya,
Welcome to the community! You’re right, Chirp 3 and Gemini TTS can sound quite different even with similar voices. While there’s no direct way to match Gemini’s style in Chirp 3, you might try adjusting the input text with punctuation, emphasis markers, or SSML tags to guide pronunciation and tone. It’s not perfect, but it can help get closer to the sound you’re aiming for.
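As a rough illustration of the SSML route, here is a minimal Python sketch that wraps sentences in `<s>` tags and swaps hard-to-pronounce words for a spoken alias with `<sub>`. The helper names `mark_up`, `build_ssml`, and `synthesize` are made up for this example, the voice name `en-US-Chirp3-HD-Charon` is just a placeholder, and which SSML tags a given Chirp 3 voice actually honors is an assumption you should check against the current docs. The `synthesize` function assumes the `google-cloud-texttospeech` package and working credentials.

```python
import re
from xml.sax.saxutils import escape, quoteattr


def mark_up(sentence, aliases):
    """Escape one sentence and wrap each aliased word in <sub alias="...">."""
    if not aliases:
        return escape(sentence)
    # Longest words first so a longer entry wins over a shorter prefix.
    words = sorted(aliases, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, words)) + r")\b")
    out = []
    # With one capture group, split() alternates plain text and matched words.
    for i, piece in enumerate(pattern.split(sentence)):
        if i % 2:  # odd indices are the matched alias words
            spoken = quoteattr(aliases[piece])
            out.append("<sub alias=%s>%s</sub>" % (spoken, escape(piece)))
        else:
            out.append(escape(piece))
    return "".join(out)


def build_ssml(sentences, aliases=None):
    """Build a <speak> document with one <s> element per sentence."""
    body = "".join("<s>" + mark_up(s, aliases or {}) + "</s>" for s in sentences)
    return "<speak>" + body + "</speak>"


def synthesize(ssml, voice_name="en-US-Chirp3-HD-Charon", language_code="en-US"):
    """Send SSML to Cloud Text-to-Speech (needs the client library and credentials)."""
    # Imported lazily so build_ssml() works without the library installed.
    from google.cloud import texttospeech

    client = texttospeech.TextToSpeechClient()
    response = client.synthesize_speech(
        input=texttospeech.SynthesisInput(ssml=ssml),
        voice=texttospeech.VoiceSelectionParams(
            language_code=language_code, name=voice_name
        ),
        audio_config=texttospeech.AudioConfig(
            audio_encoding=texttospeech.AudioEncoding.LINEAR16
        ),
    )
    return response.audio_content  # raw audio bytes


if __name__ == "__main__":
    print(build_ssml(["Call the API now.", "Tom & Jerry"], {"API": "A P I"}))
```

If a voice ignores the SSML, the same idea works with plain text: replace the word with its spoken form directly and rely on commas and full stops to shape the pauses.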
Thanks for reaching out!