I tried both Gemini 2.5 Flash TTS and Gemini 2.5 Pro TTS. The voice is not consistent across long sentences: even though I specified a male voice through PrebuiltVoiceConfig, the audio was sometimes produced in a female voice.
I am calling the Gemini API at the https://generativelanguage.googleapis.com endpoint with the gemini-2.5-pro-preview-tts and gemini-2.5-flash-preview-tts models.
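For reference, here is a minimal sketch of the kind of request described above, assuming the v1beta REST `generateContent` method. The helper names, the sample voice name `Charon`, and the `GEMINI_API_KEY` environment variable are placeholders for illustration, not necessarily the exact values used when the problem occurred.

```python
import json
import os
import urllib.request

# Base URL of the Gemini API endpoint mentioned above (v1beta is an assumption).
BASE_URL = "https://generativelanguage.googleapis.com/v1beta"


def build_tts_request(text, voice_name="Charon", model="gemini-2.5-flash-preview-tts"):
    """Return (url, payload) for a single-speaker TTS generateContent call."""
    url = f"{BASE_URL}/models/{model}:generateContent"
    payload = {
        "contents": [{"parts": [{"text": text}]}],
        "generationConfig": {
            # Ask for audio output instead of text.
            "responseModalities": ["AUDIO"],
            "speechConfig": {
                "voiceConfig": {
                    # The prebuilt voice that should be used for the whole clip.
                    "prebuiltVoiceConfig": {"voiceName": voice_name}
                }
            },
        },
    }
    return url, payload


def send_tts_request(text, **kwargs):
    """Send the request; needs network access and a valid API key."""
    url, payload = build_tts_request(text, **kwargs)
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Hypothetical environment variable holding the API key.
            "x-goog-api-key": os.environ["GEMINI_API_KEY"],
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

`build_tts_request` only assembles the URL and JSON body, so it can be inspected or logged without making a call. `send_tts_request` performs the actual HTTP POST.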
I see that many people have complained about a similar issue. Is there any workaround, such as using the Vertex AI endpoints or another API, to get a consistent voice?
Any help would be much appreciated.