Gemini-tts in Python seems to revert to v1 after a while? "400 Prompt is only supported for Gemini TTS"

I just ran the example code from the documentation (https://docs.cloud.google.com/text-to-speech/docs/gemini-tts#perform_synchronous_single-speaker_synthesis), and it worked fine a couple of minutes ago. Then I installed some other audio-processing pip packages and ran the code again, and it gave me this 400 error. I noticed that the stack trace somehow leads back to the `texttospeech_v1` package. I checked my Text-to-Speech API quota and nothing is exceeded.


The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "generate_audio.py", line 73, in <module>
    synthesize_freeform(prompt_str, words, speaker_name, output_filepath=output_path_root+speaker_name+"/")
    ~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "generate_audio.py", line 44, in synthesize_freeform
    response = client.synthesize_speech(
        input=synthesis_input, voice=voice, audio_config=audio_config
    )
  File "$HOME$/miniconda3/lib/python3.13/site-packages/google/cloud/texttospeech_v1/services/text_to_speech/client.py", line 946, in synthesize_speech
    response = rpc(
        request,
        ...<2 lines>...
        metadata=metadata,
    )
  File "$HOME$/miniconda3/lib/python3.13/site-packages/google/api_core/gapic_v1/method.py", line 131, in __call__
    return wrapped_func(*args, **kwargs)
  File "$HOME$/miniconda3/lib/python3.13/site-packages/google/api_core/grpc_helpers.py", line 77, in error_remapped_callable
    raise exceptions.from_grpc_error(exc) from exc
google.api_core.exceptions.InvalidArgument: 400 Prompt is only supported for Gemini TTS.
  • Python version: 3.13
  • google-cloud-texttospeech version: 2.33.0
  • Location: Canada
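
Since the error appeared right after installing other pip packages, one thing worth ruling out is that one of those installs changed the client library version. Below is a minimal sketch, using only the standard library, that reports what is currently installed. The helper name `installed_versions` is hypothetical, and the package list is just the two distributions visible in the traceback paths.

```python
from importlib import metadata


def installed_versions(packages):
    """Return {package_name: version string, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed in this environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Distributions that show up in the traceback; extend the list as needed.
    report = installed_versions(["google-cloud-texttospeech", "google-api-core"])
    for name, version in report.items():
        print(f"{name}: {version or 'not installed'}")
```

If the reported `google-cloud-texttospeech` version still matches the one listed above, a silent downgrade is probably not the cause. As far as I can tell, `google.cloud.texttospeech` simply re-exports the `texttospeech_v1` module, so seeing that directory in the traceback may be normal rather than a sign of a reverted install.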

Never mind. I just listened to the previously generated audio clips and discovered that the same speaker actually sounds very different from one generation to the next. I'm trying to generate sounds for a cognitive study, and the voice quality and speaking style are not consistent for the same speaker, so this will not do. You need some language scientists to join your team.

Hi @aaakkk420
Have you attempted to use the Gemini 2.5 Pro and Flash TTS models for this specific use case? If you run into any challenges with either model, we would be happy to help.