Gemini-TTS in Python seems to revert to v1 after a while? "400 Prompt is only supported for Gemini TTS"

I just ran the example code from the documentation (https://docs.cloud.google.com/text-to-speech/docs/gemini-tts#perform_synchronous_single-speaker_synthesis), and it worked fine a couple of minutes ago. Then I installed some other audio-processing pip packages and ran the code again, and it gave me this 400 error. I noticed that the stack trace leads back to the `texttospeech_v1` package. I checked my Text-to-Speech API quota and no limit has been exceeded.


    The above exception was the direct cause of the following exception:

    Traceback (most recent call last):
      File "generate_audio.py", line 73, in <module>
        synthesize_freeform(prompt_str, words, speaker_name, output_filepath=output_path_root+speaker_name+"/")
        ~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "generate_audio.py", line 44, in synthesize_freeform
        response = client.synthesize_speech(
            input=synthesis_input, voice=voice, audio_config=audio_config
        )
      File "$HOME$/miniconda3/lib/python3.13/site-packages/google/cloud/texttospeech_v1/services/text_to_speech/client.py", line 946, in synthesize_speech
        response = rpc(
            request,
        ...<2 lines>...
            metadata=metadata,
        )
      File "$HOME$/miniconda3/lib/python3.13/site-packages/google/api_core/gapic_v1/method.py", line 131, in __call__
        return wrapped_func(*args, **kwargs)
      File "$HOME$/miniconda3/lib/python3.13/site-packages/google/api_core/grpc_helpers.py", line 77, in error_remapped_callable
        raise exceptions.from_grpc_error(exc) from exc
    google.api_core.exceptions.InvalidArgument: 400 Prompt is only supported for Gemini TTS.

  • Python version: 3.13
  • google-cloud-texttospeech version: 2.33.0
  • Location: Canada
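
For anyone hitting the same error after installing other packages, one quick check is whether those installs changed the client library or its dependencies. The sketch below prints the installed versions using only the standard library. The package list is an assumption chosen for illustration, and a version change on its own would not necessarily explain the 400. Note also that `texttospeech_v1` in the traceback path is expected, since `google.cloud.texttospeech` re-exports that package, so the path alone does not indicate a downgrade.

```python
from importlib import metadata

# Hypothetical list of distributions worth checking; adjust to your environment.
PACKAGES = ["google-cloud-texttospeech", "google-api-core", "grpcio", "protobuf"]


def installed_version(dist_name):
    """Return the installed version string, or None if the distribution is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def report(packages=PACKAGES):
    """Map each distribution name to its installed version (or None)."""
    return {name: installed_version(name) for name in packages}


if __name__ == "__main__":
    for name, version in report().items():
        print(f"{name}: {version or 'not installed'}")
```

Running this before and after a `pip install` and comparing the two outputs shows whether any of these distributions was upgraded or downgraded as a side effect.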

Never mind. I just listened to the previously generated audio clips and discovered that each generation from the same speaker actually sounds very different. I'm trying to generate audio for a cognitive study, and the voice quality and speaking style are not consistent for the same speaker, so this will not do. You need some language scientists to join your team.