Troubleshooting broken audio with Gemini 2.5 TTS

I’m working with the Gemini 2.5 Flash/Pro Preview TTS model.

I followed the snippet from the documentation here:

It works perfectly on my local machine (MacBook Pro) when running through my API wrapper.

However, when I deploy it to my development server with Docker (running as a Kubernetes service), the generated WAV file contains pure noise (broken audio). I’m still using the same function from the guide to write the PCM data into a WAV file:

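import wave

def wave_file(filename, pcm, channels=1, rate=24000, sample_width=2):
    """Write raw PCM bytes to a WAV file (defaults: mono, 24 kHz, 16-bit)."""
    with wave.open(filename, "wb") as wf:
        wf.setnchannels(channels)
        wf.setsampwidth(sample_width)  # bytes per sample; 2 means 16-bit
        wf.setframerate(rate)          # samples per second
        wf.writeframes(pcm)            # pcm must be raw bytes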

Has anyone encountered this issue in a containerized environment, or does anyone know what might cause the WAV output to break?

Hi @dio.dzaky,

“It works on my machine but not in another setup” is a classic problem.

Let’s bisect this issue:

The fact that you are getting “pure noise” is a strong clue that the raw audio data is being misinterpreted at a binary level.

The exact format of the audio data returned by the Gemini TTS API is:

Sample rate: 24,000 Hz (rate=24000)
Channels: 1, mono (channels=1)
Sample width: 16 bits, i.e. 2 bytes per sample (sample_width=2)

Reference doc
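
To rule out a header problem first, you could read back the WAV file produced on the server and compare its header with the format listed above. This is only a minimal sketch: describe_wav, header_mismatches and EXPECTED are hypothetical names I made up for illustration, and it only uses the standard-library wave module.

```python
import wave

# Expected header values, taken from the format listed above.
EXPECTED = {"channels": 1, "sample_width": 2, "rate": 24000}

def describe_wav(path):
    """Return the header fields of a WAV file as a dict."""
    with wave.open(path, "rb") as wf:
        return {
            "channels": wf.getnchannels(),
            "sample_width": wf.getsampwidth(),  # bytes per sample
            "rate": wf.getframerate(),
            "frames": wf.getnframes(),
        }

def header_mismatches(path):
    """Return {field: (actual, expected)} for every field that differs."""
    info = describe_wav(path)
    return {
        key: (info[key], want)
        for key, want in EXPECTED.items()
        if info[key] != want
    }
```

If header_mismatches returns an empty dict for the noisy file, the header is fine and the noise is more likely coming from the PCM bytes themselves.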

The problem almost certainly lies in how the raw PCM bytes are being handled or written within your Docker environment or how your program is configured to read this information.
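
One way to test that theory is to inspect the payload right before it reaches wave_file. The sketch below assumes one possible cause that is not confirmed anywhere in this thread: that the server environment hands you the audio still base64-encoded (as text or as ASCII bytes) instead of as raw PCM. looks_like_base64 and normalize_pcm are hypothetical helpers, and the check is only a heuristic.

```python
import base64
import string

# Byte values that can appear in standard base64 text (plus line breaks).
_B64_BYTES = set((string.ascii_letters + string.digits + "+/=\r\n").encode("ascii"))

def looks_like_base64(data: bytes) -> bool:
    """Heuristic: real PCM audio almost never consists only of base64 characters."""
    return len(data) > 0 and all(b in _B64_BYTES for b in data)

def normalize_pcm(data):
    """Return raw PCM bytes, decoding base64 if the payload appears to be encoded.

    Assumption: the payload is either raw PCM or standard base64 of raw PCM.
    """
    if isinstance(data, str):
        data = data.encode("ascii")
    data = bytes(data)
    if looks_like_base64(data):
        return base64.b64decode(data)
    return data
```

For example, you could log type(data) and len(data) in both environments, then call wave_file(filename, normalize_pcm(data)). With 16-bit mono audio, the length of the normalized bytes should be an even number. If the type or length differs between your MacBook and the container, that points at a difference in SDK version or response handling rather than at the WAV writer.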

Hope this helps identify the issue.

Happy coding 🙂