I wish Gemini TTS could support AAC or M4A as a response format

Currently, the Gemini text-to-speech (TTS) interface (gemini-2.5-pro-preview-tts) “only” supports Linear-PCM encapsulated in a WAV container as the output format. The official documentation does not provide configuration parameters to directly encode speech into AAC/M4A.
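For anyone following along, here is a rough sketch of the "PCM wrapped in WAV" step using only Python's standard `wave` module. The defaults (mono, 24 kHz, 16-bit samples) are my assumption about what the TTS output looks like, and `write_wav` is just a helper name made up for this post, so please check the current docs before relying on those values.

```python
import wave


def write_wav(path, pcm, channels=1, rate=24000, sample_width=2):
    """Wrap raw Linear-PCM bytes in a WAV container.

    The defaults (mono, 24 kHz, 2 bytes per sample) are assumptions
    about the TTS output, not values confirmed in this thread.
    """
    with wave.open(path, "wb") as wf:
        wf.setnchannels(channels)
        wf.setsampwidth(sample_width)
        wf.setframerate(rate)
        wf.writeframes(pcm)
    return path


if __name__ == "__main__":
    # Placeholder audio: 0.1 s of silence instead of a real TTS response.
    silence = b"\x00\x00" * 2400
    write_wav("speech.wav", silence)
```

Anything beyond this (AAC, M4A, MP3) currently needs a separate encoding step on the client side.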

I really wish it could support AAC/M4A format in the response. Thanks!

“Modified by moderator” It causes me to be very impolite to gemmy sometimes. It really would be nice to drive to work and converse about what’s on your mind. It could take the place of lying in bed way too late at night reading Wikipedia articles, like I did a few years ago.

In general though, yeah totally agree and support.

Hi @Will , Welcome to the forum.

Thank you for the feedback. We have noted your request for AAC/M4A output support in Gemini TTS and will share it with the team for consideration.

Surely you can just convert WAV=>AAC/M4A, right?

Frustrating, I know,
but single-step solution. :+1:

I use mobile devices only - Android/Samsung
(phone + tablet + BT keyboard)

No PC… Sorry… :cowboy_hat_face:

Here are my steps for converting WAV=>MP3:

  1. MiXplorer (my main file browser for years :+1:)
  2. Select .wav file
  3. Rename: change the .wav [file extension] to “.mp3”

That’s it.

Works too with images generated as
PNG=>JPG
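One caveat on the rename steps above: changing the extension does not re-encode anything. The bytes inside the file stay the same, and the renamed file plays only because many apps detect the real format from the file header rather than from the name. A small sketch to check this, using standard-library Python only (`sniff_audio_container` and `demo_rename` are made-up helper names, and the header checks are deliberately rough):

```python
import os
import shutil
import tempfile
import wave


def sniff_audio_container(path):
    """Guess the container from the first bytes, ignoring the extension."""
    with open(path, "rb") as f:
        header = f.read(12)
    if header[:4] == b"RIFF" and header[8:12] == b"WAVE":
        return "wav"
    if header[4:8] == b"ftyp":
        return "mp4/m4a"
    # ID3 tag or an MPEG audio frame sync (rough check only).
    if header[:3] == b"ID3" or (
        len(header) >= 2 and header[0] == 0xFF and (header[1] & 0xE0) == 0xE0
    ):
        return "mpeg-audio"
    return "unknown"


def demo_rename():
    """Write a tiny WAV, copy it under an .mp3 name, and sniff the copy."""
    with tempfile.TemporaryDirectory() as tmp:
        wav_path = os.path.join(tmp, "speech.wav")
        with wave.open(wav_path, "wb") as wf:
            wf.setnchannels(1)
            wf.setsampwidth(2)
            wf.setframerate(24000)
            wf.writeframes(b"\x00\x00" * 2400)  # 0.1 s of silence
        mp3_path = os.path.join(tmp, "speech.mp3")
        shutil.copyfile(wav_path, mp3_path)  # same effect as renaming
        return sniff_audio_container(mp3_path)


if __name__ == "__main__":
    print(demo_rename())  # "wav": the content is still a WAV file
```

So the rename can help with apps that only look at the file name, but the file stays as large as the original WAV, and stricter uploaders may reject it.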

MP3 & JPG are more “user-friendly” for
sharing or uploading in my experience.

WordPress, Email, Social Media,
Google Workspace…

Maybe Gemini + NotebookLM,
AI Studio + Labs will eventually
get to AAC or other export formats.

But until then, we can try & find
The most efficient work-arounds
to get the desired end result…

Try searching:
“open-source AAC converter”

Takes no longer than hitting
Generate Deep Dive :slightly_smiling_face:
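As one example of the open-source converter route, here is a minimal sketch that shells out to ffmpeg to do a real WAV to AAC/M4A encode. It assumes ffmpeg is installed and on your PATH (so a PC or server, not the mobile-only setup above); the function names and the 128k bitrate are just illustrative choices.

```python
import shutil
import subprocess


def build_ffmpeg_command(src, dst, bitrate="128k"):
    """Build the ffmpeg argument list for a WAV -> AAC (.m4a) encode."""
    # -y overwrites dst if it exists; -c:a aac selects ffmpeg's AAC encoder.
    return ["ffmpeg", "-y", "-i", src, "-c:a", "aac", "-b:a", bitrate, dst]


def wav_to_m4a(src, dst, bitrate="128k"):
    """Re-encode src as AAC in an M4A container. Requires ffmpeg on PATH."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg not found on PATH")
    subprocess.run(build_ffmpeg_command(src, dst, bitrate), check=True)
    return dst


if __name__ == "__main__":
    wav_to_m4a("speech.wav", "speech.m4a")
```

Unlike a rename, this actually re-encodes the audio, so the output is a much smaller file that AAC/M4A-only players and uploaders will accept.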
