Live API -- support for mulaw (g711_ulaw) input/output?

MB_AST · May 29, 2025, 7:10pm

Hi – the gemini-2.0-flash-live “family” of models has been great.

It would be super useful if, in addition to PCM16 i/o, these streaming models supported mulaw / g711_ulaw. Is that possible?

My application pipes data to/from a phone system, and currently I have to re-encode to/from PCM16 on the fly. This works, but the audio is choppy. The OpenAI realtime system supports g711_ulaw i/o and it's much smoother, but I'd rather stick with Gemini.
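
For anyone facing the same workaround, here is a minimal sketch of the on-the-fly re-encoding described above: a pure-Python G.711 mu-law codec that converts between 8-bit mu-law bytes and 16-bit little-endian PCM. The function names (`ulaw_to_pcm16`, `pcm16_to_ulaw`) are made up for illustration and are not part of any Gemini SDK. It only covers the codec step; telephony mu-law is typically 8 kHz, so resampling to whatever rate the Live API expects would be a separate step that is not shown here.

```python
import struct

BIAS = 0x84   # 132, standard G.711 mu-law bias
CLIP = 32635  # largest magnitude that still fits after adding the bias


def _encode_sample(sample: int) -> int:
    """Encode one signed 16-bit PCM sample as one mu-law byte."""
    sign = 0x80 if sample < 0 else 0x00
    magnitude = min(abs(sample), CLIP) + BIAS
    # Find the segment (exponent): position of the highest set bit, 7..14.
    exponent, mask = 7, 0x4000
    while exponent > 0 and not (magnitude & mask):
        exponent -= 1
        mask >>= 1
    mantissa = (magnitude >> (exponent + 3)) & 0x0F
    # mu-law bytes are transmitted bit-inverted.
    return ~(sign | (exponent << 4) | mantissa) & 0xFF


def _decode_byte(value: int) -> int:
    """Decode one mu-law byte to one signed 16-bit PCM sample."""
    value = ~value & 0xFF
    exponent = (value >> 4) & 0x07
    mantissa = value & 0x0F
    sample = (((mantissa << 3) + BIAS) << exponent) - BIAS
    return -sample if value & 0x80 else sample


# Precompute the 256-entry decode table so per-chunk decoding is a lookup.
_DECODE_TABLE = [_decode_byte(b) for b in range(256)]


def ulaw_to_pcm16(data: bytes) -> bytes:
    """Convert mu-law bytes to 16-bit little-endian PCM bytes."""
    samples = [_DECODE_TABLE[b] for b in data]
    return struct.pack("<%dh" % len(samples), *samples)


def pcm16_to_ulaw(data: bytes) -> bytes:
    """Convert 16-bit little-endian PCM bytes to mu-law bytes."""
    count = len(data) // 2  # ignore a trailing odd byte, if any
    samples = struct.unpack("<%dh" % count, data[: 2 * count])
    return bytes(_encode_sample(s) for s in samples)
```

Both functions are stateless, so they can be called on each chunk as it arrives from the phone system or the model. One thing to watch in a streaming setup is that PCM16 chunks may be split on an odd byte boundary; this sketch simply drops a trailing odd byte, so a real pipeline would want to carry it over to the next chunk instead.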

chunduriv · May 29, 2025, 8:04pm

Hi @MB_AST,

Thank you for your valuable suggestions. We appreciate your input and will be sure to share this with the team.
