The Stream Realtime tab in Google AI Studio can display both the audio and the transcript of the model’s response, but the multimodal live API demo on GitHub does not display the transcript along with the audio response. How do I adjust the code to make it display the transcript too?…

I have the same question. This is what I get from the API: `interface GenerativeContentBlob { mimeType: string; data: string; }` So it seems the API does not return a transcript, only audio. I thought of sending the audio to some audio-to-text API to get a transcript, but I don’t like this idea. OpenAI’s API jus…

Hello, I’m facing the same issue. Did you find any fix or method to get audio and text simultaneously? Please help.

The transcript capability is actually built into the Live API. You’ll need to add `outputAudioTranscription: {}` to your session config:

```
config = {
    "response_modalities": ["AUDIO"],
    "output_audio_transcription": {}
}
```

Then in your message handler, check for `message.serverContent?.ou…
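For anyone landing here, this is a minimal Python sketch of the handler side, using mock messages so it runs without a session or an API key. The field path `server_content.output_transcription.text` is an assumption on my part (the reply above is cut off before it names the field), and `extract_transcript_chunk`, `collect_transcript`, and `fake_messages` are hypothetical names used only for illustration. Check the path against the SDK version you are using.

```python
from types import SimpleNamespace

# Session config as given in the reply above: audio output plus transcription.
CONFIG = {
    "response_modalities": ["AUDIO"],
    "output_audio_transcription": {},
}


def extract_transcript_chunk(message):
    """Return the transcript text carried by one server message, or "" if none.

    Assumption: the chunk lives at
    message.server_content.output_transcription.text (not confirmed by this thread).
    """
    server_content = getattr(message, "server_content", None)
    transcription = getattr(server_content, "output_transcription", None)
    return getattr(transcription, "text", None) or ""


def collect_transcript(messages):
    """Concatenate transcript chunks from an iterable of server messages."""
    return "".join(extract_transcript_chunk(m) for m in messages)


# Hypothetical stand-ins for streamed messages; a real session would yield these.
fake_messages = [
    SimpleNamespace(server_content=SimpleNamespace(
        output_transcription=SimpleNamespace(text="Hello, "))),
    # A chunk that carries audio only, with no transcription attached.
    SimpleNamespace(server_content=SimpleNamespace(output_transcription=None)),
    SimpleNamespace(server_content=SimpleNamespace(
        output_transcription=SimpleNamespace(text="world."))),
]

print(collect_transcript(fake_messages))
```

In a real session you would call `extract_transcript_chunk` on each message as it arrives and append the result to whatever you display next to the audio. Transcript text tends to arrive in small pieces, so accumulate it rather than overwrite it.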

Realtime Transcription in Multimodal Live API

Shubham_Sahu May 6, 2025, 2:20pm 7

It gives you the transcript. You can find the solution here: Will it be possible to receive text and audio data in the multimodal API?
