Realtime Transcription in Multimodal Live API

The API does give you the transcript. The solution is in this thread: "Will it be possible to receive text and audio data in the multimodal API?"