It gives you the transcript. You can find the solution in this thread: "Will it be possible to receive text and audio data in the multimodal API?"