To keep a chat log and store it efficiently in my app's database, it would be extremely helpful if Gemini 2.0 could return a transcription of what I say in Live Conversation mode, so that I can keep track of what was said. I am also still waiting for the API to provide the text together with the audio chunks that Gemini 2.0 returns.