I’m using this Gemini Live API tutorial now: cookbook/quickstarts/Get_started_LiveAPI.py at c063f0dbf13aa0da5f4b75931e174f9e02f16bce · google-gemini/cookbook · GitHub
Is there a way, maybe a flag to set, to print the transcript of our speech as well?
AWS Nova Sonic shows that by default. I need something similar here too.
Hey @Tina_Jasmine, you can pass the input_audio_transcription parameter in the config to retrieve the input audio transcription.
config = {
    "response_modalities": ["TEXT"],
    "input_audio_transcription": {},
}
Please refer to the documentation for more details.
Thanks.
But when I add it, voice doesn't work anymore. How can I keep both?
Hey, I just checked on my end and it worked fine. You just need to update the receive_audio function.
async def receive_audio(self):
    """Background task that reads from the websocket and writes PCM chunks to the output queue."""
    while True:
        turn = self.session.receive()
        async for response in turn:
            if data := response.data:
                self.audio_in_queue.put_nowait(data)
                continue
            if text := response.text:
                print(text, end="")
            # server_content can be None on some messages, so check it before reading transcripts.
            content = response.server_content
            if content and content.output_transcription:
                print("Transcript:", content.output_transcription.text)
            if content and content.input_transcription:
                print("Transcript:", content.input_transcription.text)
        # The turn has ended (possibly interrupted): drop any queued audio so playback stops.
        while not self.audio_in_queue.empty():
            self.audio_in_queue.get_nowait()
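
On keeping both voice and transcripts: the voice most likely stopped because the config above sets the response modality to "TEXT". A minimal sketch of one way to combine them follows, assuming the modality is left as "AUDIO" and that output_audio_transcription is requested next to input_audio_transcription. The extract_transcripts helper and the "You"/"Gemini" labels are hypothetical, not part of the SDK; the helper only reads the same server_content attributes that receive_audio above already uses, and the fake object is a stand-in so the snippet runs without a live session.

```python
from types import SimpleNamespace

# Assumption: keeping "AUDIO" as the response modality preserves voice output,
# while the two transcription keys request text for both sides of the call.
CONFIG = {
    "response_modalities": ["AUDIO"],
    "input_audio_transcription": {},   # transcript of what you say
    "output_audio_transcription": {},  # transcript of what the model says
}


def extract_transcripts(response):
    """Return (speaker, text) pairs found in one Live API response.

    Hypothetical helper, not part of the SDK. It relies only on the
    attributes already read in receive_audio.
    """
    found = []
    content = getattr(response, "server_content", None)
    if content is None:  # some messages carry no server_content
        return found
    user = getattr(content, "input_transcription", None)
    if user is not None and user.text:
        found.append(("You", user.text))
    model = getattr(content, "output_transcription", None)
    if model is not None and model.text:
        found.append(("Gemini", model.text))
    return found


# Stand-in object shaped like a response, for trying the helper offline.
fake = SimpleNamespace(
    server_content=SimpleNamespace(
        input_transcription=SimpleNamespace(text="hello"),
        output_transcription=None,
    )
)
for speaker, text in extract_transcripts(fake):
    print(f"{speaker}: {text}")
```

Inside receive_audio, the two transcript checks could then be replaced by a loop over extract_transcripts(response), which keeps the audio queue handling untouched and labels each side of the conversation separately.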