Periodically, when generating audio from a short text (e.g., 500 characters), the resulting file is approximately 10 minutes and 55 seconds long: a few spoken words, then silence. Sending the same request again produces normal audio. I tried using Python, Ruby, and cURL. As I understand it, the duration of 10 minutes and 55 seconds corresponds to the 16,000-token limit, after which Google forcibly terminates the generation. Has anyone else encountered this?
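A quick back-of-envelope check of that guess, as a rough sketch only. The two candidate limits below (16,000 and 16,384 tokens) are assumptions on my part, not documented values, and `implied_rate` is just a helper name for illustration. It shows what audio token rate each limit would imply if the whole 655 seconds were generated output:

```python
# Observed length of the broken files: 10 minutes 55 seconds.
DURATION_S = 10 * 60 + 55  # 655 seconds


def implied_rate(token_limit: int, duration_s: int = DURATION_S) -> float:
    """Tokens per second of audio if `token_limit` tokens filled `duration_s` seconds."""
    return token_limit / duration_s


# Both limits are guesses; neither is confirmed by the API documentation.
for limit in (16_000, 16_384):
    print(f"{limit} tokens / {DURATION_S} s = {implied_rate(limit):.2f} tokens per second")
```

Both candidates land close to a round 25 tokens per second, which would be consistent with generation running until it hits an output cap, but this is not proof of it.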
Hi @user2794,
Welcome to the Google AI Forum!
Thanks for reporting the issue.
Can you share steps to reproduce, including the input prompts and file information?
That will help us reproduce the issue and report it accordingly.
Hello, here is the request I’m sending to Gemini: https://pastebin.com/26jWahmW (text ~750 characters)
And this is the response I’m getting from Gemini: https://drive.google.com/file/d/1E3CTbhsHcmDl2GPI-YWygIx6kpKV2hac/view?usp=sharing (it’s too large for Pastebin).
The resulting audio file is 10 minutes and 55 seconds long, but the sound disappears after 17 seconds, followed by silence.
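Since a retry usually gives normal audio, one possible stopgap is to detect the long silent tail and request again. This is only a sketch under assumptions: it treats the decoded audio as raw 16-bit little-endian mono PCM at 24 kHz, the silence threshold and the 5-second cutoff are arbitrary, and `generate` is a hypothetical callable standing in for whatever code makes the TTS request and returns the decoded bytes.

```python
import struct

SAMPLE_RATE = 24_000      # assumption: 24 kHz mono output
SILENCE_THRESHOLD = 500   # assumption: |amplitude| below this counts as silence


def audio_stats(pcm: bytes, sample_rate: int = SAMPLE_RATE,
                threshold: int = SILENCE_THRESHOLD):
    """Return (total_seconds, last_sound_seconds) for 16-bit little-endian mono PCM."""
    usable = len(pcm) - (len(pcm) % 2)  # drop a dangling odd byte, if any
    last_loud = -1
    for i, (sample,) in enumerate(struct.iter_unpack("<h", pcm[:usable])):
        if abs(sample) >= threshold:
            last_loud = i
    total = (usable // 2) / sample_rate
    return total, (last_loud + 1) / sample_rate


def looks_truncated(pcm: bytes, max_trailing_silence_s: float = 5.0) -> bool:
    """True if the audio ends with more trailing silence than the cutoff allows."""
    total, last_sound = audio_stats(pcm)
    return total - last_sound > max_trailing_silence_s


def generate_with_retry(generate, attempts: int = 3) -> bytes:
    """Call the hypothetical `generate()` until it returns audio without a long silent tail."""
    for _ in range(attempts):
        pcm = generate()
        if not looks_truncated(pcm):
            return pcm
    raise RuntimeError("all attempts produced mostly silent audio")
```

For the file described above, `audio_stats` should report roughly 655 seconds in total with the last sound near 17 seconds. The pure-Python loop is slow on an 11-minute file, so a vectorized version would be preferable for anything beyond occasional checks.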
Yes, this is a persistent problem. I actually switched to ElevenLabs because I couldn't solve it and it was driving me crazy. I regularly got 10 minutes and 55 seconds of audio, while at other times the exact same code, script, and text would work, so it doesn't seem to have anything to do with the code or the input text.
Is there any update on this issue? I’m having the same problem with gemini-2.5-pro-preview-tts. I wonder if all those minutes of silence are billed as useful output tokens…