Troubleshooting incomplete transcriptions with Gemini Pro

Vaibhav_Sharma1 · October 7, 2025, 7:30am

Hi everyone,

I’m using the Gemini Pro model with the Google GenAI SDK for asynchronous audio transcription. When I upload and transcribe a small number of files, everything works great — I get accurate and complete transcriptions for each file.

However, when I upload and process multiple files, some of the files — usually the ones uploaded last — return incomplete or inaccurate transcriptions. The same files work fine when transcribed individually.

I’m using the official async SDK flow for transcription. My environment variables and model configuration are correctly set.

It seems like the issue may relate to:

Token limits or truncation during batch processing
Possible concurrency or rate-limiting behavior in the Gemini API
SDK handling of multiple parallel requests

Has anyone else run into similar problems with Gemini Pro or found best practices for handling batch audio transcription reliably?
Also, is there any guidance on optimal batch size or recommended concurrency limits for transcription workloads?

Shivam_Singh2 · October 8, 2025, 9:21am

Hi @Vaibhav_Sharma1
Welcome to the forum!!!

Could you please share the complete payload details along with the steps to reproduce the issue? This will help us investigate it more accurately and provide a more precise response.

Vaibhav_Sharma1 · October 10, 2025, 6:19am

Hi @Shivam_Singh2
Thank you for the warm welcome!

I am sending the following payload.
{
  "audio_file_path": "s3://your-bucket/path/to/audio.mp3",
  "prompt": "Prompt",
  "generation_config": {
    "temperature": 0.3,
    "response_mime_type": "application/json",
    "max_output_tokens": ""
  },
  "model": "gemini-2.5-pro"
}

Shivam_Singh2 · October 15, 2025, 5:43am

Hi @Vaibhav_Sharma1
Thank you for your patience.

We tested the behavior on our side using the same Gemini Pro model and the official async SDK, and were able to transcribe batches of 6–10 audio files successfully, without any truncation or accuracy issues.

To improve reliability on your end, we recommend the following: limit concurrent transcription requests to around 3–5 at a time; explicitly set the max_output_tokens parameter (e.g., to 4096 or higher) so that output is not cut off at an unintended limit; and add retry logic with exponential backoff for any failed or incomplete responses.
Also, ensure that your audio files follow a consistent format (e.g., MP3 at 16 kHz), and consider breaking up larger batches into smaller groups for better stability.
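
A minimal sketch of the concurrency cap and exponential backoff described above, using only asyncio so that it does not depend on any particular SDK version. The names transcribe_all, transcribe_with_retry, and the transcribe_one callback are hypothetical; transcribe_one is assumed to be your own async wrapper around the SDK call that returns the text plus a flag saying whether the response finished cleanly.

```python
import asyncio
import random
from typing import Awaitable, Callable, Dict, List, Tuple, Union

# Hypothetical callback type: takes a file path and returns
# (transcript_text, finished_cleanly). You supply the implementation.
TranscribeFn = Callable[[str], Awaitable[Tuple[str, bool]]]


async def transcribe_with_retry(
    path: str,
    transcribe_one: TranscribeFn,
    max_attempts: int = 4,
    base_delay: float = 1.0,
) -> str:
    """Retry one file, doubling the wait after each failed or incomplete try."""
    last_error: Exception = RuntimeError(f"no attempts made for {path}")
    for attempt in range(max_attempts):
        try:
            text, complete = await transcribe_one(path)
            if complete:
                return text
            last_error = RuntimeError(f"incomplete transcription for {path}")
        except Exception as exc:  # e.g. rate-limit or transient server errors
            last_error = exc
        if attempt < max_attempts - 1:
            # Exponential backoff with a little jitter to avoid retry bursts.
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            await asyncio.sleep(delay)
    raise last_error


async def transcribe_all(
    paths: List[str],
    transcribe_one: TranscribeFn,
    max_concurrency: int = 4,
    max_attempts: int = 4,
    base_delay: float = 1.0,
) -> Dict[str, Union[str, Exception]]:
    """Transcribe every path with at most max_concurrency requests in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def worker(path: str) -> Tuple[str, Union[str, Exception]]:
        async with semaphore:
            try:
                text = await transcribe_with_retry(
                    path, transcribe_one, max_attempts, base_delay
                )
                return path, text
            except Exception as exc:
                # Keep the failure per file instead of aborting the batch.
                return path, exc

    results = await asyncio.gather(*(worker(p) for p in paths))
    return dict(results)
```

Inside transcribe_one you would make the actual request with max_output_tokens set explicitly, and report the response as incomplete when the model stopped because it hit the token limit. How that is exposed (for example a finish reason on the response) depends on your SDK version, so please verify it against the SDK reference before relying on it.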
