Gemini 2.5 timestamp references for start and end in the prompt are being ignored

Following the docs for “audio understanding”, I have used timestamp references to transcribe portions of a large audio file with Gemini 2.0, and it seems to work adequately. But with Gemini 2.5 (Flash or Pro), the start and end timestamp references seem to be completely ignored, and Gemini responds with a transcription of the entire file. I would love to hear whether other folks are seeing this too. If so, is the documentation wrong for 2.5 because the feature has been removed, or is this a bug in 2.5's audio processing?

I have cases where I want to process audio files in chunks, and timestamp references were great for not having to split and upload separate files. Like I said, this worked “alright” in 2.0 (notwithstanding the output timestamp inaccuracies). I wanted to use 2.5 Flash more, as it seems to do a much better job transcribing and gives much more accurate timestamps.

I have mainly been testing this in Google AI Studio, trying to get the prompt to work properly, to no avail.

Reference to Docs: Audio understanding  |  Gemini API  |  Google AI for Developers

Hi @warmbowski,

Welcome to the community!

Thank you for reporting this issue. I have observed that the timestamp reference for the start time is working as expected; however, the end time is not working. I will inform the team of this discrepancy.

I cannot get the start timestamp reference to work.

Sometimes, when I include timestamps in the prompt, like “Transcribe the audio from 45:00 to 60:00.”, it does skip the beginning of the file, but it might start at around the 24-minute mark instead. Basically, it never starts or ends at the referenced time, and I cannot make out any rhyme or reason for the times it DOES choose. In addition, I just want to call out that since audio can be hours long, the format for start and end reference times should be hh:mm:ss.
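For anyone trying to reproduce this, here is a minimal sketch of how the per-chunk prompts could be generated with hh:mm:ss references. The helper names (`to_hhmmss`, `chunk_prompts`) are made up for illustration, and the code only builds prompt strings; it makes no claim that the model honors them.

```python
def to_hhmmss(total_seconds: int) -> str:
    """Format a non-negative number of seconds as hh:mm:ss."""
    hours, rem = divmod(int(total_seconds), 3600)
    minutes, seconds = divmod(rem, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


def chunk_prompts(duration_s: int, chunk_s: int) -> list[str]:
    """Build one transcription prompt per chunk_s-second window.

    The prompt wording mirrors the one quoted in this thread; the
    hh:mm:ss reference format is an assumption, not documented behavior.
    """
    prompts = []
    for start in range(0, duration_s, chunk_s):
        end = min(start + chunk_s, duration_s)  # last chunk may be shorter
        prompts.append(
            f"Transcribe the audio from {to_hhmmss(start)} to {to_hhmmss(end)}."
        )
    return prompts
```

For a 90-minute file in 15-minute chunks, `chunk_prompts(5400, 900)` gives six prompts, and the fourth one covers 00:45:00 to 01:00:00.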

This may be a secondary issue that deserves its own topic, but when I specify in the prompt that output timestamps should be formatted as hh:mm:ss, it actually gives timestamps that represent mm:ss:milliseconds. It does this consistently.
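As a stopgap, the returned timestamps could be normalized after the fact. This is only a sketch: the function name `mmssms_to_hhmmss` is hypothetical, and it assumes every returned stamp really is minutes:seconds:milliseconds, with the minutes field allowed to run past 59.

```python
def mmssms_to_hhmmss(stamp: str) -> str:
    """Reinterpret an 'mm:ss:ms' string as hh:mm:ss.

    Assumes three colon-separated integer fields: minutes, seconds,
    milliseconds. The sub-second part is dropped.
    """
    minutes, seconds, millis = (int(part) for part in stamp.split(":"))
    total_ms = (minutes * 60 + seconds) * 1000 + millis
    total_s = total_ms // 1000  # truncate to whole seconds
    hours, rem = divmod(total_s, 3600)
    mins, secs = divmod(rem, 60)
    return f"{hours:02d}:{mins:02d}:{secs:02d}"
```

Under that assumption, a stamp like `75:30:250` becomes `01:15:30`.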

Let me know if you know of any better ways to prompt for timestamp format on transcripts with entries past the hour mark.