Audio timestamp accuracy issue in Gemini 2.0 GA models

rlev · March 14, 2025, 10:35am

Hi,

As other users have already reported, audio timestamp accuracy in Gemini 2.0 models has been unreliable since the models moved from preview to GA.

Through testing, I have found that if the same audio clip (tested with MP3 files) is converted to a video format (tested with MP4 files with a solid background), the timestamps are accurate. This suggests the issue may be specific to how the model processes standalone audio files. The conversion works as a workaround, but it is not ideal because it increases input tokens roughly 10x.
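For anyone who wants to reproduce the workaround, here is a minimal Python sketch of the conversion. It assumes ffmpeg is installed and on the PATH. The function names, the 320x240 black frame, and the 1 fps rate are illustrative choices, not details from the original test. The token rates in the second half are also assumptions: roughly 32 tokens per second for audio and roughly 300 tokens per second for video, as I understand the Gemini documentation. Please verify them against the current docs before relying on the estimate.

```python
import subprocess
from pathlib import Path

# Assumed per-second token rates (approximate; verify against the Gemini docs).
AUDIO_TOKENS_PER_SEC = 32
VIDEO_TOKENS_PER_SEC = 300


def build_ffmpeg_cmd(audio_path, video_path, size="320x240", fps=1):
    """Build an ffmpeg command that pairs the audio with a solid black frame."""
    return [
        "ffmpeg",
        "-y",                    # overwrite the output file if it exists
        "-f", "lavfi",           # first input is a generated video source
        "-i", f"color=c=black:s={size}:r={fps}",
        "-i", str(audio_path),   # second input is the original audio
        "-shortest",             # stop encoding when the audio ends
        "-c:v", "libx264",
        "-pix_fmt", "yuv420p",
        "-c:a", "aac",
        str(video_path),
    ]


def mp3_to_mp4(audio_path, video_path=None):
    """Run ffmpeg and return the path of the generated MP4 (hypothetical helper)."""
    audio_path = Path(audio_path)
    video_path = Path(video_path) if video_path else audio_path.with_suffix(".mp4")
    subprocess.run(build_ffmpeg_cmd(audio_path, video_path), check=True)
    return video_path


def estimate_tokens(duration_sec):
    """Estimate input tokens for the same clip sent as audio vs. as video."""
    return {
        "audio": duration_sec * AUDIO_TOKENS_PER_SEC,
        "video": duration_sec * VIDEO_TOKENS_PER_SEC,
    }
```

Calling mp3_to_mp4("clip.mp3") would write clip.mp4 next to the source file. Under the assumed rates, estimate_tokens(60) gives 1920 tokens for the audio version and 18000 for the video version, a bit over 9x, which is in line with the roughly 10x increase described above.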

Currently, the gemini-2.0-flash-thinking-exp-01-21 model provides accurate audio timestamps, but I am concerned that this functionality might break again when that model moves to GA.

Is anyone at Google aware of this issue, and are there any plans to address it in future updates?
