Timestamp generation (Forced Alignment) on 2.0 production models is still broken

rlev · April 14, 2025, 10:16pm

Timestamp generation for audio files (in all supported formats) is still broken for the Gemini 2.0 production models (2.0-flash and 2.0-flash-lite). Anyone can verify this in AI Studio.

This issue has been brought up several times and there is still no acknowledgement from anyone at Google.

Right now the two 2.0 models in preview, Gemini 2.0 Flash (Image Generation) Experimental and Gemini 2.0 Flash Thinking Experimental 01-2, both produce accurate timestamps; however, they are obviously not suited for production.

Is anyone at Google aware of this issue? Are there any plans to release a 2.0 GA model that produces accurate timestamps, or will this pattern of taking away preview functionality upon general release continue?

Sangeetha_Jana · April 15, 2025, 8:52am

Hey @rlev
I tried uploading an MP3 file in AI Studio and referred to a timestamp, and it worked perfectly fine. Please let me know which file format you used that broke the model.
Below are some of my observations.
The generated output is inaccurate and inconsistent in gemini-2.0-flash and gemini-2.0-flash-lite compared to the gemini-2.5-pro-preview-03-25 and gemini-2.0-flash-exp-image-generation models.
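
For anyone who wants to put a number on "inconsistent", here is a minimal sketch, not an official tool. It assumes you have sent the same audio and prompt to a model several times and collected the returned segment timestamps as "MM:SS" or "HH:MM:SS" strings. The helper names `parse_timestamp` and `timestamp_spread` are made up for illustration.

```python
from statistics import pstdev


def parse_timestamp(ts: str) -> float:
    """Convert 'MM:SS' or 'HH:MM:SS' (fractional seconds allowed) to seconds."""
    seconds = 0.0
    for part in ts.strip().split(":"):
        # Each colon-separated field is worth 60x the field to its right.
        seconds = seconds * 60 + float(part)
    return seconds


def timestamp_spread(runs: list[list[str]]) -> list[float]:
    """Per-segment standard deviation, in seconds, across repeated runs.

    `runs` holds one list of timestamp strings per run; every run must
    report the same number of segments, in the same order.
    """
    if not runs or len({len(run) for run in runs}) != 1:
        raise ValueError("all runs must contain the same number of segments")
    parsed = [[parse_timestamp(ts) for ts in run] for run in runs]
    # zip(*parsed) groups the i-th segment of every run together.
    return [pstdev(segment) for segment in zip(*parsed)]
```

A spread near zero for every segment means the runs agree with each other; it does not by itself mean the timestamps are correct.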

rlev · April 15, 2025, 11:15am

Hi @Sangeetha_Jana, thank you for the response and for looking into this issue.

To clarify: when I say that timestamp generation is “broken”, I mean exactly what you observed: prompts that refer to timestamps get wildly inaccurate and inconsistent timestamps back from the generally available Gemini 2.0 models (2.0-flash and 2.0-flash-lite).
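
To make "inaccurate" concrete, this is roughly how the error can be measured. It is only a sketch: it assumes the model's timestamps and a trusted set of reference timestamps (for example, from a manual annotation) have already been converted to seconds, and `timestamp_error` and its `tolerance` parameter are illustrative names, not part of any Gemini SDK.

```python
def timestamp_error(
    predicted: list[float],
    reference: list[float],
    tolerance: float = 1.0,
) -> dict[str, float]:
    """Compare predicted segment start times against reference times (seconds)."""
    if not reference or len(predicted) != len(reference):
        raise ValueError("predicted and reference must be non-empty and equal length")
    errors = [abs(p - r) for p, r in zip(predicted, reference)]
    return {
        "mean_abs_error": sum(errors) / len(errors),
        "max_abs_error": max(errors),
        # Fraction of segments whose error is within `tolerance` seconds.
        "within_tolerance": sum(e <= tolerance for e in errors) / len(errors),
    }
```

Running the same reference file through each model and comparing these numbers gives a like-for-like view of how far the timestamps drift.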

Currently the only generally available model that returns accurate timestamps is gemini-1.5-flash-001, which will soon be retired.

I have tested MP3, AAC, and WAV audio, all of which show the same issue, and I suspect it applies to all audio formats.

Right now it looks like this specific functionality is always degraded once a model goes from preview to GA. It would be nice to have some clarification on whether this is considered a bug and whether there are any plans to fix it.

Sangeetha_Jana · April 16, 2025, 5:03am

Hey @rlev
We’ve informed the team about the issue. We’ll keep you updated as soon as we have more information.
Appreciate your patience!

Avi_Charkham · April 26, 2025, 8:26am

Is there any way to track progress on this? Right now I have no alternative but to work with Whisper instead, and I really prefer working with Gemini.

rlev · April 28, 2025, 8:45pm

Hi @Sangeetha_Jana, any updates on the issue?

I noticed that pricing has been updated for the gemini 2.0-flash and 2.5-flash preview models to specify audio input pricing. Is this related?
