[Gemini 2.5 Pro] Severe Timestamp/Timecode Jumping Issues in Video Transcription - Need Workarounds

Problem Summary
I’m experiencing consistent timecode jumping issues when using Gemini 2.5 Pro in Google AI Studio for video transcription and image description. Timestamps in the output don’t align with actual video content, making the transcriptions unusable for time-sensitive work.

Technical Details
Model: Gemini 2.5 Pro via Google AI Studio

Task: Audio transcription + image description of video files

File format: MP4 and MOV

Input length: Originally longer videos, now restricted to 25-minute segments

What I’ve Tried
Length restriction: Limited videos to 25 minutes maximum - issue persists

Timecode burn-in: Added visible timecode overlay to video frames and explicitly instructed Gemini to reference the burn-in - no improvement in accuracy

Video splitting: Split longer content into segments, but this produces worse results because of timecode offsets (e.g., a segment starting at the 25:00 mark shows very poor timestamp correlation)

Specific Issues
Timecodes jump erratically and don’t match actual content timing

Model appears to ignore explicit instructions to use visual timecode burn-ins

Temporal offset handling is particularly poor when processing video segments that don’t start at 00:00

Output timestamps sometimes extend beyond actual video duration

Questions
Has anyone found reliable workarounds for timestamp accuracy in Gemini 2.5 Pro?

Are there specific prompt engineering techniques that improve temporal correlation?

Would downgrading to Gemini 2.0 provide better timestamp accuracy?

Are there alternative approaches for handling video segments with temporal offsets?

Expected Outcome
Accurate timestamps that correspond to actual video content, enabling reliable time-based navigation and referencing.

Did anyone manage to transcribe a 25-minute video with correct timestamps? Thanks for your help!

Hi @MaxNewman, the 2.5 series models should offer significantly higher quality than the 2.0 model. By the way, have you tried the latest gemini-2.5-pro-preview-06-05?

I’ve run into all of those problems too.

What I discovered:

  1. Timestamp accuracy goes down the longer your video is.
    • 5 minutes is all right. 3 minutes is better.
    • You will need to split your long video into small segments for more accurate timestamps.
  2. Request timestamps in MM:SS format.
    • From the docs (Video understanding | Gemini API | Google AI for Developers): "Timestamp format: When referring to specific moments in a video within your prompt, use the MM:SS format (e.g., 01:15 for 1 minute and 15 seconds)."
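
To make the splitting step concrete, here is a minimal sketch of how the segment boundaries could be planned. The function name `plan_segments` and the 180-second default are assumptions chosen to match the "3 minutes is better" observation above, not anything prescribed by the Gemini docs. Each (start, duration) pair would then be cut into its own clip with whatever tool you prefer, so that every clip starts at 00:00.

```python
def plan_segments(total_seconds: int, segment_seconds: int = 180) -> list[tuple[int, int]]:
    """Return (start, duration) pairs, in seconds, that cover the whole video.

    segment_seconds=180 is an assumed default (3-minute clips); tune it
    to whatever length gives acceptable timestamps for your material.
    """
    if total_seconds <= 0 or segment_seconds <= 0:
        raise ValueError("durations must be positive")

    segments = []
    start = 0
    while start < total_seconds:
        # The last segment may be shorter than segment_seconds.
        duration = min(segment_seconds, total_seconds - start)
        segments.append((start, duration))
        start += duration
    return segments


# A 25-minute video (1500 s) becomes nine clips; the last one is 60 s long.
for start, duration in plan_segments(1500):
    print(f"start={start:>4}s  duration={duration}s")
```

Keeping the start offset of every clip is what later lets you map the model's clip-relative timestamps back onto the original timeline.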

The model seems to understand MM:SS best. This means you can’t really expect millisecond accuracy for your timestamps.
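
For the offset problem with segments that don't start at 00:00, one possible approach is to let the model always work in clip-relative MM:SS and add the offset yourself afterwards. The sketch below assumes the transcript contains only MM:SS timestamps (no HH:MM:SS) and that you know each clip's start offset and duration; the helper name `to_absolute` and the `[out of range ...]` marker are illustrative choices, not part of any Gemini API.

```python
import re

# Matches MM:SS such as 01:15; seconds are restricted to 00-59.
TIMESTAMP = re.compile(r"\b(\d{1,3}):([0-5]\d)\b")


def to_absolute(text: str, offset_seconds: int, segment_duration: int) -> str:
    """Shift clip-relative MM:SS timestamps in `text` onto the full-video timeline.

    Timestamps that point past the end of the clip are flagged rather than
    shifted, since they cannot correspond to real content in that clip.
    """
    def shift(match: re.Match) -> str:
        local = int(match.group(1)) * 60 + int(match.group(2))
        if local > segment_duration:
            return f"[out of range {match.group(0)}]"
        total = offset_seconds + local
        return f"{total // 60:02d}:{total % 60:02d}"

    return TIMESTAMP.sub(shift, text)


# Clip that starts at 24:00 (1440 s) in the original video and lasts 60 s.
print(to_absolute("00:15 Speaker greets the audience", 1440, 60))
print(to_absolute("02:30 Closing remarks", 1440, 60))
```

This keeps the arithmetic out of the prompt entirely, and the out-of-range marker makes it easy to spot timestamps that extend beyond the actual clip duration.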
