Optimal Video Pre-processing Parameters (FPS, Resolution) for File API

liao_aistoy · June 6, 2025, 10:10am

I have a question regarding the best practices for video processing before uploading files via the File API for video understanding. My goal is to optimize for performance, cost, and efficiency.
The official documentation (Video understanding | Gemini API | Google AI for Developers) states that video is processed at a sample rate of 1 frame per second (fps). This leads to a couple of key questions about pre-processing:

Frame Rate (FPS) Transcoding:
Given that the model samples video at 1fps, is it a recommended best practice to pre-process our videos and transcode them down to 1fps before uploading?
It seems this would significantly reduce the file size, leading to faster uploads and lower storage overhead, without any loss of information that the model would use. Is this assumption correct? Or is there any hidden benefit to uploading a video with a higher frame rate (e.g., 30/60 fps)?
Video Resolution:
I could not find any explicit guidance in the documentation regarding optimal or required video resolutions.
• Is there a recommended resolution (e.g., 480p, 720p, 1080p) for video uploads?
• Does providing a higher resolution (e.g., 1080p or 4K) improve the model’s performance on tasks like object detection or text recognition within the video?
• Or are frames downscaled to a standard internal resolution before processing, making it more efficient to simply upload a standard-definition video?

Krish_Varnakavi1 · June 6, 2025, 9:23pm

First, welcome to the Google AI for Developers Forum!

Thank you for your thoughtful questions regarding video pre-processing for the Gemini File API. Let’s address your concerns:

Frame Rate (FPS) Transcoding:
The Gemini File API samples videos at 1 frame per second (FPS), as detailed in the Video understanding documentation. Pre-processing your videos to 1 FPS before uploading is a recommended practice. This approach reduces file size, leading to faster uploads and lower storage overhead, without any loss of information that the model would use. There’s no hidden benefit to uploading videos with higher frame rates (e.g., 30/60 FPS). In fact, transcoding to 1 FPS aligns with the model’s processing capabilities and optimizes performance.
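As a rough illustration of the transcoding step described above, here is a minimal Python sketch that builds an ffmpeg command to resample a video to 1 FPS before upload. It assumes ffmpeg is installed and on the PATH; the function names and file names are hypothetical, and keeping the audio track unchanged with `-c:a copy` is an assumption rather than something the documentation prescribes.

```python
import subprocess


def build_fps_command(src: str, dst: str, fps: int = 1) -> list[str]:
    """Return an ffmpeg argument list that resamples `src` to `fps` frames per second.

    Assumption: the audio stream is copied as-is so any speech stays available.
    """
    return [
        "ffmpeg",
        "-y",                 # overwrite the output file if it already exists
        "-i", src,            # input video
        "-vf", f"fps={fps}",  # video filter: resample to the target frame rate
        "-c:a", "copy",       # copy the audio stream without re-encoding
        dst,
    ]


def transcode_to_fps(src: str, dst: str, fps: int = 1) -> None:
    """Run ffmpeg; raises CalledProcessError if ffmpeg exits with an error."""
    subprocess.run(build_fps_command(src, dst, fps), check=True)
```

Calling `transcode_to_fps("input.mp4", "input_1fps.mp4")` would write the 1 FPS copy, which could then be uploaded in place of the original.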
Video Resolution:
The Gemini File API processes videos at a default media resolution. While higher resolutions (e.g., 1080p or 4K) may offer more detail, they do not necessarily improve the model’s performance on tasks like object detection or text recognition. In many cases, downscaling to a standard resolution (e.g., 720p) can be more efficient, as the model may internally downscale frames to a standard resolution before processing. Therefore, uploading videos at a standard resolution can help optimize performance and reduce processing time.
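For the downscaling suggestion above, a similar sketch may help. It builds an ffmpeg command that scales a video to 720p while preserving the aspect ratio. It assumes ffmpeg is installed and on the PATH; the function names and file names are hypothetical, and 720 is only the example height mentioned above, not a documented requirement.

```python
import subprocess


def scale_filter(target_height: int = 720) -> str:
    """Return an ffmpeg scale filter for the given height.

    A width of -2 asks ffmpeg to keep the aspect ratio and round the
    width to an even number, which many encoders require.
    """
    return f"scale=-2:{target_height}"


def build_downscale_command(src: str, dst: str, target_height: int = 720) -> list[str]:
    """Return an ffmpeg argument list that rescales `src` to `target_height`.

    Note: this rescales unconditionally, so a source smaller than the
    target would be upscaled; check the source height first if that matters.
    """
    return [
        "ffmpeg",
        "-y",                               # overwrite the output file if present
        "-i", src,                          # input video
        "-vf", scale_filter(target_height), # resize, keeping the aspect ratio
        "-c:a", "copy",                     # copy the audio stream without re-encoding
        dst,
    ]


def downscale(src: str, dst: str, target_height: int = 720) -> None:
    """Run ffmpeg; raises CalledProcessError if ffmpeg exits with an error."""
    subprocess.run(build_downscale_command(src, dst, target_height), check=True)
```

Calling `downscale("input_4k.mp4", "input_720p.mp4")` would produce the smaller file for upload.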

For more detailed information, please refer to the Video understanding documentation.

If you have any further questions or need assistance with video pre-processing, feel free to ask!
