When will Gemini support fine-tuning with images/video data?

jenway · August 14, 2024, 5:00pm

I have a task that takes video/image input and produces text output, which Gemini does not handle very well. I want to fine-tune the model on my own data, but it seems that Gemini currently does not support fine-tuning with multimodal (image/video) data. Since the new Gemini 1.5 Pro and Flash models do have multimodal understanding capabilities, I was wondering: when will Gemini support multimodal fine-tuning?

jenway · August 27, 2024, 5:06pm

Is there any update on this?

Related topics

Topic	Category	Tags	Replies	Views	Activity
Fine tuning a multimodal model	Gemini API	gemini-15, api, fine-tuning	5	605	April 25, 2024
Gemini pro / flash multimodel finetuning	Gemini API	gemini-15, api, models	1	206	August 19, 2024
Are we able to fine tune the video understanding on Gemini 2.5 Pro?	Gemini API	fine-tuning, vision, video, gemini-25	3	140	September 8, 2025
How to trun model with Gemini on Image input and text output?	Gemini API		2	123	June 5, 2024
Can I fine-tune the Multimodal Live API?	Google AI Studio	api	2	145	July 20, 2025