How to tune a Gemini model on image input and text output?

Wonwoo_Nam · June 5, 2024, 7:35pm

It says tuning is only available for Gemini 1.0 Pro, and 1.0 Pro doesn’t accept image input. Is this right? Is there no way for me to tune a model with image input?

grandell1234 · June 5, 2024, 7:42pm

Not at the moment. You could use something like edgeimpulse.com, where you can train an image or video model for free.

It would be great to be able to train Gemini on your own images for your custom use cases.

afirstenberg · June 5, 2024, 8:09pm

Tuning is coming for Gemini 1.5, but details aren’t available yet.

Depending on your needs, you may also wish to take a look at PaliGemma from Google, which takes an image and text as input and produces text output.

