How do I tune a Gemini model with image input and text output?

It says tuning is only available for Gemini 1.0 Pro, and 1.0 Pro doesn’t accept image input. Is this right? Is there no way for me to tune a model with image input?

Not at the moment. In the meantime, you could use something like edgeimpulse.com, where you can train an image or video model for free.

It would be great to be able to train Gemini on your own images for your custom use cases.

Tuning is coming for Gemini 1.5, but details aren’t available yet.

Depending on your needs, you may also wish to take a look at PaliGemma from Google, which takes an image and text as input and produces text output.