How to make a conversation with Gemini that supports pictures

David_Xu · December 3, 2024, 12:55pm

Hello. I’ve gone through the Gemini API documentation and learned a bit about its image processing support. However, the documentation only explains single-turn usage, where the user provides a picture and a question and the model answers. How can I use it in a multi-turn conversation?
For example:
User: text
Gemini: text
User: text + image
Gemini: text
…
How do I send a base64-encoded image through the send_message() function?

David_Xu · December 3, 2024, 1:00pm

OK, I figured it out myself.
Pass a list containing the image part and the prompt to send_message(), like this:

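import base64

import httpx
import google.generativeai as genai

# Assumption: the original post did not show how `models` was built;
# any vision-capable model should work here.
genai.configure(api_key="YOUR_API_KEY")
models = [genai.GenerativeModel("gemini-1.5-flash")]

image_path = "https://upload.wikimedia.org/wikipedia/commons/thumb/8/87/Palace_of_Westminster_from_the_dome_on_Methodist_Central_Hall.jpg/2560px-Palace_of_Westminster_from_the_dome_on_Methodist_Central_Hall.jpg"

# Download the image; .content holds the raw bytes.
image = httpx.get(image_path)

prompt = "Caption this image."
chat = models[0].start_chat(history=[])
# A message can be a list that mixes an inline image part and text.
response = chat.send_message([
    {'mime_type': 'image/jpeg',
     'data': base64.b64encode(image.content).decode('utf-8')},
    prompt
])

print(response.text)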
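
The snippet above sends a single message. To get the multi-turn pattern from the original question (text, then text + image, then more text), it should be enough to keep calling send_message() on the same chat object, since the chat keeps the history. Below is a minimal sketch of that idea, not an official recipe. The helper names image_part and run_conversation are made up for illustration, the prompts are placeholders, and the model name in the usage comment is an assumption.

```python
import base64


def image_part(image_bytes: bytes, mime_type: str = "image/jpeg") -> dict:
    """Wrap raw image bytes in the inline-data dict used in the post above."""
    return {
        "mime_type": mime_type,
        "data": base64.b64encode(image_bytes).decode("utf-8"),
    }


def run_conversation(chat, image_bytes: bytes) -> list:
    """Send a text turn, a text + image turn, and a follow-up on one chat.

    `chat` is anything with a send_message() method whose result has a
    .text attribute, e.g. the object returned by start_chat(history=[]).
    """
    replies = []
    # Turn 1: text only.
    replies.append(chat.send_message("Hello! I will send you a photo next.").text)
    # Turn 2: image and text in one message, as a list.
    replies.append(
        chat.send_message([image_part(image_bytes), "Caption this image."]).text
    )
    # Turn 3: text-only follow-up that refers back to the image.
    replies.append(chat.send_message("What time of day does it look like?").text)
    return replies


# Usage with the google-generativeai package (not executed here; the model
# name is an assumption):
#   import google.generativeai as genai, httpx
#   genai.configure(api_key="YOUR_API_KEY")
#   chat = genai.GenerativeModel("gemini-1.5-flash").start_chat(history=[])
#   print(run_conversation(chat, httpx.get(image_path).content))
```

Because run_conversation only relies on send_message() and .text, it can be tried with a stub chat object before spending API quota.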

MOHD_RAMLAN_BIN_M_RO · December 3, 2024, 1:29pm

Sure! Here are a few options:

Thanks for sharing your solution!
This is helpful, I’ll give it a try.
Great to see you figured it out!
