How to get structured output for image input

There’s no example for this case in the API docs.

Hey @Muhammad_Zafar, welcome to the forum.

Please refer to the sample code below for structured output with image input:

from google import genai
from pydantic import BaseModel


# Schema that the model's JSON output should follow.
class Json(BaseModel):
    title: str
    description: str


# Replace the placeholder string with your actual API key.
client = genai.Client(api_key="GEMINI_API_KEY")

# Upload the image through the Files API.
image_file = client.files.upload(file="sample.jpg")

response = client.models.generate_content(
    model='gemini-2.0-flash',
    # Pass the prompt and the uploaded file as a flat list.
    contents=['Give me title and description of the image.', image_file],
    config={
        'response_mime_type': 'application/json',
        'response_schema': Json,
    },
)
# Use the response as a JSON string.
print(response.text)
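Since response.text comes back as a JSON string, you will probably want to turn it into a typed object before using it. Here is a minimal sketch using only the standard library. The ImageSummary class, the parse_summary helper, and the sample payload are made up for illustration; the actual model output will differ.

```python
import json
from dataclasses import dataclass


@dataclass
class ImageSummary:
    """Mirrors the two fields requested in the response schema."""
    title: str
    description: str


def parse_summary(raw_json: str) -> ImageSummary:
    """Parse a JSON string that has 'title' and 'description' keys."""
    data = json.loads(raw_json)
    # A missing key raises KeyError, which surfaces schema mismatches early.
    return ImageSummary(title=str(data["title"]),
                        description=str(data["description"]))


# Hypothetical payload standing in for response.text.
sample_text = '{"title": "Sunset over a lake", "description": "An orange sky reflected in calm water."}'
summary = parse_summary(sample_text)
print(summary.title)
print(summary.description)
```

If you would rather stay with Pydantic, Json.model_validate_json(response.text) should give the same result with validation included, and the SDK docs also show response.parsed for getting the instantiated object directly.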

Thank you very much for your response. Is it possible with JavaScript as well?

Yes, the JavaScript SDK is supported.

I mean, could we get an example for it as well? I couldn’t find one in the docs. Thanks.