There’s no such example in the API docs for this case.
Hey @Muhammad_Zafar, welcome to the forum.
Please refer to the sample code below for structured output with image input:
from google import genai
from pydantic import BaseModel
class Json(BaseModel):
    title: str
    description: str
client = genai.Client(api_key="GEMINI_API_KEY")
files = [
    client.files.upload(file="sample.jpg"),
]
response = client.models.generate_content(
    model='gemini-2.0-flash',
    # Unpack the uploaded files so contents stays a flat list.
    contents=['Give me title and description of the image.', *files],
    config={
        'response_mime_type': 'application/json',
        'response_schema': Json,
    },
)
# Use the response as a JSON string.
print(response.text)
Thank you very much for your response. Is it possible with JavaScript as well?
Yes, the JavaScript SDK supports this too.
I mean can we get an example for it as well, as I couldn’t find it in docs. Thanks.
Still waiting, help please
You can try the code below:
import {
  GoogleGenerativeAI,
  SchemaType,
} from "@google/generative-ai";
import fs from "fs";
import dotenv from 'dotenv';
dotenv.config();
const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY);
// Converts local file information to base64
function fileToGenerativePart(path, mimeType) {
  return {
    inlineData: {
      data: Buffer.from(fs.readFileSync(path)).toString("base64"),
      mimeType
    },
  };
}
async function run() {
  const schema = {
    description: "List of cities",
    type: SchemaType.ARRAY,
    items: {
      type: SchemaType.OBJECT,
      properties: {
        city: {
          type: SchemaType.STRING,
          description: "Name of the city",
          nullable: false,
        },
      },
      required: ["city"],
    },
  };
  const model = genAI.getGenerativeModel({
    model: "gemini-1.5-pro",
    generationConfig: {
      responseMimeType: "application/json",
      responseSchema: schema,
    },
  });
  const prompt = "List all the cities from the given image";
  const imageParts = [
    fileToGenerativePart("/sample.png", "image/png")
  ];
  // Spread imageParts so the request is a flat array of strings and parts.
  const generatedContent = await model.generateContent([prompt, ...imageParts]);
  console.log(generatedContent.response.text());
}
run();
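Since `response.text()` gives you a JSON string, you will probably want to parse it before using it. Here is a minimal sketch of that step, assuming the response follows the "array of `{ city }` objects" schema above. `parseCities` is a hypothetical helper (not part of the SDK), and `sampleText` is a hand-written stand-in for `generatedContent.response.text()`, not real model output.

```javascript
// Hypothetical helper: parse the model's JSON text and check that it matches
// the "array of { city: string }" shape requested by the schema.
function parseCities(jsonText) {
  const data = JSON.parse(jsonText); // throws SyntaxError on invalid JSON
  if (!Array.isArray(data)) {
    throw new TypeError("Expected a JSON array");
  }
  return data.map((item, index) => {
    if (item === null || typeof item !== "object" || typeof item.city !== "string") {
      throw new TypeError(`Item ${index} has no string "city" field`);
    }
    return item.city;
  });
}

// Stand-in for generatedContent.response.text(); not real model output.
const sampleText = '[{"city": "Paris"}, {"city": "Tokyo"}]';
const cities = parseCities(sampleText);
console.log(cities);
```

The explicit shape check is optional, but it gives a clearer error than a later `undefined` if the response is not what you expected.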
line 1: import { GoogleGenerativeAI, SchemaType } from "@google/generative-ai";
ERROR: Module '"@google/generative-ai"' has no exported member 'SchemaType'.
line 38: responseMimeType: "application/json",
ERROR: Object literal may only specify known properties, and 'responseMimeType' does not exist in type 'GenerationConfig'
line 47: const generatedContent = await model.generateContent([prompt, imageParts]);
ERROR: Type '{ inlineData: { data: string; mimeType: any; }; }' is not assignable to type 'string | Part'.
Can you try updating to the latest version and see if it works?
npm install @google/generative-ai@0.24.0
Fixed!! Thank you so much, I have been wanting to do this for weeks. Can we get structured output for audio/video input as well?
Absolutely, you can get structured output for both audio and video as well.
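As a rough sketch, the request parts for audio or video can be built the same way as the image part in the JavaScript example above, just with a different MIME type. `bytesToGenerativePart` is a hypothetical helper, the bytes here are placeholders rather than real media, and this assumes the file is small enough to send inline; larger files would likely need to be uploaded separately first.

```javascript
// Hypothetical helper: wrap raw bytes as a base64 inline-data part.
function bytesToGenerativePart(bytes, mimeType) {
  return {
    inlineData: {
      data: Buffer.from(bytes).toString("base64"),
      mimeType,
    },
  };
}

// Placeholder bytes; in practice these would come from fs.readFileSync(path).
const audioPart = bytesToGenerativePart(Buffer.from("fake audio bytes"), "audio/mp3");
const videoPart = bytesToGenerativePart(Buffer.from("fake video bytes"), "video/mp4");

// The part goes in the same flat array as the prompt, i.e. this array is what
// would be passed to model.generateContent(...) with the model set up as above.
const request = ["Describe what happens in this clip.", videoPart];
console.log(request[1].inlineData.mimeType);
```

The `responseMimeType` and `responseSchema` settings stay on the model's `generationConfig`, so only the parts change.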