I would like to understand how to submit PDF and TXT files to Gemini models using the OpenAI SDK. The OpenAI documentation shows how to do this for OpenAI models, but I’m unsure how to implement the same thing for Gemini models:

    import fs from "fs";
    import OpenAI from "openai";

    const client = new OpenAI();

    const file = await client.files.create({
      file: fs.createReadStream("draconomicon.pdf"),
      purpose: "user_data",
    });

    const response = await client.responses.create({
      model: "gpt-4.1",
      input: [
        {
          role: "user",
          content: [
            {
              type: "input_file",
              file_id: file.id,
            },
            {
              type: "input_text",
              text: "What is the first dragon in the book?",
            },
          ],
        },
      ],
    });

    console.log(response.output_text);

The preceding example is taken from the OpenAI documentation. It uploads a file and passes it directly to the model as input, which is very convenient. The Gemini documentation, by contrast, seems more complex to me, and it is unclear whether Gemini supports all of these options through the OpenAI SDK.