Flash 2.5 PDF Analysis - AI Studio vs API

I am wondering if anybody else is getting drastically different results for PDF analysis when using the API vs AI Studio?

Through the API I am uploading the PDF as base64, and the upload itself works fine, but the analysis results are poor: the model just makes up a bunch of information.

Meanwhile, the exact same prompt, system instructions, and document give perfect results in AI Studio.

This appears to be happening with every PDF I upload, regardless of the model. I recently switched from the “@google/generative-ai” package to the new “@google/genai” package (npm i @google/genai), and it seems like inline PDFs are not working at all: the model is completely hallucinating.

I add the PDFs to the chat history before sending the message via: const result = await chat.sendMessage({message});

// With the 'base64' encoding option, readFileSync returns a base64 string directly.
pdfBase64 = fs.readFileSync(pdf, { encoding: 'base64' });
messageContent.push({
    role: 'user',
    parts: [
        {
            inlineData: {
                mimeType: "application/pdf",
                data: pdfBase64,
            },
        },
        { text: "File path and name: " + pdf },
    ],
});

I would think the hallucinations mean there is something wrong with the input that the model is receiving. I haven’t done anything with PDFs, but here is the documentation example for a local PDF file:

import { GoogleGenAI } from "@google/genai";
import * as fs from 'fs';

const ai = new GoogleGenAI({ apiKey: "GEMINI_API_KEY" });

async function main() {
    const contents = [
        { text: "Summarize this document" },
        {
            inlineData: {
                mimeType: 'application/pdf',
                data: Buffer.from(fs.readFileSync("content/343019_3_art_0_py4t4l_convrt.pdf")).toString("base64")
            }
        }
    ];

    const response = await ai.models.generateContent({
        model: "gemini-1.5-flash",
        contents: contents
    });
    console.log(response.text);
}

main();

I’m not quite positive, but something feels off with the shape of your InlineDataPart.
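
One way to check whether the input is the problem is to inspect the part before sending it. The sketch below assumes Node.js; checkInlinePdfPart is a hypothetical helper, not part of @google/genai. It checks that the part has an inlineData field, the expected MIME type, and base64 data that decodes to bytes starting with %PDF-. It should also flag data that was base64-encoded twice, since that decodes to base64 text rather than PDF bytes.

```javascript
// Sketch only: sanity-check an inline PDF part before it is sent.
// `checkInlinePdfPart` is a hypothetical helper, not part of @google/genai.
function checkInlinePdfPart(part) {
  const problems = [];
  const inline = part && part.inlineData;
  if (!inline) {
    return ["part has no inlineData field"];
  }
  if (inline.mimeType !== "application/pdf") {
    problems.push(`unexpected mimeType: ${inline.mimeType}`);
  }
  if (typeof inline.data !== "string" || inline.data.length === 0) {
    problems.push("data is not a non-empty base64 string");
    return problems;
  }
  // A well-formed PDF file starts with the ASCII bytes "%PDF-".
  const header = Buffer.from(inline.data, "base64")
    .subarray(0, 5)
    .toString("latin1");
  if (header !== "%PDF-") {
    problems.push("decoded data does not start with %PDF-");
  }
  return problems;
}
```

An empty array means none of these checks failed. It does not prove the model received the part, only that the part itself looks plausible.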

Thanks, I actually figured it out… with the help of o4-mini hahah.

  • Yes, the PDF was not being read at all, and the model was hallucinating from the prompt and system instructions alone.

With the old package (@google/generative-ai), I was able to overwrite the history when creating a chat completion: chat = ai.chats.create({...modelOptions});

That was maybe not the best way to do it, but it worked fine. With the new @google/genai package, I updated the code so that the PDF is included in the new message being sent, and everything works fine.

// Parts of the single user message that will be sent.
const parts = [];

if (Array.isArray(pdfs) && pdfs.length) {
    for (const pdfPath of pdfs) {
        // Skip anything that is not a PDF.
        if (!pdfPath.toLowerCase().endsWith(".pdf")) continue;
        // Read the file directly as a base64 string.
        const data = fs.readFileSync(pdfPath, "base64");
        parts.push({
            inlineData: { mimeType: "application/pdf", data },
        });
        parts.push({
            text: `File path and name: ${pdfPath}`,
        });
    }
    parts.push({
        text: "If not given any files, do not answer the question and report back that the file is not given or is incorrect. Ask if there are any other relevant files, as you can only access PDFs currently.",
    });
}

if (message) {
    parts.push({ text: message });
}

console.log("Sending message to Gemini model..." + modelParams.modelName);
console.log("thinking Status: ", modelParams.thinking);

// Send the user message and get response
const result = await chat.sendMessage({message: parts});
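
For anyone who wants the same pattern as a reusable function, here is a minimal sketch. buildMessageParts and the { path, data } attachment shape are assumptions for illustration, not part of @google/genai. It throws when no PDF is attached instead of relying on a prompt instruction, which is a different design choice from the snippet above.

```javascript
// Sketch only: assemble every part of one user turn for chat.sendMessage.
// `buildMessageParts` and the { path, data } attachment shape are hypothetical;
// `data` is expected to be the file content already encoded as base64.
function buildMessageParts(attachments, message) {
  const pdfs = (attachments || []).filter(
    (a) => a.path.toLowerCase().endsWith(".pdf") && a.data
  );
  if (pdfs.length === 0) {
    // Fail before calling the model rather than letting it answer without a file.
    throw new Error("No PDF attachment found; refusing to send the message.");
  }
  const parts = [];
  for (const { path, data } of pdfs) {
    parts.push({ inlineData: { mimeType: "application/pdf", data } });
    parts.push({ text: `File path and name: ${path}` });
  }
  if (message) {
    parts.push({ text: message });
  }
  return parts;
}

// Possible usage, assuming `fs`, `chat`, `pdfPath`, and `message` exist as in the thread:
//   const data = fs.readFileSync(pdfPath, "base64");
//   const parts = buildMessageParts([{ path: pdfPath, data }], message);
//   const result = await chat.sendMessage({ message: parts });
```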