API is providing cropped images as "hints" to Gemini 2.5 Flash?

jrich · June 19, 2025, 5:05pm

It’s possible this is already well known here, but I just discovered that when I use the API to send a PDF file to the model (and I think the same is true for image files), the model receives not just the original PDF image but also 4 cropped versions of that image (along with the raw OCR text). First, this shows up when I call count_tokens on the Part object containing the PDF file: it reports 5x as many tokens as one image should generate. Second, if I follow the file with the prompt "how many total images did i just show you?", the model responds: "You showed me a total of 5 images: one original image and four different crops of that image."
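For anyone who wants to repeat the token check, here is a minimal sketch. It assumes the google-genai Python SDK and a GEMINI_API_KEY environment variable, and it assumes a cost of 258 tokens per image or PDF page, which is the figure I have seen quoted in the Gemini docs. The helper names (count_pdf_tokens, estimate_image_count) and the file name below are mine, not part of any API. Because the OCR text is counted too, the ratio is only a rough estimate.

```python
import os

# Assumption: one image / PDF page costs 258 tokens (figure quoted in the Gemini docs).
TOKENS_PER_IMAGE = 258


def estimate_image_count(total_tokens, tokens_per_image=TOKENS_PER_IMAGE):
    """Rough number of image-sized inputs implied by a token count.

    Any text tokens (e.g. OCR output) inflate this, so treat it as an upper bound.
    """
    return total_tokens / tokens_per_image


def count_pdf_tokens(path, model="gemini-2.5-flash"):
    """Ask the API how many tokens a single PDF file is counted as.

    Hypothetical helper: needs the google-genai package and GEMINI_API_KEY.
    """
    # Imported here so the arithmetic helper above works without the SDK installed.
    from google import genai
    from google.genai import types

    client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])
    with open(path, "rb") as f:
        part = types.Part.from_bytes(data=f.read(), mime_type="application/pdf")
    response = client.models.count_tokens(model=model, contents=[part])
    return response.total_tokens
```

With a one-page PDF (say, a hypothetical one_page.pdf), estimate_image_count(count_pdf_tokens("one_page.pdf")) coming out near 5 rather than near 1 would match what I describe above.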

After some prompting, the model will also reveal the text that accompanied the images when they were fed in:

Here is the original image:

and here are the different crops of this image to help you see better, use these only as hints:

This is all fine, but when I play the same game in AI Studio, it turns out that the model there is fed 9 crops of the image (a 3x3 grid, I would imagine) in addition to the original, which obviously produces a discrepancy between the responses in AI Studio and the ones I get when calling the API directly.
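To make the 5 vs. 10 image counts concrete, here is a small sketch of how a k x k grid of crops would add up. This is purely my guess at the layout: I have no confirmation that the crops are non-overlapping grid cells, or that the API uses 2x2 and AI Studio 3x3. The function names are made up for illustration.

```python
def grid_crop_boxes(width, height, k):
    """Return (left, top, right, bottom) boxes for a k x k grid of
    non-overlapping crops covering a width x height image.

    Hypothetical layout; the real cropping scheme is not documented here.
    """
    boxes = []
    for row in range(k):
        for col in range(k):
            left = col * width // k
            top = row * height // k
            right = (col + 1) * width // k
            bottom = (row + 1) * height // k
            boxes.append((left, top, right, bottom))
    return boxes


def total_images(k):
    """Original image plus k*k crops."""
    return 1 + k * k


if __name__ == "__main__":
    for k in (2, 3):
        crops = grid_crop_boxes(1000, 800, k)
        print(f"{k}x{k} grid: {len(crops)} crops, {total_images(k)} images in total")
```

Under that assumption a 2x2 grid gives 4 crops (5 images, what I see through the API) and a 3x3 grid gives 9 crops (10 images, what I see in AI Studio).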

So the question is: is this configurable? Can I decide the level of crop-granularity that I want, or possibly disable this cropping behavior altogether? Or am I stuck here not knowing what will occur under the hood when I upload a file?

Lucas_Smiles · June 22, 2025, 12:54am

I just discovered this as well! I find it really weird, since it does this for normal images too, not just PDFs.

GUNAND_MAYANGLAMBAM · June 25, 2025, 8:30am

Hi, thank you for bringing this bug to our attention. I have reviewed it myself and noticed the same problem. I am forwarding this to the engineering team for further investigation.
