I am trying to understand the token count when using images with different resolutions.
I am aware of the documented behavior, and specifically:
- With Gemini 2.0, image inputs with both dimensions <=384 pixels are counted as 258 tokens. Images larger in one or both dimensions are cropped and scaled as needed into tiles of 768x768 pixels, each counted as 258 tokens. Prior to Gemini 2.0, images used a fixed 258 tokens.
But my results do not match the documented behavior, or maybe I’m missing something.
My sample code is below.
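(This is a trimmed sketch of what I'm running, using the google-genai SDK against Vertex AI; the project, location, model, and file names are placeholders.)

```python
from google import genai
from google.genai import types

# Placeholders: swap in your own project, location, model, and file name.
client = genai.Client(vertexai=True, project="my-project", location="us-central1")

with open("image_1920x1080.png", "rb") as f:
    image_bytes = f.read()

contents = [
    types.Part.from_bytes(data=image_bytes, mime_type="image/png"),
    "describe",
]

# Count tokens up front...
count = client.models.count_tokens(model="gemini-2.0-flash", contents=contents)
print("count_tokens:", count.total_tokens)

# ...and compare with the usage metadata on an actual response.
response = client.models.generate_content(model="gemini-2.0-flash", contents=contents)
print("prompt_token_count:", response.usage_metadata.prompt_token_count)
```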
Whether I count tokens with client.models.count_tokens, read the usage metadata in the response, or try the same prompts in the Vertex AI playground, I get these numbers:
- 1 image 1920x1080 with a short text prompt (“describe”) results in over 1800 input tokens
- 1 image 1280x720 with a short text prompt (“describe”) results in over 1800 input tokens
- 1 image 640x360 with a short text prompt (“describe”) results in over 1800 input tokens
- 1 image 320x180 with a short text prompt (“describe”) results in 270 input tokens
- 4 images (1920x1080 each) with a short text prompt results in a little over 1000 input tokens
The 270 tokens for the 320x180 image match the documented behavior for an image under 384 pixels on each side (presumably 258 for the image plus ~12 for the text prompt), but the rest do not show a lower token count as the resolution is reduced.
For example, I would expect 640x360 to fit in a single 768x768 tile and therefore count as 258 tokens, but that is not the case.
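To make my expectation concrete, here is the rule as I read it; note the tile-count formula (a ceiling on each dimension) is my own interpretation of "cropped and scaled as needed into tiles of 768x768 pixels", not something the docs spell out:

```python
import math

def expected_image_tokens(width: int, height: int) -> int:
    # Documented rule as I read it; the tiling formula is my assumption.
    if width <= 384 and height <= 384:
        return 258  # small images: flat 258 tokens
    # Larger images: cropped/scaled into 768x768 tiles, 258 tokens each.
    tiles = math.ceil(width / 768) * math.ceil(height / 768)
    return tiles * 258

for w, h in [(1920, 1080), (1280, 720), (640, 360), (320, 180)]:
    print(f"{w}x{h}: expecting {expected_image_tokens(w, h)} image tokens")
```

Under this reading I would expect 1548 tokens for 1920x1080 (6 tiles), 516 for 1280x720 (2 tiles), and 258 for the two smaller images, yet the three larger images all measure over 1800.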
Also, how come sending 4 images results in around 1000 tokens? It seems that in this case each image is downscaled internally and counted as a flat 258 tokens (4 × 258 = 1032, which would match "a little over 1000") rather than being tiled at its original resolution.
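For completeness, the 4-image request is just the same pattern as the snippet above with four image parts (again with placeholder file names):

```python
# Four 1920x1080 frames plus the same short prompt.
contents = [
    types.Part.from_bytes(data=open(f"frame_{i}.png", "rb").read(), mime_type="image/png")
    for i in range(4)
] + ["describe"]

count = client.models.count_tokens(model="gemini-2.0-flash", contents=contents)
print("count_tokens:", count.total_tokens)
```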
I would love some help understanding what's going on here.
Thank you!