Inconsistent Gemini Output Token Count with Varying Image Sizes (Same Prompt)

I’m seeing an inconsistent input_token_count from the GenAI API with the gemini-2.0 series when I send the exact same text prompt with images of different sizes.

My understanding from the token counting documentation (token calculation details) is that output tokens should be more consistent with a fixed prompt.

Why would varying image size affect the output token count for the same prompt? Is this expected behavior?

Hi @npic,

Different input token counts for large images across different models are expected because of tiling, as mentioned in the doc.
How much of a difference are you seeing in the output tokens? Is it a large difference?

Hi @Govind_Keshari,

Thanks. My issue isn’t with different models, but with the same model (in my case, gemini 2.0 flash) reporting inconsistent input image token counts.

I’ve re-tested using the new GenAI SDK, which reports image and text tokens separately in the last chunk. However, my input image token counts are still not predictable: neither the largest images nor the largest output token counts correlate with the input image token count.

Here is a case that makes no sense to me: the image is 705x44, which fits well within a single 768x768 tile, yet the token counter still charged us 3354 tokens.
image

Could you clarify the rules for image tiling and how it determines input image token count?
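
For reference, here is a minimal sketch of the estimate I would expect, based on the Gemini 2.0 rule in the token counting doc: 258 tokens for an image with both sides at most 384 pixels, otherwise 258 tokens per 768x768 tile. The ceil-based tile grid is my own assumption, since the exact crop-and-scale procedure is the part I'm asking about, and the function names are hypothetical.

```python
import math

TOKENS_PER_TILE = 258   # per-tile cost from the token counting doc
SMALL_IMAGE_MAX = 384   # both sides <= 384 px: flat 258 tokens
TILE_SIDE = 768         # tile side used for larger images


def naive_image_tokens(width: int, height: int) -> int:
    """Estimate image tokens using a simple ceil-based tile grid.

    Assumption: the image is covered by a grid of 768x768 tiles with no
    extra cropping or rescaling. The real procedure may differ.
    """
    if width <= SMALL_IMAGE_MAX and height <= SMALL_IMAGE_MAX:
        return TOKENS_PER_TILE
    tiles = math.ceil(width / TILE_SIDE) * math.ceil(height / TILE_SIDE)
    return tiles * TOKENS_PER_TILE


def implied_tiles(observed_tokens: int) -> float:
    """Number of 258-token tiles that an observed count corresponds to."""
    return observed_tokens / TOKENS_PER_TILE


print(naive_image_tokens(705, 44))  # 258 (one tile under the naive grid)
print(implied_tiles(3354))          # 13.0
```

The reported 3354 tokens is exactly 13 x 258, so the count does look tile-based, just with many more tiles than this naive grid predicts for a 705x44 image.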

Hey @npic, Thanks for confirming. I will double check on this issue.