Inconsistent Image Tokenization Behavior in Gemini 2.0 When Using Sample Images in Prompt

Hi,

I’m working with the Gemini 2.0 Flash-Lite model and have run into unexpected behavior in how image input size affects the token count.

Here’s what I’m seeing:

  • When I pass a single image along with a prompt, the image (e.g., one larger than 384×384) is counted as ~1400 tokens, which makes sense.
  • However, when the prompt also includes sample/reference images (e.g., for in-context examples), all images, including my main input image, are tokenized as 258 tokens each, regardless of their dimensions.
  • I tried resizing the images manually to optimize token usage, but in the second case (with sample images), manual resizing seems to have no effect on token count.
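
For anyone who wants to reproduce this, here is a minimal sketch of how the two cases can be compared. It assumes the `google-genai` Python package and an API key in the environment; the helper names (`average_image_cost`, `make_gemini_counter`) are my own and not part of the SDK. The comparison logic takes any counting function, so it can be checked without calling the API.

```python
from typing import Callable, Sequence


def average_image_cost(
    count_tokens: Callable[[Sequence[object]], int],
    prompt: str,
    images: Sequence[object],
) -> float:
    """Average token cost per image for one request.

    Computed as (tokens with images - tokens for the prompt alone) / number of images.
    """
    images = list(images)
    if not images:
        raise ValueError("need at least one image")
    text_only = count_tokens([prompt])
    with_images = count_tokens([prompt, *images])
    return (with_images - text_only) / len(images)


def make_gemini_counter(model: str = "gemini-2.0-flash-lite"):
    """Hypothetical helper: wraps the SDK's count_tokens call for one model."""
    # Imported lazily so the comparison logic above works without the SDK installed.
    from google import genai

    client = genai.Client()  # assumption: API key is read from the environment

    def count(contents: Sequence[object]) -> int:
        response = client.models.count_tokens(model=model, contents=list(contents))
        return response.total_tokens

    return count


# Example usage (requires network access and image objects the SDK accepts,
# such as PIL images):
#   counter = make_gemini_counter()
#   print(average_image_cost(counter, "Describe this.", [main_image]))
#   print(average_image_cost(counter, "Describe this.", [sample_a, sample_b, main_image]))
```

In my case the first call gives roughly 1400 and the second gives 258, no matter how the images are resized.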

I couldn’t find anything in the Gemini docs that describes this behavior. Is this expected?

Any guidance or clarification would be appreciated.

Thanks!

Hello,

Thank you for bringing this to our attention. We’ve investigated the issue you described and were able to reproduce it on our end. We are now escalating this matter to our specialists to gather more information and determine the best course of action. We appreciate your patience and will provide an update as soon as we have more clarity.
