Mismatched image token count: 11x difference?

Hi, I am using Gemini 2.0 Flash with image input. While count_tokens returns 262, the actual number of tokens used according to the response is 2838! The size of the image is (1250, 356).
Can anyone explain this to me?

Hi @Hoang_Dinh_Thanh,

Welcome to the forum. Yes, it's a genuine bug, and the team is working on a fix.

Thank you

Is there any update on this problem?

Hi @Hoang_Dinh_Thanh,

Sorry for the delay, but the fix is still a work in progress.

Thank you

It’s been 2 months, is there any update?

Hi @Hoang_Dinh_Thanh,

Sorry for the late response!
I checked on my end, and the issue is fixed now. Could you please check on your end and let us know your feedback?

Thank you!

I can confirm this is not fixed.
A 384x384 image is 256 tokens after processing, as expected.
A 768x768 image is ~1200 tokens.

✅ Token Counting Results:

  Input Token Count (Pre-request):

  - Total Tokens: 267 (from countTokens API call)
  - Text: 9 tokens ("Describe what you see in this image.")
  - Image: 258 tokens (768x768 JPEG)

  Response Usage Metadata (Post-request):

  - Prompt Tokens: 1,298
  - Response Tokens: 40
  - Total Tokens: 1,338

This is with @google/genai 1.7.0.

Can we get confirmation that a 768x768 image should be processed as 258 tokens, as per the docs? Or is this an estimation issue, with the actual logic behind the token count undocumented?
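
For anyone comparing numbers, here is a rough arithmetic check of the two reports in this thread. It is only a sketch: it assumes the 258-token per-image figure that countTokens returned above, assumes the 2838 in the first post is the prompt-side count, and guesses that larger images might be billed as several 258-token units. The helper name impliedImageUnits is made up for illustration and is not part of any SDK.

```typescript
// Assumption: one image "unit" costs 258 tokens, the per-image figure that
// countTokens reported in this thread. Whether the backend really bills
// larger images as several such units is a guess, not confirmed here.
const TOKENS_PER_IMAGE_UNIT = 258;

// Hypothetical helper: how many 258-token units would account for the
// prompt tokens reported in the response, after removing the text tokens.
function impliedImageUnits(promptTokens: number, textTokens: number): number {
  return (promptTokens - textTokens) / TOKENS_PER_IMAGE_UNIT;
}

// First report (1250x356 image): countTokens returned 262, so the text part
// is assumed to be 262 - 258 = 4 tokens; the response reported 2838.
const firstReport = impliedImageUnits(2838, 262 - TOKENS_PER_IMAGE_UNIT);

// Second report (768x768 image): 9 text tokens, 1,298 prompt tokens
// in the response usage metadata.
const secondReport = impliedImageUnits(1298, 9);

console.log(firstReport.toFixed(2), secondReport.toFixed(2));
```

Both results land very close to whole numbers (about 11 units for the 1250x356 image and about 5 for the 768x768 image), which would fit an explanation where countTokens counts a single unit while the request itself is billed for several. That is an inference from two data points, not a confirmed description of the billing logic.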