Understanding Token Counts

Hi,
I have a simple case where I want to understand how many tokens are involved and, eventually, how much a session is going to cost. Here is the scenario:
I am trying to count tokens in a chat session. I start by uploading one PDF file with one page. That is supposed to be 258 tokens according to the documentation. Then I initiate a chat with that doc, using:

chat_session = model.start_chat(
    history=[
        {
            "role": "user",
            "parts": [
                doc,
                user_prompt,
            ],
        }
    ]
)

doc is the document I mentioned; user_prompt is my prompt, at 185 tokens. Then I send a message with chat_session.send_message(user_question). The user_question is another 19 tokens. I used the model.count_tokens() function to get each of these counts, so the total should be 462 tokens. So why, when I check response.usage_metadata.prompt_token_count for the input token count, do I get a completely different number: 1310? Where did this count come from?
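For reference, here is the arithmetic I am expecting, using the per-part counts that model.count_tokens() reported (the variable names below are just labels for the numbers in this post, not SDK calls):

```python
# Per-part token counts reported by model.count_tokens() for each piece of the request
pdf_page_tokens = 258      # one PDF page, per the documented 258-tokens-per-page rate
user_prompt_tokens = 185   # the prompt passed in the initial history
user_question_tokens = 19  # the message sent with chat_session.send_message()

# Sum of the individually counted parts
expected_prompt_tokens = pdf_page_tokens + user_prompt_tokens + user_question_tokens
print(expected_prompt_tokens)  # 462

# What response.usage_metadata.prompt_token_count actually reported
observed_prompt_tokens = 1310
print(observed_prompt_tokens - expected_prompt_tokens)  # 848 tokens unaccounted for
```

So there is a gap of 848 tokens between what the individual counts add up to and what the API billed as input.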
Thanks


Hi @newbie_n00b , Welcome to the forum.

I tried to reproduce it on my end. The input token count seems to add up correctly. Here is the Colab gist file.

Hope this helps.

@GUNAND_MAYANGLAMBAM Thanks for trying !
I am uploading a PDF file (even if it's just one page) and using Gemini 2.0 Flash.
See the attached screenshot here (without a PDF, just plain simple text).