How to estimate cost of a chat with system instructions, tools and history?

I am using the model's count_tokens method to estimate cost under the pricing model for input and output characters. I base the estimate on total_billable_characters. To count input and output characters separately, I filter the messages by role and count the user characters first, then the model characters.

I can see system instructions are being added automatically, so I guess in this way I am counting them twice.

I assume chat history also counts against input characters, but I am not sure whether to count it once at the end of the chat, or to count all the previous history again for each user prompt and add it cumulatively to the total input characters.

I am also using tools and I add them to the count_tokens arguments. I guess this should be enough to factor in tools as input.

I am using Gemini 1.5 Flash with the Python SDK.

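AFAIK, the history is counted for every request: each new prompt resends all the previous messages, so they count as input again each time.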
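If that is right, the input total grows cumulatively with each turn. Here is a rough sketch of the arithmetic, not an official billing formula. The names `estimate_chat_chars`, `count_chars` and `fixed_chars_per_request` are made up for illustration. `count_chars` is a stand-in for whatever you read from `total_billable_characters`, and the default here simply counts non-whitespace characters. It also assumes system instructions and tools are sent (and billed) once with every request, which I have not confirmed.

```python
from typing import Callable, List, Tuple


def non_whitespace_chars(text: str) -> int:
    # Stand-in counter (assumption): count every character that is not whitespace.
    return sum(1 for ch in text if not ch.isspace())


def estimate_chat_chars(
    turns: List[Tuple[str, str]],
    count_chars: Callable[[str], int] = non_whitespace_chars,
    fixed_chars_per_request: int = 0,
) -> Tuple[int, int]:
    """Return (total_input_chars, total_output_chars) for a whole chat.

    turns: list of (user_text, model_text) pairs in order.
    fixed_chars_per_request: system instructions + tools, assumed to be
    resent with every request (hypothetical parameter).
    """
    total_input = 0
    total_output = 0
    history_chars = 0  # characters of all earlier user and model messages

    for user_text, model_text in turns:
        user_chars = count_chars(user_text)
        model_chars = count_chars(model_text)

        # Each request carries the fixed part, the full history so far,
        # and the new user prompt.
        total_input += fixed_chars_per_request + history_chars + user_chars
        total_output += model_chars

        # The prompt and the reply both become history for the next request.
        history_chars += user_chars + model_chars

    return total_input, total_output


def estimate_cost(input_chars: int, output_chars: int,
                  input_rate_per_1k: float, output_rate_per_1k: float) -> float:
    # Rates are supplied by the caller; take them from the current pricing page.
    return (input_chars / 1000) * input_rate_per_1k + (output_chars / 1000) * output_rate_per_1k
```

So for each turn you would count system instructions and tools once per request, not once per message, and add the whole history so far to that request's input.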
