I am using count_tokens
method of the model to estimate cost based on the pricing model for input and output characters. I do the estimation based on the total_billable_characters
. In order to count separately input and output characters, I filter the messages by role and first count the user characters, than model characters.
I can see system instructions are being added automatically, so I guess in this way I am counting them twice.
I assume chat history is also counting against input characters, but I am not sure if I have to count it once - at the end of the chat, or for each user prompt to count all the previous history and add it cumulatively to the total input characters.
I am also using tools and I add them to the count_tokens
arguments. I guess this should be enough to factor in tools as input.
I am using Gemini 1.5 Flash with Python SDK.