Unexpected Token Count Accumulation in Google Generative AI Flutter Package

I am using the Google Generative AI package in my Flutter app to interact with Gemini models. I send messages using _chat.sendMessage(Content.text()). However, I have noticed an issue with token usage metadata, specifically with the prompt token count.
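For context, here is a minimal version of my setup (the model name and API key handling are simplified for illustration):

```dart
import 'package:google_generative_ai/google_generative_ai.dart';

const apiKey = String.fromEnvironment('GEMINI_API_KEY');

final model = GenerativeModel(
  model: 'gemini-1.5-flash', // illustrative model name
  apiKey: apiKey,
);
final _chat = model.startChat();

Future<void> send(String text) async {
  final response = await _chat.sendMessage(Content.text(text));
  // usageMetadata carries the counts where I see the accumulation
  print('prompt tokens: ${response.usageMetadata?.promptTokenCount}');
  print('candidate tokens: ${response.usageMetadata?.candidatesTokenCount}');
}
```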

Issue Explanation:

When I send a message, the prompt token count should reflect only the tokens in my latest input. However, instead of counting just the new prompt tokens, it keeps accumulating previous prompt and candidate token counts.

Here’s an example of what happens:

  1. First message:
  • User: "Hi" (prompt token count: 1)
  • Model response: "Hello, how are you?" (candidate token count: 5)
  2. Second message:
  • User: "I am fine"
  • Expected prompt token count: 3 (since "I am fine" has 3 tokens)
  • Actual prompt token count: 9 (previous prompt: 1 + previous candidate: 5 + current prompt: 3)

This means the prompt token count is increasing incorrectly, as it includes tokens from previous messages instead of just the latest prompt.
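One way to see this directly is to compare the count for the new message alone against the message plus the accumulated session history (a sketch using the package's countTokens method, with the setup above):

```dart
// Token count for the new message alone vs. with the session history.
final newOnly = await model.countTokens([Content.text('I am fine')]);
final withHistory = await model.countTokens([
  ..._chat.history,
  Content.text('I am fine'),
]);
print('new message only: ${newOnly.totalTokens}'); // ~3
print('with history:     ${withHistory.totalTokens}'); // ~9
```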

Why is this a problem?

  • Inefficient token usage: The prompt token count should reflect only the current input, not past messages.
  • Increased costs: Since prompt tokens contribute to billing, this leads to unnecessary extra charges.
  • Unintended context behavior: It seems like the entire conversation history is being resent automatically, which I do not want.

Questions I Need Answers To:

  1. Why is the _chat.sendMessage(Content.text()) method accumulating previous tokens?
  2. Is this expected behavior or a bug in the Google Generative AI package?
  3. How can I ensure that only the current prompt’s tokens are counted?
  4. Is there a way to use context efficiently without resending previous messages?
  5. Will I be charged for the full accumulated prompt token count (9 in the example above), or only for the new prompt tokens (which should be 3)?

Would appreciate any insights or workarounds!

Thanks in advance!

Hi @Polo_Denver. A chat session uses the previous inputs to answer follow-up questions, so the full conversation history is sent along with each new message.

Generally, this is the expected behavior of a chat session.

You can use generateContent instead of a chat session, or you can reset the chat history after every message, for example by starting a fresh ChatSession.
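For example (a minimal sketch, assuming the google_generative_ai Dart package; the model name is illustrative):

```dart
import 'package:google_generative_ai/google_generative_ai.dart';

Future<void> main() async {
  final model = GenerativeModel(
    model: 'gemini-1.5-flash', // illustrative model name
    apiKey: const String.fromEnvironment('GEMINI_API_KEY'),
  );

  // Option 1: stateless call. Only this prompt is sent, so
  // promptTokenCount reflects just the current input.
  final response = await model.generateContent([Content.text('I am fine')]);
  print('prompt tokens: ${response.usageMetadata?.promptTokenCount}');

  // Option 2: replace the chat session after every message,
  // which effectively clears the accumulated history.
  var chat = model.startChat();
  await chat.sendMessage(Content.text('Hi'));
  chat = model.startChat(); // fresh session, empty history
}
```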

One way to keep some context without resending everything is to summarize the previous discussion and pass the summary along with the current prompt.
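A rough sketch of that idea (the helper and the summarization prompt wording are just illustrative):

```dart
// Hypothetical helper: maintain a rolling summary instead of full history.
Future<String> summarize(GenerativeModel model, String conversation) async {
  final response = await model.generateContent([
    Content.text('Summarize this conversation in two sentences:\n$conversation'),
  ]);
  return response.text ?? '';
}

// Then send the summary plus the new message as one stateless prompt:
// await model.generateContent(
//   [Content.text('Context: $summary\n\nUser: $newMessage')]);
```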

I think the billing is based on the total number of input tokens passed (including any history sent with the request) and the output tokens generated, so in your example the second turn would be billed for all 9 input tokens.

Thank you.