Chat history: removing entries?

action_chat = action_model.start_chat(history=history)

OK, when our history becomes large, it would be smart to remove the oldest entries. I don't see a way of removing old entries from the history, though.

History is an Iterable (fancy term for a list). One sure-fire way to drop the old part is to start a new list, copy from the old (current) history only as much as you like, then start a new chat session and give it your new list as history.
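For example (a minimal sketch; chat is your current ChatSession, model is the GenerativeModel that created it, and 10 is just an illustrative cutoff):

trimmed = list(chat.history)[-10:]            # copy only the 10 most recent entries
new_chat = model.start_chat(history=trimmed)  # fresh session seeded with the trimmed history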

Hope that helps!

But how do I access that history data?

That at least is documented for Python: see the google.generativeai.ChatSession reference on Google AI for Developers.
If chat is your chat object (of class ChatSession), you can do

for item in chat.history:

and do what needs to be done for each item (throw away the old entries, or use .append() to add the ones you want to keep to a new list).
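Something like this (a sketch; keep_from is a hypothetical cutoff index, and model is the GenerativeModel the chat came from):

keep_from = 4                        # illustrative: discard the first 4 entries
new_history = []
for i, item in enumerate(chat.history):
    if i >= keep_from:
        new_history.append(item)     # keep only the newer entries
new_chat = model.start_chat(history=new_history)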


You can edit the history in-place.

>>>chat.history.pop(0)
parts {
  text: "hey there, buddy"
}
role: "user"

>>>chat.history.pop(0)
parts {
  text: "Hey there, friend!  What\'s up?  Just gnawing on some reeds here, enjoying the afternoon sun.  How\'s your day going?  😁 \n"
}
role: "model"

Here, doing a “pop()” on index 0 of the list removes the oldest item (and returns the value).

That lets you discard items once the history grows past a certain length, either in user/model pairs or until you reach an entry whose role and “text” part mark the final output to the user.
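For instance, a rough sketch that trims in user/model pairs down to a hypothetical max_entries limit:

max_entries = 8                        # illustrative: keep at most 8 history entries
while len(chat.history) > max_entries:
    chat.history.pop(0)                # drop the oldest user turn
    chat.history.pop(0)                # and the model reply that answered it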

You could keep a more permanent attribute on chat where you store the unaltered conversation, treating the existing history as a “live” list of what is actually going to be sent. Managing the two in parallel becomes difficult, though, once you get to a GUI that allows editing, temporarily disabling turns, or removing whatever you want.


The list items themselves are not very extensible. They come from the protocol buffers module and allow only particular attributes.

>>>chat.history[0].__class__
<class 'google.ai.generativelanguage_v1beta.types.content.Content'>

Documentation:

from google.generativeai import protos
import inspect
print(inspect.getsource(protos.Content))
class Content(proto.Message):
    r"""The base structured datatype containing multi-part content of a
    message.

    A ``Content`` includes a ``role`` field designating the producer of
    the ``Content`` and a ``parts`` field containing multi-part data
    that contains the content of the message turn.

    Attributes:
        parts (MutableSequence[google.ai.generativelanguage_v1beta.types.Part]):
            Ordered ``Parts`` that constitute a single message. Parts
            may have different MIME types.
        role (str):
            Optional. The producer of the content. Must
            be either 'user' or 'model'.
            Useful to set for multi-turn conversations,
            otherwise can be left blank or unset.
    """

    parts: MutableSequence["Part"] = proto.RepeatedField(
        proto.MESSAGE,
        number=1,
        message="Part",
    )
    role: str = proto.Field(
        proto.STRING,
        number=2,
    )
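
You can still construct one of these entries yourself and append it to the history, for example (a sketch; the injected text is made up):

from google.generativeai import protos

entry = protos.Content(
    role="user",
    parts=[protos.Part(text="(earlier turns, condensed)")],
)
chat.history.append(entry)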

I dug into subclassing genai.GenerativeModel and genai.ChatSession to build a foundation for the next step, which is to control sending not by turns, but by a token budget.
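Roughly, the subclassing looks like this (a minimal sketch, not the full implementation; it assumes the response object exposes usage_metadata with prompt_token_count and candidates_token_count):

import google.generativeai as genai

class TokenChatSession(genai.ChatSession):
    """ChatSession that also records token usage per send (illustrative)."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.history_tokens = []

    def send_message(self, *args, **kwargs):
        response = super().send_message(*args, **kwargs)
        usage = response.usage_metadata
        # two entries per send: the prompt side (the whole history sent) and the reply side
        self.history_tokens += [usage.prompt_token_count, usage.candidates_token_count]
        return response

class TokenModel(genai.GenerativeModel):
    def start_chat(self, *, history=None, **kwargs):
        return TokenChatSession(model=self, history=history, **kwargs)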

A session that records the token usage returned by the API now looks like this:

>>>chat = model.start_chat()
>>>resp = chat.send_message("hey there, buddy")
>>>chat.history_tokens
[13, 35]
>>>chat.send_message("Where do you sleep?")
>>>chat.history_tokens
[13, 35, 57, 64]
>>>resp = chat.send_message("How many teeth do you have?")
>>>resp.text
"You know, I've never actually counted! But I've got a whole bunch of little sharp ones, just like most rodents.  They're great for nibbling on reeds, bark, and even the occasional juicy water lily!  I bet you have some pretty impressive teeth too, huh?  😉 \n"
>>>chat.history_tokens
[13, 35, 57, 64, 132, 64]
>>>model.count_tokens("How many teeth do you have?")
total_tokens: 15
>>>model.count_tokens(chat.history)
total_tokens: 199

However, I can quickly see that building on this built-in chat list would mean rewriting much of the SDK: calculating the actual tokens per message by differences, accounting for a varying system message state, and automatically discarding entries above a passed token limit, all with no token-counting code provided. That is going to be a lot of work, and the result would be inflexible, non-portable, and easy to break.

Nice that “chat” is there for experimentation, but it is not something to build on.

(the above responses are a chat with Elon the Muskrat…)
