action_chat = action_model.start_chat(history=)
OK, when our history becomes large it would be smart to remove the oldest entries, but I don't see a way of removing old entries from the history.
History is an iterable (in practice, a list). One sure-fire way to drop the old part is to start a new list, copy over only as much of the current history as you like, then start a new chat session and pass it your new list as history.
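For example (a minimal sketch; the model name and the number of entries to keep are assumptions):

import google.generativeai as genai

model = genai.GenerativeModel("gemini-1.5-flash")  # assumed model name
chat = model.start_chat()
# ... conversation happens, chat.history grows ...

# Copy only the most recent entries (here: the last 10) into a new list,
# then start a fresh session seeded with the trimmed history.
recent = list(chat.history[-10:])
chat = model.start_chat(history=recent)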
Hope that helps!
But how do I access that history data?
That at least is documented for Python: google.generativeai.ChatSession | Google AI for Developers
If chat is your chat object (of class ChatSession), you can do
for item in chat.history:
and do what needs to be done for each item: throw away the old, or use .append() to include it in a new list if you want to keep it.
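Something along those lines (a sketch; keep_item is a hypothetical predicate you would write yourself):

new_history = []
for item in chat.history:
    # item is a Content proto with .role and .parts
    if keep_item(item):  # hypothetical: your own keep/discard criterion
        new_history.append(item)

chat = model.start_chat(history=new_history)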
You can edit the history in-place.
>>> chat.history.pop(0)
parts {
  text: "hey there, buddy"
}
role: "user"
>>> chat.history.pop(0)
parts {
  text: "Hey there, friend! What\'s up? Just gnawing on some reeds here, enjoying the afternoon sun. How\'s your day going? 😁 \n"
}
role: "model"
Here, doing a “pop()” on index 0 of the list removes the oldest item (and returns its value).
That lets you discard items once the history exceeds a certain length, either in pairs, or until reaching a turn whose role and “text” part mark a final output to the user.
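A sketch of the in-pairs variant, assuming the history strictly alternates user/model turns and using a made-up cap:

MAX_ENTRIES = 20  # made-up cap on history length

while len(chat.history) > MAX_ENTRIES:
    chat.history.pop(0)  # drop the oldest user message
    chat.history.pop(0)  # and the model reply paired with it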
You could keep a more permanent attribute on chat where you store the unaltered conversation, treating the existing history as the “live” version that actually gets sent. But managing the two in parallel becomes difficult once you get to a GUI that allows editing, temporarily disabling turns, or removing entries wherever you want…
The list items are not very extensible. They come from the protocol buffers module and allow only particular attributes.
>>> chat.history[0].__class__
<class 'google.ai.generativelanguage_v1beta.types.content.Content'>
Documentation:
from google.generativeai import protos
import inspect
print(inspect.getsource(protos.Content))
class Content(proto.Message):
    r"""The base structured datatype containing multi-part content of a
    message.

    A ``Content`` includes a ``role`` field designating the producer of
    the ``Content`` and a ``parts`` field containing multi-part data
    that contains the content of the message turn.

    Attributes:
        parts (MutableSequence[google.ai.generativelanguage_v1beta.types.Part]):
            Ordered ``Parts`` that constitute a single message. Parts
            may have different MIME types.
        role (str):
            Optional. The producer of the content. Must
            be either 'user' or 'model'.

            Useful to set for multi-turn conversations,
            otherwise can be left blank or unset.
    """

    parts: MutableSequence["Part"] = proto.RepeatedField(
        proto.MESSAGE,
        number=1,
        message="Part",
    )
    role: str = proto.Field(
        proto.STRING,
        number=2,
    )
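Since these are ordinary proto messages, you can also construct them yourself when rebuilding a history (a sketch):

from google.generativeai import protos

turn = protos.Content(
    role="user",
    parts=[protos.Part(text="hey there, buddy")],
)
new_history = [turn]  # usable as start_chat(history=new_history)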
I dug into subclassing genai.GenerativeModel and genai.ChatSession to build a foundation for the next step, which is to control sending not by turns, but by a token budget.
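Roughly like this (my own sketch, not anything the SDK provides: the class names and the history_tokens attribute are made up; the counts come from response.usage_metadata):

import google.generativeai as genai

class TokenCountingChatSession(genai.ChatSession):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.history_tokens = []  # made-up attribute: usage recorded per call

    def send_message(self, *args, **kwargs):
        response = super().send_message(*args, **kwargs)
        usage = response.usage_metadata
        # per call: tokens of the full input prompt, then tokens of the reply
        self.history_tokens.append(usage.prompt_token_count)
        self.history_tokens.append(usage.candidates_token_count)
        return response

class TokenCountingModel(genai.GenerativeModel):
    def start_chat(self, **kwargs):
        return TokenCountingChatSession(model=self, **kwargs)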
A session that records the token usage returned with each response now looks like this:
>>> chat = model.start_chat()
>>> resp = chat.send_message("hey there, buddy")
>>> chat.history_tokens
[13, 35]
>>> chat.send_message("Where do you sleep?")
>>> chat.history_tokens
[13, 35, 57, 64]
>>> resp = chat.send_message("How many teeth do you have?")
>>> resp.text
"You know, I've never actually counted! But I've got a whole bunch of little sharp ones, just like most rodents. They're great for nibbling on reeds, bark, and even the occasional juicy water lily! I bet you have some pretty impressive teeth too, huh? 😉 \n"
>>> chat.history_tokens
[13, 35, 57, 64, 132, 64]
>>> model.count_tokens("How many teeth do you have?")
total_tokens: 15
>>> model.count_tokens(chat.history)
total_tokens: 199
However, I can quickly see that building on this built-in chat list would be tons of work: rewriting a lot of the SDK to calculate the actual tokens per message by differences, accounting for varying system-message state, and automatically discarding above a passed token limit, all with no token-counting code provided. The result would be inflexible, non-portable, and easy to break.
Nice that “chat” is there for experimentation, but it is not something to build on.
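If all you need is to keep the history under a budget, a blunt workaround is to recount the whole list with count_tokens and pop pairs until it fits (a sketch; the budget value is made up):

TOKEN_BUDGET = 500  # made-up limit

while (model.count_tokens(chat.history).total_tokens > TOKEN_BUDGET
       and len(chat.history) >= 2):
    chat.history.pop(0)  # oldest user message
    chat.history.pop(0)  # its model reply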
(the above responses are a chat with Elon the Muskrat…)