I use Gemini Pro with the one-million-token context on AI Studio. I recently came across an interesting bug that I think is important enough to report to the Google AI team.
So I maxed out the tokens in one conversation, and because I needed it to continue, I had to find a way to restore the model's memory. I downloaded the JSON save file of the conversation from Google Drive and used some Python code to extract the conversation from it, resulting in a 700,000-token transcript.
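For reference, the extraction step looked roughly like this. This is only a sketch: the `messages`, `role`, and `text` keys are assumptions about the save file's layout, since the actual AI Studio JSON schema may differ.

```python
import json

def extract_conversation(path: str) -> str:
    """Flatten an exported conversation JSON into a [role] text transcript.

    Assumes a layout like {"messages": [{"role": "user"|"model",
    "text": "..."}]}; adjust the keys to match your actual save file.
    """
    with open(path, encoding="utf-8") as f:
        data = json.load(f)
    lines = []
    for msg in data["messages"]:
        lines.append(f"[{msg['role']}] {msg['text']}")
    return "\n".join(lines)
```

The output is one `[user]`/`[model]`-prefixed line per message, which is the transcript shape described below.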
I gave this transcript to another instance of Gemini Pro, and it restored the model's memory, but it also created the bug I mentioned in the title. My conversation file is very long: hundreds of messages with the shape
[user] my question
[model] answer
and the model started to regularly hallucinate a conversation with me. I'll send one or two questions as usual, and then it starts generating questions from a hallucinated [user] and answering them itself.
The reason I'm writing this feedback is that the loop is endless: it never stops, and the model keeps generating conversation until the max output token limit is reached.
This rare bug could be a serious problem for your services, as it could bill hallucinated conversations to users (I'm on a free plan) and, since I don't know whether there is a token limit on the API, it could potentially create a "broken water faucet" situation for some users.