Google AI Studio becomes sluggish when the chat exceeds approximately 30,000 tokens of combined input (plain text only, no files), thinking, and output.
It freezes when submitting or editing a prompt, when it starts thinking/generating a response, and when saving.
Autosave sometimes fails and gets stuck in an infinite loop, so I lose the current prompt afterwards.
I’m also noticing significant performance slowdowns in Google AI Studio with extended chats, especially on less powerful devices. Like you, I’ve experienced freezing, lag when typing or generating responses, and autosave issues in long, text-heavy conversations, similar to what many users report with ChatGPT and other LLMs as well. It’s definitely a challenge when trying to have in-depth, extended interactions.
I’ve been thinking about potential ways to optimize performance in these long chat scenarios, and I wanted to share an idea that might be worth considering for the AI Studio development team:
Suggestion: Implement “Renderer Message Clearing” for Long Chats
The idea is to add an option (or perhaps an automatic feature) to clear or hide older messages from the display window after a certain point in the conversation (e.g., after 50-100 exchanges, or based on token count). This would significantly reduce the rendering load on the user’s device, as the browser wouldn’t have to constantly redraw and manage a massive chat history in the display.
Crucially, the full chat history would still be retained in the backend, so the AI keeps access to the entire conversation for context; the display would only show a manageable, recent portion of the chat.
This could strike a good balance: better performance and responsiveness for users, especially on lower-powered devices, thanks to the reduced rendering load, while the AI still maintains full contextual understanding of the complete chat history.
(Note: I had already discussed this with an LLM beforehand, so I did not want to rewrite it; the idea, suggestion, and opinion are my own.)
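To make the proposed split concrete, here is a minimal illustrative sketch in Python. It is purely conceptual: `MAX_VISIBLE`, the function names, and the message representation are assumptions for illustration, not AI Studio internals.

```python
# Conceptual sketch of "renderer message clearing": the UI renders only a
# recent window of the chat, while the model keeps the full history as context.

MAX_VISIBLE = 50  # illustrative display threshold, not an AI Studio setting

def messages_to_render(history: list[dict]) -> list[dict]:
    """Only the most recent messages are drawn in the display window."""
    return history[-MAX_VISIBLE:]

def messages_for_model(history: list[dict]) -> list[dict]:
    """The model still receives the entire conversation for context."""
    return history
```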
I wrote a Python program that clears a percentage of the display window's conversation. You pass it the file path on Google Drive and, by default, it will clear 25% of the file while maintaining the JSON/Gemini format.
Before that, my models became sluggish once that file grew past about 1 MB.
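For anyone curious, a minimal sketch of what such a trimming script might look like follows. This is not the original poster's code: the JSON key (`"messages"`), the file path, and the 25% default are placeholder assumptions, since the exact schema of AI Studio's Drive-saved files isn't shown here.

```python
import json
from pathlib import Path

def trim_conversation(path: str, fraction: float = 0.25, key: str = "messages") -> None:
    """Drop the oldest `fraction` of turns from a saved conversation file.

    `key` names the JSON field holding the list of turns; "messages" is a
    placeholder -- adjust it to match the actual saved-file schema.
    """
    p = Path(path)
    data = json.loads(p.read_text(encoding="utf-8"))

    turns = data[key]
    cut = int(len(turns) * fraction)  # number of oldest turns to remove
    data[key] = turns[cut:]

    # Write the file back as valid JSON so the app can still load it.
    p.write_text(json.dumps(data, ensure_ascii=False, indent=2), encoding="utf-8")

if __name__ == "__main__":
    # Example: clear the oldest 25% of turns from a Drive-synced chat file.
    trim_conversation("/path/to/drive/my_chat.json", fraction=0.25)
```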
But is the context/information still there for the model to access?
Or is that information permanently gone?
That is, from the AI's point of view, can the model still recall it?
Anyway, thank you for your idea; it seems very interesting. Would you publish the script on GitHub?
Hi there,
I have not seen any degradation of AI memory following resizing of conversations.
Sorry, I do not have a GitHub account.
I can send you the source code if you want…
rd