[Feature Request] Web UI Performance: Incremental Context Summarization to solve Chat Lag

Core Problem:
The Web UI (Gemini interface) becomes noticeably laggy and resource-intensive once the conversation history grows large (high token count). Because the entire message thread stays rendered in the DOM, CPU and RAM usage climb as the session gets longer.

Proposed Solution:
Implement Incremental Context Summarization.

  1. When a certain token threshold is reached, the model should automatically generate a concise “State Summary” of the previous context (key technical facts, entities, and progress).

  2. The UI should then “purge” the old raw message logs from the browser’s active memory (DOM), replacing them with this single “Summary Block.”

  3. This summary is then prepended to subsequent prompts as a system instruction to maintain continuity.
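The three steps above can be sketched roughly as follows. This is only an illustration of the idea, not how the Gemini UI works internally: the `Conversation` class, `estimate_tokens`, `stub_summarize`, and the threshold value are all hypothetical names chosen for this example, the token count is a crude word count, and the summarizer is a stand-in for what would be a real model call.

```python
from dataclasses import dataclass, field
from typing import Callable, List


def estimate_tokens(text: str) -> int:
    # Assumption: roughly one token per whitespace-separated word.
    # A real client would ask the model's tokenizer instead.
    return len(text.split())


def stub_summarize(previous_summary: str, messages: List[str]) -> str:
    # Hypothetical placeholder for the model-generated "State Summary".
    # It keeps the first word of each message so the result is predictable.
    first_words = [m.split()[0] for m in messages if m.split()]
    merged = ([previous_summary] if previous_summary else []) + first_words
    return "; ".join(merged)


@dataclass
class Conversation:
    summarize: Callable[[str, List[str]], str]
    threshold: int                       # token count that triggers compaction
    summary: str = ""                    # the single "Summary Block"
    messages: List[str] = field(default_factory=list)  # raw messages still held

    def add_message(self, text: str) -> None:
        self.messages.append(text)
        total = sum(estimate_tokens(m) for m in self.messages)
        if total >= self.threshold:
            self._compact()

    def _compact(self) -> None:
        # Step 1: fold the old summary and the raw messages into a new summary.
        self.summary = self.summarize(self.summary, self.messages)
        # Step 2: purge the raw messages (in a browser, their DOM nodes too).
        self.messages = []

    def build_prompt(self, user_input: str) -> str:
        # Step 3: prepend the summary so the model keeps continuity.
        parts = []
        if self.summary:
            parts.append(f"[State Summary]\n{self.summary}")
        parts.extend(self.messages)
        parts.append(user_input)
        return "\n".join(parts)


conv = Conversation(summarize=stub_summarize, threshold=6)
conv.add_message("alpha one two")
conv.add_message("beta three four")   # reaches the threshold, triggers compaction
conv.add_message("gamma")
print(conv.build_prompt("next question"))
```

Because each compaction folds the previous summary into the new one, the amount of raw history held at any time stays bounded by the threshold, which is the "incremental" part of the proposal. A fuller version would probably also count the summary's own tokens toward the limit.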

Benefits:

  • Drastically improves browser performance for long technical sessions.

  • Prevents UI crashes and input lag.

  • Efficiently manages the Context Window without losing critical information.


P.S. This proposal was formulated and translated during a long technical session with Gemini. It’s a real-world example of why such optimization is needed!

Hi @Adil_Aliyev,
While we appreciate your detailed feedback, this forum is specifically dedicated to Google AI Studio and Gemini API inquiries. I suggest filing a report directly through the Help section of the Gemini App. Just navigate to Settings & help (the gear icon in the bottom left) > Send Feedback, and follow the prompts to submit your request with the relevant details. This will help ensure you receive the appropriate support.