Hi,
According to the Live API guide (https://ai.google.dev/gemini-api/docs/live-guide#context-window), sessions are limited to approximately 15 minutes without context window compression:
“Without compression, the practical session limit is approximately 15 minutes.”
However, I’m observing sessions that significantly exceed this limit in production. Our Daily.co (WebRTC) session logs show sessions lasting 27 minutes and even 51 minutes without compression enabled.
My setup:
- Model: gemini-2.5-flash-native-audio-preview-12-2025
- Using Pipecat framework with GeminiLiveLLMService
- No context_window_compression parameter set
Questions:
- Is the 15-minute limit an approximation that varies based on conversation density (tokens per minute)?
- Has this limit changed with the newer native audio models?
- What happens when the 128k context window fills up: does the session end gracefully, or does it error out?
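To make the first question concrete, here is a rough sketch of what I mean by conversation density. It assumes the session ends once accumulated tokens reach the 128k window, which is exactly the behavior I'm asking about. The per-minute token rates and the helper name are hypothetical placeholders, not measured or documented values.

```python
# Rough estimate of how long a session could run before the context window
# fills, assuming tokens accumulate at a constant rate (an assumption).

CONTEXT_WINDOW_TOKENS = 128_000  # the 128k window mentioned in my question


def minutes_until_full(context_window_tokens: int, tokens_per_minute: float) -> float:
    """Return the minutes needed to accumulate context_window_tokens."""
    if tokens_per_minute <= 0:
        raise ValueError("tokens_per_minute must be positive")
    return context_window_tokens / tokens_per_minute


# Hypothetical densities, only to show how the estimate scales.
hypothetical_rates = {
    "dense conversation": 8_000,
    "moderate conversation": 4_000,
    "sparse conversation (long silences)": 2_500,
}

for label, rate in hypothetical_rates.items():
    minutes = minutes_until_full(CONTEXT_WINDOW_TOKENS, rate)
    print(f"{label}: ~{minutes:.1f} min at {rate} tokens/min")
```

If the limit really is token-driven, a sparse session could plausibly run several times longer than a dense one, which would be consistent with the 27- and 51-minute sessions in our logs.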
Any clarification on the actual behavior would be helpful for capacity planning.
Thanks!