Is there a way in the Gemini Live API to detect “near end of generation” (not just final completion)?

Hey all, I’m working with the Gemini Live API (via WebSockets) to stream LLM output in real time. I understand that the API exposes signals like generationComplete and turnComplete, which tell me when the model has finished its current output, and I can react to those cleanly in my client or backend.

What I need is something a bit different:
Instead of waiting until the model is done, I want to detect when it’s getting close to done, so I can call a function, update my UI, prep the next turn, or transition my state before the final completion event arrives.

Right now my model pipeline looks like this:

  1. Client opens a gemini.live.connect session.

  2. I stream text/audio and receive chunks back from the model.

  3. I watch for server_content.generation_complete or server_content.turn_complete to know the reply is finished.
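For reference, step 3 boils down to checking two boolean flags on each incoming server message. Here’s a minimal sketch of that check, assuming the raw camelCase JSON field names (serverContent, generationComplete, turnComplete) that come over the WebSocket; if your SDK deserializes into snake_case objects, the attribute access will look different:

```python
def completion_flags(message: dict) -> tuple[bool, bool]:
    """Return (generation_complete, turn_complete) for one decoded server message.

    Assumes the raw camelCase JSON field names sent over the Live API WebSocket;
    SDKs that deserialize into snake_case objects will differ.
    """
    sc = message.get("serverContent") or {}
    return bool(sc.get("generationComplete")), bool(sc.get("turnComplete"))


# Example: a message carrying only the generation-complete flag
gen_done, turn_done = completion_flags({"serverContent": {"generationComplete": True}})
```

Nothing fancy, but it’s the only completion signaling the protocol seems to offer.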

But there doesn’t seem to be any built-in “N tokens left” or “almost done” event in the Gemini Live spec that is emitted before generationComplete. The API docs define only the normal completion flags; there’s no progress percentage or remaining-token count.

Before I build a heuristic (like counting streamed tokens/chars and calling my callback when some threshold is met), I wanted to check:

  • Has anyone seen undocumented or hidden events that indicate the model is approaching the end of generation?

  • Are there better client-side heuristics people use in Gemini Live if they need early notice of an ending?

  • Or does the community just treat generationComplete as the de facto (and only reliable) end signal?

For context: I’m aware this isn’t about end-of-turn detection or voice activity detection; I’m talking strictly about anticipating the end of the model’s text/audio generation while it’s still streaming.
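In case it helps the discussion, this is the shape of the fallback heuristic I’d build: if I cap the reply with max_output_tokens, I can estimate progress against that budget from the streamed characters and fire a callback at, say, 85%. Every name below is mine, not part of the Gemini Live API, and the chars-per-token ratio is a rough guess:

```python
class NearEndDetector:
    """Heuristic 'almost done' signal for streamed text (not a Live API feature).

    Only works when generation is capped via max_output_tokens, so streamed
    output can be compared against a known budget. Token count is approximated
    as characters / 4, a rough average for English text.
    """

    CHARS_PER_TOKEN = 4  # crude approximation; tune for your content

    def __init__(self, max_output_tokens: int, threshold: float = 0.85, on_near_end=None):
        self.budget = max_output_tokens
        self.threshold = threshold
        self.on_near_end = on_near_end  # called once, with (est_tokens, budget)
        self.chars = 0
        self.fired = False

    def feed(self, text_chunk: str) -> bool:
        """Feed one streamed chunk; return True once output looks near the end."""
        self.chars += len(text_chunk)
        est_tokens = self.chars / self.CHARS_PER_TOKEN
        if not self.fired and est_tokens >= self.threshold * self.budget:
            self.fired = True
            if self.on_near_end:
                self.on_near_end(est_tokens, self.budget)
        return self.fired
```

Obviously this breaks down whenever the model stops well short of the cap, which is exactly why I’d prefer a real signal if one exists.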

Thanks!