Gemini Flash Live API: How to ensure the model always uses the latest user-provided context after a sequence of context + audio turns?

Tushar_Raina · May 19, 2025, 2:53am

I’m building an application using the Google Gemini Flash Live API (genai) where the user can send updated context (for example, a new code snippet or document text) followed by live audio input (e.g., asking a question about the latest context they just provided).

My goal is for Gemini to always have the most recent context as the basis for its response—so that, for example, if the user asks “Can you see my latest version?” right after sending an update, the model’s answer accurately reflects the latest content.

Problem:
Even though I send the updated context using sendClientContent (as a user turn), if the user then speaks (audio streamed live to Gemini), the model sometimes replies with hallucinated, old, or unrelated content, as if it did not receive the latest context.

What I’m doing:

On context update (e.g., new code or document text):

await sessionInstance.sendClientContent({
    turns: [{
        role: 'user',
        parts: [{ text: `Here is my latest update:\n\`\`\`\n${latestContent}\n\`\`\`\n` }]
    }],
    turnComplete: false
});

On audio (raw audio buffer from the client):

await sessionInstance.sendRealtimeInput({
    media: {
        data: audioData.toString('base64'),
        mimeType: 'audio/pcm;rate=16000'
    }
});

(I’ve also tried sending the context again as a sendClientContent turn immediately before each audio input, but that doesn’t seem to work reliably.)

Relevant Backend Handler:

ws.on('message', async (data: WebSocket.RawData) => {
    try {
        const messageStr = data.toString();

        if (messageStr.trim().startsWith('{')) {
            const message = JSON.parse(messageStr);
            if (message.type === 'context_update') {
                session.context = message.content;
                await sessionInstance.sendClientContent({
                    turns: [{
                        role: 'user',
                        parts: [{ text: `Here is my latest update:\n\`\`\`\n${message.content}\n\`\`\`\n` }]
                    }],
                    turnComplete: false
                });
            }
        } else {
            // AUDIO: any non-JSON message is forwarded as raw PCM audio
            // (I've also tried re-sending the context here, see "What I've Tried" below)
            await sessionInstance.sendRealtimeInput({
                media: {
                    data: data.toString('base64'),
                    mimeType: 'audio/pcm;rate=16000'
                }
            });
        }
    } catch (e) {
        // error handling...
    }
});
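
One fragile spot in the handler above is the `startsWith('{')` check: a binary audio chunk whose first byte happens to be `0x7B` would be parsed as JSON, throw, and be dropped silently by the `catch`. A minimal sketch of routing on the frame type instead, assuming the `ws` package passes an `isBinary` flag as the second argument to `message` listeners (it does in v8 and later, as far as I know). `Frame`, `Routed`, and `routeFrame` are hypothetical names introduced only for this illustration:

```typescript
// Result of classifying one WebSocket frame. Only `context_update` comes from
// the post; the other names are illustrative.
type Routed =
  | { kind: 'context_update'; content: string }
  | { kind: 'audio'; bytes: Uint8Array }
  | { kind: 'ignored'; reason: string };

// The caller decodes text frames to a string and passes binary frames as bytes.
type Frame =
  | { isBinary: true; bytes: Uint8Array }
  | { isBinary: false; text: string };

function routeFrame(frame: Frame): Routed {
  // Binary frames are always audio, whatever their first byte is.
  if (frame.isBinary) {
    return { kind: 'audio', bytes: frame.bytes };
  }

  // Text frames must be valid JSON control messages.
  let parsed: unknown;
  try {
    parsed = JSON.parse(frame.text);
  } catch (_err) {
    return { kind: 'ignored', reason: 'text frame is not valid JSON' };
  }

  if (typeof parsed === 'object' && parsed !== null) {
    const msg = parsed as { type?: unknown; content?: unknown };
    if (msg.type === 'context_update' && typeof msg.content === 'string') {
      return { kind: 'context_update', content: msg.content };
    }
  }
  return { kind: 'ignored', reason: 'unrecognized control message' };
}
```

In the handler this would be called with `{ isBinary: true, bytes: data }` for binary frames and `{ isBinary: false, text: data.toString('utf8') }` otherwise. It only removes one source of silently lost audio; it does not by itself explain the stale-context replies.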

What I’ve Tried:

Sending the context as a sendClientContent turn immediately before each audio input (with turnComplete: false or true).
Waiting for Gemini’s response after just a context update: Gemini does not reply until audio or a user question is sent.
Changing how much context I include (full content, diffs, etc).

No matter what I do, sometimes Gemini answers with content that is not the latest, or hallucinates.
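
A variation that sits between "send on every update" and "send before every audio chunk" is to hold the newest context locally and flush it exactly once, right before the first audio chunk that follows an update. The sketch below is untested against the real API and is not a confirmed fix. `LiveSessionLike` is a hypothetical interface covering only the two calls shown in this post, and `ContextGate`, the version counter, and the prompt wording are all assumptions made for illustration:

```typescript
// Assumed minimal surface of the session object; the real SDK type is richer.
interface LiveSessionLike {
  sendClientContent(params: {
    turns: { role: string; parts: { text: string }[] }[];
    turnComplete: boolean;
  }): void | Promise<void>;
  sendRealtimeInput(params: {
    media: { data: string; mimeType: string };
  }): void | Promise<void>;
}

class ContextGate {
  private session: LiveSessionLike;
  private latest: string | null = null;
  private version = 0;      // bumped on every update
  private sentVersion = 0;  // last version actually delivered
  private queue: Promise<void> = Promise.resolve();

  constructor(session: LiveSessionLike) {
    this.session = session;
  }

  // Record the newest context; nothing is sent yet.
  updateContext(content: string): void {
    this.latest = content;
    this.version += 1;
  }

  // Forward one base64 PCM chunk, flushing unsent context first.
  // Calls are chained so sends never interleave.
  sendAudio(base64Pcm: string): Promise<void> {
    const task = this.queue.then(async () => {
      const version = this.version;
      const content = this.latest;
      if (content !== null && version !== this.sentVersion) {
        await this.session.sendClientContent({
          turns: [{
            role: 'user',
            parts: [{
              text: `Context version ${version} (supersedes all earlier versions):\n${content}`,
            }],
          }],
          turnComplete: false,
        });
        // Marked as sent only after the call succeeds.
        this.sentVersion = version;
      }
      await this.session.sendRealtimeInput({
        media: { data: base64Pcm, mimeType: 'audio/pcm;rate=16000' },
      });
    });
    // Keep the chain alive even if one send fails.
    this.queue = task.catch(() => {});
    return task;
  }
}
```

With this, several rapid updates between two utterances collapse into a single context turn, and the version label gives the model an explicit cue about which copy is current. Whether the Live API then reliably prefers that copy is exactly the open question.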

Question:
How can I reliably ensure that the Gemini Flash Live API always uses the latest user-provided context for the next audio/user question turn?
Is there an official/recommended pattern to “bind” a context update and an audio question together, or to always force the model to answer about the latest user content?

Relevant API Docs:

Any help or relevant code pattern is appreciated!
