Live API with ephemeral token ignores the system_instruction

Cahya_Wirawan · December 26, 2025, 11:12am

Hi,

I eventually got the Live API working with an ephemeral token over the websocket "wss://generativelanguage.googleapis.com/ws/google.ai.generativelanguage.v1alpha.GenerativeService.BidiGenerateContentConstrained?access_token=${apiKey}". Unfortunately, it ignores the system_instruction I send to the server. Everything works properly if I use a plain Gemini API key with the websocket address "wss://generativelanguage.googleapis.com/ws/google.ai.generativelanguage.v1alpha.GenerativeService.BidiGenerateContent?key=${apiKey}". Is this a known issue or limitation of the Live API with ephemeral tokens?
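For reference, the two connection modes described above differ only in the RPC method name and in the query parameter that carries the credential. A minimal Python sketch that builds both URLs follows; the helper name `live_ws_url` and its parameters are hypothetical and introduced only for illustration, while the host, path, and query parameter names are copied from the URLs in this post.

```python
# Host and service path copied from the two URLs quoted in the post.
BASE = (
    "wss://generativelanguage.googleapis.com/ws/"
    "google.ai.generativelanguage.v1alpha.GenerativeService."
)


def live_ws_url(credential: str, ephemeral: bool) -> str:
    """Build the Live API WebSocket URL for either auth mode (hypothetical helper)."""
    if ephemeral:
        # Ephemeral token: "Constrained" method, token passed as access_token.
        return f"{BASE}BidiGenerateContentConstrained?access_token={credential}"
    # Long-lived API key: plain method, key passed as key.
    return f"{BASE}BidiGenerateContent?key={credential}"
```

Calling `live_ws_url(token, ephemeral=True)` reproduces the first address, and `live_ws_url(api_key, ephemeral=False)` reproduces the second.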

Srikanta_K_N · December 30, 2025, 4:36am

Hi @Cahya_Wirawan,

Where are you defining the system instruction? Could you please share the snippet of your code that contains the backend token generation logic?

Thank you!

Cahya_Wirawan · January 3, 2026, 11:33am

Hi @Srikanta_K_N

Here is the code snippet that generates the ephemeral token. It runs on a FastAPI backend:


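import datetime

from fastapi import Depends, HTTPException
from google import genai

# app, auth and schemas are project-local objects defined elsewhere.


def get_ephemeral_token(expire_minutes: int = 30) -> str: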
    """
    Create an ephemeral auth token for starting a session.
    Useful for front-end clients that should not have long-lived API keys.
    """
    now = datetime.datetime.now(tz=datetime.timezone.utc)

    client = genai.Client(
        http_options={
            "api_version": "v1alpha",
        }
    )
    token = client.auth_tokens.create(
        config={
            "uses": 1,
            "expire_time": now + datetime.timedelta(minutes=expire_minutes),
            "new_session_expire_time": now + datetime.timedelta(minutes=1),
            "live_connect_constraints": {
                "model": "models/gemini-2.5-flash-native-audio-preview-12-2025",
                "config": {
                    "session_resumption": {},
                    "temperature": 0.7,
                    "response_modalities": ["AUDIO"],
                },
            },
            "http_options": {"api_version": "v1alpha"},
        }
    )
    return token.name


@app.get("/auth/gemini-token", response_model=dict)
def get_gemini_token(
    current_user: schemas.User = Depends(auth.get_current_user),
):
    token = get_ephemeral_token()
    # token = os.environ.get("GEMINI_API_KEY")
    if not token:
        raise HTTPException(status_code=500, detail="Gemini API Key not configured")
    return {
        "token": token,
        "user": {"uid": current_user.uid, "id": str(current_user.id)},
    }

The live audio conversation works, but the model doesn’t know the data that I send as part of the system instruction. I can send you the frontend part if it is needed.

Thanks.

Srikanta_K_N · January 8, 2026, 10:40am

Hi @Cahya_Wirawan, apologies for the delayed response.

Looking at your code, I see that system_instruction is not part of your config.

Please try adding:

"system_instruction": {
    "parts": [{"text": system_instruction}]
},

Thank you!
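To make the suggestion concrete, here is a sketch of where that fragment would sit inside the `live_connect_constraints` config from the earlier snippet. `build_token_config` is a hypothetical helper that only builds the dictionary and does not call the API. Whether token creation accepts `system_instruction` in this position is an assumption, not something this thread confirms.

```python
import datetime


def build_token_config(system_instruction: str, expire_minutes: int = 30) -> dict:
    """Return the token config from the earlier snippet plus a system instruction.

    Hypothetical helper: it only assembles the dictionary, nothing is sent.
    """
    now = datetime.datetime.now(tz=datetime.timezone.utc)
    return {
        "uses": 1,
        "expire_time": now + datetime.timedelta(minutes=expire_minutes),
        "new_session_expire_time": now + datetime.timedelta(minutes=1),
        "live_connect_constraints": {
            "model": "models/gemini-2.5-flash-native-audio-preview-12-2025",
            "config": {
                "session_resumption": {},
                "temperature": 0.7,
                "response_modalities": ["AUDIO"],
                # Suggested addition (assumed placement, unverified):
                "system_instruction": {
                    "parts": [{"text": system_instruction}]
                },
            },
        },
        "http_options": {"api_version": "v1alpha"},
    }
```

The returned dictionary would be passed as `config=` to `client.auth_tokens.create(...)`, in place of the inline dictionary in the original function.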

Cahya_Wirawan · January 16, 2026, 8:05am

Hi @Srikanta_K_N, sorry, I only just saw your answer.

Actually, I add the system_instruction in the user client, not when I create the ephemeral token. We can’t put system_instruction in the config used to generate the ephemeral token: I tried it, and token generation simply failed. (It doesn’t make sense to put the system instruction in token generation anyway.)

Here is the code in the user client. This code works if I use the backend as a proxy: Gemini knows the data in the system instruction. But if I connect directly using the ephemeral token, Gemini doesn’t know the data I put in the system instruction.


  const startSession = async () => {
    setStatus("connecting");
    try {
      let wsUrl = "ws://localhost:8000/ws/gemini-live"; // This is the local proxy URL for Gemini Live
      const mode = process.env.NEXT_PUBLIC_GEMINI_CONNECT_MODE || 'direct';

      // In 'direct' mode, get the Gemini ephemeral token from backend
      if (mode === 'direct') {
        const user = auth.currentUser;
        if (!user) {
            console.error("User not logged in");
            setStatus("idle");
            return;
        }
        const token = await user.getIdToken();
        // Get ephemeral Gemini token:
        const res = await fetch("http://localhost:8000/auth/gemini-token", {
            headers: { Authorization: `Bearer ${token}` }
        });
        if (!res.ok) throw new Error("Failed to get Gemini credentials");
        const data = await res.json();
        const apiKey = data.token;
        wsUrl = `wss://generativelanguage.googleapis.com/ws/google.ai.generativelanguage.v1alpha.GenerativeService.BidiGenerateContentConstrained?access_token=${apiKey}`;
      }

      const ws = new WebSocket(wsUrl);
      websocketRef.current = ws;

      ws.onopen = () => {
        console.log(`Connected to Gemini Live (${mode})`);
        setStatus("connected");
        setIsActive(true);
        
        const systemInstruction = `You are a helpful assistant. The user is asking about the following data:\n\n${dataContext}\n\nAnswer questions based on this data. Be brief and helpful.`;

        // Send setup message
        const setupMsg = {
          setup: {
            model: "models/gemini-2.5-flash-native-audio-preview-12-2025",
            generation_config: {
              response_modalities: ["AUDIO"],
              speech_config: {
                voice_config: { prebuilt_voice_config: { voice_name: "Laomedeia" } }
              }
            },
            system_instruction: {
                parts: [{ text: systemInstruction }]
            }
          }
        };
        ws.send(JSON.stringify(setupMsg));
        
        startAudioCapture();
      };

      ws.onmessage = async (event) => {
        let data = event.data;
        if (data instanceof Blob) {
            data = await data.text();
        }
        
        try {
            const response = JSON.parse(data);
            if (response.serverContent?.modelTurn?.parts) {
                for (const part of response.serverContent.modelTurn.parts) {
                    if (part.inlineData && part.inlineData.mimeType.startsWith("audio/pcm")) {
                        const audioData = part.inlineData.data;
                        playPcmAudio(audioData);
                    }
                }
            }
        } catch (e) {
            // console.error("Error parsing message", e); 
            // Ignore non-JSON messages if any
        }
      };
      
      ws.onclose = () => {
          console.log("Disconnected");
          stopSession();
      };
      
      ws.onerror = (e) => {
          console.error("WebSocket Error", e);
          stopSession();
      };

    } catch (error) {
      console.error(error);
      stopSession();
    }
  };
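One way to narrow this down is to log every server message rather than only the audio parts, since the handler above silently drops parse errors and any non-audio message. A minimal Python sketch of such a classifier follows; the function name `describe_server_message` is hypothetical, and it assumes only that server messages are JSON objects, as the handler above does.

```python
import json


def describe_server_message(raw) -> str:
    """Summarize one WebSocket frame for logging (hypothetical debugging helper)."""
    # Frames may arrive as bytes; decode them before parsing.
    if isinstance(raw, (bytes, bytearray)):
        raw = raw.decode("utf-8", errors="replace")
    try:
        msg = json.loads(raw)
    except json.JSONDecodeError:
        # Report the start of the frame instead of discarding it.
        return f"non-JSON frame: {raw[:80]!r}"
    if not isinstance(msg, dict):
        return f"unexpected JSON type: {type(msg).__name__}"
    # List the top-level keys so non-audio messages become visible.
    return "keys: " + ", ".join(sorted(msg))
```

Comparing the logged output between the proxy mode and the direct mode would show whether the server sends anything different after the setup message.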
