Live API with ephemeral token ignores the system_instruction

Hi,

I finally got the Live API working with an ephemeral token over the WebSocket “wss://generativelanguage.googleapis.com/ws/google.ai.generativelanguage.v1alpha.GenerativeService.BidiGenerateContentConstrained?access_token=${apiKey}”. Unfortunately, it ignores the system_instruction I send to the server. Everything works properly if I use the plain Gemini API key with the WebSocket address “wss://generativelanguage.googleapis.com/ws/google.ai.generativelanguage.v1alpha.GenerativeService.BidiGenerateContent?key=${apiKey}”. Is this a known issue or limitation of the Live API with ephemeral tokens?
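
For reference, the two addresses differ only in the method name and in the query parameter that carries the credential. A minimal Python sketch of that difference, assuming a hypothetical helper named `live_ws_url` (it is not part of any SDK and simply mirrors the two URLs quoted above):

```python
# Common prefix of the two WebSocket URLs quoted in the post.
_BASE = (
    "wss://generativelanguage.googleapis.com/ws/"
    "google.ai.generativelanguage.v1alpha.GenerativeService."
)


def live_ws_url(credential: str, ephemeral: bool) -> str:
    """Build the Live API WebSocket URL (hypothetical helper).

    ephemeral=True  -> constrained endpoint, credential sent as access_token
    ephemeral=False -> regular endpoint, credential sent as key
    """
    if ephemeral:
        method, param = "BidiGenerateContentConstrained", "access_token"
    else:
        method, param = "BidiGenerateContent", "key"
    # The credential is interpolated as-is, like the template strings above.
    return f"{_BASE}{method}?{param}={credential}"
```

Calling `live_ws_url(token, ephemeral=True)` yields the address used with the ephemeral token, and `live_ws_url(api_key, ephemeral=False)` the one used with the API key.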

Hi @Cahya_Wirawan,

Where are you defining the system instructions? Could you please share a snippet of your code that contains the backend token generation logic?

Thank you!

Hi @Srikanta_K_N

Here is the code snippet that generates the ephemeral token; it runs on a FastAPI backend:


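import datetime

from fastapi import Depends, HTTPException
from google import genai

# `app`, `schemas`, and `auth` come from the rest of the project (not shown).


def get_ephemeral_token(expire_minutes: int = 30) -> str: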
    """
    Create an ephemeral auth token for starting a session.
    Useful for front-end clients that should not have long-lived API keys.
    """
    now = datetime.datetime.now(tz=datetime.timezone.utc)

    client = genai.Client(
        http_options={
            "api_version": "v1alpha",
        }
    )
    token = client.auth_tokens.create(
        config={
            "uses": 1,
            "expire_time": now + datetime.timedelta(minutes=expire_minutes),
            "new_session_expire_time": now + datetime.timedelta(minutes=1),
            "live_connect_constraints": {
                "model": "models/gemini-2.5-flash-native-audio-preview-12-2025",
                "config": {
                    "session_resumption": {},
                    "temperature": 0.7,
                    "response_modalities": ["AUDIO"],
                },
            },
            "http_options": {"api_version": "v1alpha"},
        }
    )
    return token.name


@app.get("/auth/gemini-token", response_model=dict)
def get_gemini_token(
    current_user: schemas.User = Depends(auth.get_current_user),
):
    token = get_ephemeral_token()
    # token = os.environ.get("GEMINI_API_KEY")
    if not token:
        raise HTTPException(status_code=500, detail="Gemini API Key not configured")
    return {
        "token": token,
        "user": {"uid": current_user.uid, "id": str(current_user.id)},
    }
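
For reference, the config above sets two separate deadlines. As far as the ephemeral token documentation describes them, `new_session_expire_time` bounds how long the client has to open a new session with the token, while `expire_time` bounds how long messages on that session are accepted. A minimal sketch of the arithmetic, assuming hypothetical helpers `token_deadlines` and `can_open_session` (neither exists in the SDK):

```python
import datetime


def token_deadlines(
    now: datetime.datetime,
    expire_minutes: int = 30,
    new_session_minutes: int = 1,
) -> dict:
    """Compute the two deadlines the same way get_ephemeral_token does."""
    return {
        "expire_time": now + datetime.timedelta(minutes=expire_minutes),
        "new_session_expire_time": now
        + datetime.timedelta(minutes=new_session_minutes),
    }


def can_open_session(deadlines: dict, at: datetime.datetime) -> bool:
    """True while the window for opening a new session is still open."""
    return at < deadlines["new_session_expire_time"]
```

With these values the frontend has about one minute after fetching the token to open the WebSocket, so the token should be requested right before connecting rather than cached.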

The live audio conversation works, but the model doesn’t know the data that I send as part of the system instruction. I can send you the frontend part if needed.

Thanks.

Hi @Cahya_Wirawan, apologies for the delayed response.

After looking at your code, I see that system_instruction is not part of your config.

Please try adding:

"system_instruction": {
    "parts": [{"text": system_instruction}]
},

Thank you!

Hi @Srikanta_K_N, sorry, I only just saw your answer.

Actually, I set the system_instruction in the user client, not when I create the ephemeral token. We can’t put system_instruction in the config used to generate the ephemeral token: I tried it, and the token generation simply failed (it doesn’t make sense to put the system instruction in the token generation anyway).

Here is the code in the user client. It works if I use the backend as a proxy: Gemini knows the data in the system instruction. But if I connect directly using the ephemeral token, Gemini doesn’t know the data I put in the system instruction.


  const startSession = async () => {
    setStatus("connecting");
    try {
      let wsUrl = "ws://localhost:8000/ws/gemini-live"; // This is the local proxy URL for Gemini Live
      const mode = process.env.NEXT_PUBLIC_GEMINI_CONNECT_MODE || 'direct';

      // In 'direct' mode, get the Gemini ephemeral token from backend
      if (mode === 'direct') {
        const user = auth.currentUser;
        if (!user) {
            console.error("User not logged in");
            setStatus("idle");
            return;
        }
        const token = await user.getIdToken();
        // Get ephemeral Gemini token:
        const res = await fetch("http://localhost:8000/auth/gemini-token", {
            headers: { Authorization: `Bearer ${token}` }
        });
        if (!res.ok) throw new Error("Failed to get Gemini credentials");
        const data = await res.json();
        const apiKey = data.token;
        wsUrl = `wss://generativelanguage.googleapis.com/ws/google.ai.generativelanguage.v1alpha.GenerativeService.BidiGenerateContentConstrained?access_token=${apiKey}`;
      }

      const ws = new WebSocket(wsUrl);
      websocketRef.current = ws;

      ws.onopen = () => {
        console.log(`Connected to Gemini Live (${mode})`);
        setStatus("connected");
        setIsActive(true);
        
        const systemInstruction = `You are a helpful assistant. The user is asking about the following data:\n\n${dataContext}\n\nAnswer questions based on this data. Be brief and helpful.`;

        // Send setup message
        const setupMsg = {
          setup: {
            model: "models/gemini-2.5-flash-native-audio-preview-12-2025",
            generation_config: {
              response_modalities: ["AUDIO"],
              speech_config: {
                voice_config: { prebuilt_voice_config: { voice_name: "Laomedeia" } }
              }
            },
            system_instruction: {
                parts: [{ text: systemInstruction }]
            }
          }
        };
        ws.send(JSON.stringify(setupMsg));
        
        startAudioCapture();
      };

      ws.onmessage = async (event) => {
        let data = event.data;
        if (data instanceof Blob) {
            data = await data.text();
        }
        
        try {
            const response = JSON.parse(data);
            if (response.serverContent?.modelTurn?.parts) {
                for (const part of response.serverContent.modelTurn.parts) {
                    if (part.inlineData && part.inlineData.mimeType.startsWith("audio/pcm")) {
                        const audioData = part.inlineData.data;
                        playPcmAudio(audioData);
                    }
                }
            }
        } catch (e) {
            // console.error("Error parsing message", e); 
            // Ignore non-JSON messages if any
        }
      };
      
      ws.onclose = () => {
          console.log("Disconnected");
          stopSession();
      };
      
      ws.onerror = (e) => {
          console.error("WebSocket Error", e);
          stopSession();
      };

    } catch (error) {
      console.error(error);
      stopSession();
    }
  };
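
To rule out a malformed payload, the setup message can also be rebuilt outside the browser and inspected. A minimal Python sketch that mirrors the field names used in the client code above; `build_setup_message` is a hypothetical helper and `data_context` stands in for the client's `dataContext`:

```python
import json


def build_setup_message(data_context: str) -> str:
    """Return the JSON setup message, mirroring the client code above."""
    system_instruction = (
        "You are a helpful assistant. The user is asking about the "
        f"following data:\n\n{data_context}\n\n"
        "Answer questions based on this data. Be brief and helpful."
    )
    setup_msg = {
        "setup": {
            "model": "models/gemini-2.5-flash-native-audio-preview-12-2025",
            "generation_config": {
                "response_modalities": ["AUDIO"],
                "speech_config": {
                    "voice_config": {
                        "prebuilt_voice_config": {"voice_name": "Laomedeia"}
                    }
                },
            },
            "system_instruction": {"parts": [{"text": system_instruction}]},
        }
    }
    return json.dumps(setup_msg)
```

Sending the same string to both endpoints makes it easier to confirm that the payload is identical and that only the endpoint and credential type differ between the working and the failing case.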