I’m using the latest Gemini 2.5 Flash Native Audio model for an agent that is embedded within an application. The agent needs to receive state updates / messages about the application state so it can respond appropriately based on what the user is doing within the application. Since the API / WebSocket connection only accepts system instructions during the initial session creation, there seems to be no way to send incremental developer/system updates that tell the AI what the user is doing.
I know I can use the “user” message type and hide those messages from my UI / user-facing conversation history, but shouldn’t there be a native developer- or system-role message type that can be used to send updates? This seems like a fairly common use case, especially for agents that are deeply embedded within traditional apps or that need to be kept on the right trajectory by another observer agent.
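For context, here is roughly what the hidden-“user”-message workaround looks like on my side. This is a sketch, not the SDK’s API: the `format_state_update` / `is_hidden_turn` helpers and the `[APP_STATE]` marker are my own invention, and the actual send would go through the Live API session (with the `google-genai` SDK, something like `session.send_client_content(turns=...)`, if I understand it correctly).

```python
import json

# Hypothetical helpers for the workaround: app-state updates are serialized
# as synthetic "user" turns, because the Live API only accepts a system
# instruction at session creation. The [APP_STATE] prefix is an arbitrary
# marker my UI uses to filter these turns out of the visible chat history.

def format_state_update(state: dict) -> dict:
    """Wrap an application-state snapshot as a user-role turn."""
    return {
        "role": "user",
        "parts": [{"text": "[APP_STATE] " + json.dumps(state, sort_keys=True)}],
    }

def is_hidden_turn(turn: dict) -> bool:
    """True for synthetic state-update turns the UI should not render."""
    parts = turn.get("parts", [])
    return bool(parts) and parts[0].get("text", "").startswith("[APP_STATE]")
```

It works, but it pollutes the conversation from the model’s point of view: the updates count as user speech, which is exactly why a dedicated developer/system role would be cleaner.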
The same also applies to functions/tools: I can’t inject new tools/functions as the user navigates within my app. I have an OS-type agent that stays with the user as they move through the application. Based on the page the user is on, my framework “injects” tools that the AI can use just on those pages. This works perfectly fine with text LLMs but isn’t possible with the Native Audio models, since I can’t update the tool/function declarations the model sees on the fly.
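To make the pattern concrete, here is a minimal sketch of the per-page tool injection my framework does for text LLMs, adapted to the Live API’s constraint that tools are fixed at session creation. All the names (`BASE_TOOLS`, `PAGE_TOOLS`, `build_session_config`, the page names) are illustrative placeholders; the only workaround I can see today is opening a fresh session with a new config on every navigation, which is exactly what I’d like to avoid.

```python
# Tools available on every page (placeholder declarations).
BASE_TOOLS = [
    {"name": "get_user_profile", "description": "Fetch the signed-in user."},
]

# Page-scoped tools my framework would normally inject mid-conversation.
PAGE_TOOLS = {
    "invoices": [
        {"name": "create_invoice", "description": "Create a draft invoice."},
    ],
    "reports": [
        {"name": "run_report", "description": "Run a saved report."},
    ],
}

def build_session_config(page: str) -> dict:
    """Session config with base tools plus the current page's tools.

    With Native Audio, changing the tool set means tearing down the
    session and reconnecting with this config, since tools can only be
    declared at session creation.
    """
    declarations = BASE_TOOLS + PAGE_TOOLS.get(page, [])
    return {
        "tools": [{"function_declarations": declarations}],
        "system_instruction": f"The user is on the '{page}' page.",
    }
```

Reconnecting per navigation drops the audio stream and any in-flight turn, so a native way to update the tool declarations mid-session would solve this cleanly.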