I keep seeing developers going through LiveKit to deliver Gemini 3.1 Flash Live, with the justification that it is necessary to avoid network issues, but the Gemini app delivers it with its own implementation that is not LiveKit. I can’t fathom why this robustness would not be included in the final API, so I’m wondering whether it makes sense to set up my project to use LiveKit from the get-go, or to wait and see.
Hi Bill,
This is a common question for developers. The Gemini 3.1 Flash Live API is powerful, but it serves a different purpose. The Gemini app’s “own implementation” is a custom, first-party solution that Google built to handle the complexities of real-world networking; LiveKit is designed to solve those same complexities for third-party developers. Many developers are still using LiveKit even as the final API matures. Here’s why:

WebRTC vs. WebSockets: The native Gemini Live API mainly uses WebSockets for bidirectional streaming. WebSockets are not as resilient as WebRTC for low-latency, real-time media across unstable mobile networks. LiveKit provides that WebRTC transport layer.

Infrastructure & Scaling: For a production app, global edge routing and scaling are important. Partners like LiveKit provide the infrastructure to ensure your “agent” can handle many concurrent, low-latency sessions worldwide without you managing the server-side media stack.

Turn Detection & Audio Processing: While Gemini has built-in Voice Activity Detection (VAD), some developers prefer LiveKit’s VAD and STT (Speech-to-Text) models for more control over when the AI “listens” versus “speaks” in noisy environments.

Recommendation: If you are building a simple proof-of-concept or a local tool, the native API with a direct WebSocket connection is likely sufficient. If your project is intended for production use with many users on varied network conditions, using LiveKit from the start will save you from building your own WebRTC and session management infrastructure.
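To make the WebSocket resilience point above concrete: over a raw WebSocket transport, reconnection after a network drop is something you have to build yourself. Here is a minimal sketch of exponential-backoff retry logic; `open_session` is a hypothetical stand-in for whatever function opens your Live API connection, and the timing constants are illustrative, not recommendations:

```python
import random
import time

def backoff_delays(base=0.5, cap=30.0, attempts=6):
    """Exponential backoff schedule with full jitter, capped at `cap` seconds."""
    delays = []
    for attempt in range(attempts):
        ceiling = min(cap, base * (2 ** attempt))
        delays.append(random.uniform(0, ceiling))
    return delays

def connect_with_retry(open_session, attempts=6):
    """Try to open a session, sleeping per the backoff schedule between failures.

    `open_session` is any zero-argument callable that returns a session object
    or raises ConnectionError (e.g. a wrapper around your WebSocket connect).
    """
    last_error = None
    for delay in backoff_delays(attempts=attempts):
        try:
            return open_session()
        except ConnectionError as exc:
            last_error = exc
            time.sleep(delay)
    raise last_error
```

A WebRTC stack (as provided by LiveKit) handles this class of problem, plus packet loss concealment and congestion control, below the application layer, which is why this code disappears from your project when you adopt it.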
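And to illustrate what the turn-detection point involves, here is a toy energy-threshold VAD. This is nothing like LiveKit’s or Gemini’s actual models; the frame representation, threshold, and silence window are made-up values purely to show the shape of the problem (deciding when the user has stopped speaking so the AI can respond):

```python
def frame_energy(samples):
    """Mean squared amplitude of one audio frame (samples in [-1.0, 1.0])."""
    return sum(s * s for s in samples) / len(samples)

def is_speech(samples, threshold=0.01):
    """Crude VAD: a frame counts as speech if its energy exceeds the threshold."""
    return frame_energy(samples) > threshold

def detect_turn_end(frames, silence_frames=5, threshold=0.01):
    """Return the index of the first frame of a silent run long enough to
    count as end-of-turn, or None if the speaker never stops within `frames`."""
    silent_run = 0
    for i, frame in enumerate(frames):
        if is_speech(frame, threshold):
            silent_run = 0
        else:
            silent_run += 1
            if silent_run >= silence_frames:
                return i - silence_frames + 1
    return None
```

Production VAD models are trained classifiers that cope with background noise and cross-talk; the reason some developers reach for LiveKit’s pipeline here is precisely that a simple energy gate like this fails in noisy environments.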