Gemini Live Flash 3.1: Best practices for handling poor network quality?

I’m using Gemini Live (Flash 3.1) over WebSockets for an Android voice app. I’m facing a specific “silent failure” issue:

When the network is slow or has high jitter, the WebSocket stays connected and no errors are thrown, but no speech is recognized at all. On a stable network, the same setup works perfectly.

Current Setup:

  • 16 kHz, mono, 16-bit PCM.

  • Audio frames are sent immediately after capture.

  • Basic reconnect logic (only triggers on socket close).
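
To make the gap in my reconnect logic concrete: the format above is 16,000 samples/s × 1 channel × 2 bytes = 32,000 bytes/s of raw PCM, so any unsent backlog can be expressed as milliseconds of audio. Below is a minimal sketch of a check that could flag a "slow-but-connected" socket before it ever closes. `BacklogMonitor` and the 500 ms threshold are hypothetical names and values of my own, not part of any Gemini or WebSocket API, and I have not validated the threshold.

```java
/** Hypothetical helper: flags a "slow-but-connected" socket from the unsent audio backlog. */
final class BacklogMonitor {
    // 16 kHz * mono * 16-bit PCM = 32,000 bytes per second of raw audio.
    static final int BYTES_PER_SECOND = 16_000 * 1 * 2;

    private final long maxBacklogMillis; // assumed threshold, tune per app
    private long queuedBytes;

    BacklogMonitor(long maxBacklogMillis) {
        this.maxBacklogMillis = maxBacklogMillis;
    }

    /** Call when a captured frame is handed to the socket for sending. */
    void onEnqueued(int bytes) {
        queuedBytes += bytes;
    }

    /** Call when the socket reports that bytes actually left the send queue. */
    void onSent(int bytes) {
        queuedBytes = Math.max(0, queuedBytes - bytes);
    }

    /** Unsent backlog expressed as milliseconds of audio. */
    long backlogMillis() {
        return queuedBytes * 1000 / BYTES_PER_SECOND;
    }

    /** True when the backlog exceeds the threshold, even though the socket is still open. */
    boolean isStalled() {
        return backlogMillis() > maxBacklogMillis;
    }
}
```

If the client happens to be OkHttp, `WebSocket.queueSize()` already reports the bytes queued for transmission and could feed `backlogMillis()` directly instead of the manual counters; with other clients the accounting would have to be done by hand as above.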

Questions:

  1. Are there recommended buffering or smoothing strategies (e.g., specific chunk sizes) to handle unstable throughput?

  2. Should I consider moving to a different transport to mitigate this?
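
For question 1, this is roughly the kind of smoothing I have in mind: re-slicing whatever the recorder delivers into fixed-duration chunks before sending, instead of forwarding each capture buffer as-is. It is only a sketch under my own assumptions. `PcmChunker` is a hypothetical helper, and the chunk duration is a placeholder I picked, not a documented recommendation.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: re-slices captured PCM into fixed-duration chunks. */
final class PcmChunker {
    static final int SAMPLE_RATE_HZ = 16_000;
    static final int CHANNELS = 1;
    static final int BYTES_PER_SAMPLE = 2; // 16-bit PCM

    private final int chunkBytes;
    private final ByteArrayOutputStream pending = new ByteArrayOutputStream();

    PcmChunker(int chunkMillis) {
        this.chunkBytes = bytesForMillis(chunkMillis);
    }

    /** Size in bytes of the given duration of audio in the format above. */
    static int bytesForMillis(int millis) {
        return SAMPLE_RATE_HZ * CHANNELS * BYTES_PER_SAMPLE * millis / 1000;
    }

    /** Appends captured bytes and returns every complete chunk now available. */
    List<byte[]> push(byte[] captured) {
        pending.write(captured, 0, captured.length);
        byte[] all = pending.toByteArray();

        List<byte[]> chunks = new ArrayList<>();
        int offset = 0;
        while (all.length - offset >= chunkBytes) {
            chunks.add(Arrays.copyOfRange(all, offset, offset + chunkBytes));
            offset += chunkBytes;
        }

        // Keep the incomplete tail for the next call.
        pending.reset();
        pending.write(all, offset, all.length - offset);
        return chunks;
    }
}
```

With a 100 ms setting each chunk is 3,200 bytes, and a 20 ms setting gives 640 bytes. What I do not know is which size the Live API tolerates best when throughput is uneven, which is the core of the question.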

Any insights on making the audio stream more resilient to “slow-but-connected” networks would be appreciated!