Hey everyone,
I’m using the Gemini 2.5 Flash Native Audio model for a live voice chat feature in my app (real-time bidirectional audio via WebSocket/bidiGenerateContent).
I noticed that `gemini-2.5-flash-native-audio-latest` currently resolves to the `12-2025` preview version (confirmed by checking the model metadata via the API). There doesn’t seem to be a newer version available yet.
The issue I’m running into: despite the `12-2025` version label, the model’s knowledge appears to be cut off noticeably earlier, probably around late 2024 or early 2025. For example, when asked “Who is the current president of the United States?”, it confidently answers “Joe Biden”, which is obviously outdated.
A few questions for the community:
- Has anyone heard about an upcoming update to the native audio model? The regular `gemini-2.5-flash` (text) seems to have more recent knowledge.
- Is there a way to provide real-time context/grounding to the native audio model to compensate for the outdated knowledge? I’m already using Google Search as a tool for the text-based model but not sure if it’s supported in the bidiGenerateContent flow.
- Are others experiencing similar knowledge cutoff issues with the native audio preview?
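For reference, here is a rough sketch of the workaround I’m considering for the grounding question: injecting the current date and a few known facts through the system instruction of the initial setup message, and requesting the Google Search tool in the same message. This is untested against the native audio model. The helper name `build_setup_message`, the example fact, and the camelCase field names (`generationConfig`, `systemInstruction`, `googleSearch`) are my assumptions about the WebSocket JSON format, and I don’t know yet whether the native audio preview accepts the search tool.

```python
import json
from datetime import datetime, timezone

# Model alias from this post; the "models/" prefix is an assumption.
MODEL = "models/gemini-2.5-flash-native-audio-latest"


def build_setup_message(extra_facts, now=None):
    """Build the first (setup) message for a bidiGenerateContent session.

    Hypothetical helper: it only assembles a dict and never opens a socket.
    """
    now = now or datetime.now(timezone.utc)
    context_lines = [f"Today's date is {now:%Y-%m-%d} (UTC)."]
    context_lines += list(extra_facts)
    context_lines.append(
        "Prefer these facts over your training data when they conflict."
    )
    return {
        "setup": {
            "model": MODEL,
            "generationConfig": {"responseModalities": ["AUDIO"]},
            # Fresh context goes here so it applies to every turn.
            "systemInstruction": {"parts": [{"text": "\n".join(context_lines)}]},
            # Assumption: the search tool may or may not be honored in this flow.
            "tools": [{"googleSearch": {}}],
        }
    }


if __name__ == "__main__":
    message = build_setup_message(["(placeholder) a fact looked up by my backend"])
    # This JSON string would be sent as the first frame on the WebSocket.
    print(json.dumps(message, indent=2))
```

The idea is that even if the search tool turns out to be unsupported, the date and facts in the system instruction should still give the model something more recent to work from.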
Would appreciate any insights. Thanks!