Hi,
I am facing an issue with gemini-2.5-flash-native-audio-preview-12-2025 when using input audio transcription. The input audio is English speech, but the generated transcription is incorrect: it often returns Devanagari (Hindi) characters for English speech, or random, unreadable characters.
Here are the details:
- The input audio is clear, with minimal background noise.
- The transcription should be in English, but the output often contains characters from a different script (e.g., Hindi) or completely random characters that do not correspond to the spoken words.
- This issue persists regardless of the audio format or quality.
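To make the symptom measurable, here is a minimal sketch of how a transcript could be flagged when it comes back in the wrong script. It uses only Python's standard `unicodedata` module; the function names and the 0.2 threshold are my own illustrative choices, not anything from the Gemini SDK.

```python
import unicodedata


def non_latin_letters(text):
    """Return the alphabetic characters whose Unicode name does not start with LATIN."""
    return [
        ch for ch in text
        if ch.isalpha() and not unicodedata.name(ch, "").startswith("LATIN")
    ]


def looks_wrong_script(text, threshold=0.2):
    """Flag a transcript when too many of its letters are non-Latin.

    The threshold is an arbitrary assumption; tune it for your own data.
    """
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return False  # nothing to judge (empty or punctuation-only transcript)
    return len(non_latin_letters(text)) / len(letters) > threshold
```

Running something like this over each transcription chunk would give a rough rate of how often English speech comes back in another script.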
Steps taken:
- Added an instruction to the prompt to generate the transcription in English.
- Checked the audio quality to ensure clarity.
- Tried different audio formats (WAV, MP3) and bitrates without resolving the issue.
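For anyone who wants to compare audio parameters, this is a small sketch for reporting what a WAV file actually contains, using the standard `wave` module. My understanding from the Live API documentation is that input is expected as raw 16-bit little-endian PCM, natively at 16 kHz; treat that as an assumption to verify. The helper names are hypothetical.

```python
import wave


def describe_wav(path):
    """Read the WAV header and return its basic audio parameters."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "sample_rate_hz": rate,
            "duration_s": wav.getnframes() / rate,
        }


def is_16k_mono_pcm16(info):
    """True if the parameters are mono, 16-bit samples, 16 kHz (assumed target format)."""
    return (
        info["channels"] == 1
        and info["sample_width_bytes"] == 2
        and info["sample_rate_hz"] == 16000
    )
```

Calling `describe_wav("sample.wav")` (a placeholder file name) prints nothing by itself; it returns a dict that can be pasted into a report like this one.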
Has anyone else encountered similar problems, or does anyone have suggestions on how to improve transcription accuracy? Any advice on settings, configurations, or solutions would be greatly appreciated.
Looking forward to hearing your thoughts!
Thank you!