Gemini 2.5 Flash says it can't process audio, but why?

I was testing my work when Gemini 2.5 Flash suddenly responded in a very strange way, saying that it cannot process audio. At first I couldn't believe it, since audio is a basic data type that most Gemini models should handle. But after more testing, both via API calls and through the Google AI Studio UI, it really is TRUE. What!? Why can 1.5 Flash, 2.0 Flash Lite, and everything up to 2.5 Pro all understand audio EXCEPT 2.5 Flash (both of the released preview versions)? A rough sketch of the kind of API call I was making is below.
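For reference, this is roughly the kind of request that intermittently fails for me, using the google-genai Python SDK (the model name and file path here are placeholders, not my exact setup):

```python
# Rough sketch of an audio request to Gemini via the Files API.
# Model name and file path are placeholders for illustration only.
from google import genai

client = genai.Client(api_key="YOUR_API_KEY")

# Upload the audio clip, then ask the model about it.
audio_file = client.files.upload(file="clip.m4a")

response = client.models.generate_content(
    model="gemini-2.5-flash-preview-05-20",  # placeholder preview model name
    contents=["Transcribe this audio clip.", audio_file],
)

# Sometimes this returns a transcript, sometimes a reply that it
# "can't process audio".
print(response.text)
```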


Update: sometimes it says "Yes, I can" and sometimes it says "No, I can't". Here's what I think: even though the Gemini models are advertised as having native multimodal processing, I suspect they actually call other internal tools on Google's side to provide this "multimodal" capability (the outcome still counts as a genuinely multimodal-capable AI model, just not so "native" to me). If the video/image processing or audio processing backend is down, the model loses the ability to "understand" the corresponding data type. What do you think?

Hey @NguyenfromVN, I have been unable to reproduce the audio processing problem you encountered. According to the official documentation, the m4a format isn't supported.
Could you try using a different MIME type and see if the issue persists? A rough sketch follows.
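If it helps, here's a minimal sketch of sending the audio inline with an explicitly documented MIME type such as audio/mp3 instead of m4a (the file path and model name below are placeholders, not values from your setup):

```python
# Sketch: pass the audio inline with an explicit, documented MIME type.
# File path and model name are placeholders for illustration only.
from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_API_KEY")

with open("clip.mp3", "rb") as f:
    audio_bytes = f.read()

response = client.models.generate_content(
    model="gemini-2.5-flash-preview-05-20",  # placeholder preview model name
    contents=[
        "Describe this audio clip.",
        types.Part.from_bytes(data=audio_bytes, mime_type="audio/mp3"),
    ],
)
print(response.text)
```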

Hi, I think I know the reason. It's not about the m4a file; Gemini 2.5 Flash does understand that one. Rather, sometimes it says it can process audio and sometimes it says it can't, which suggests something is going wrong on Gemini's side: its audio processing is occasionally not ready to use, which produces these responses. That also makes the issue inconsistent and hard to reproduce. Thanks for your response. I hope the Google team can improve the availability of the internal audio processing; that's all I wish for. Have a nice day!