Gemini 2.5 Flash audio response latency doubled / tripled after 3.0 Pro release?

Anyone experiencing this?

Our use case is non-streaming audio. There has been no code change on our side, and the config has the thinking budget set to 0. I’m not sure of the exact date the regression began, but it started in late November, right after the 3.0 Pro release.
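For anyone who wants to compare numbers, here is a minimal sketch of how this kind of latency could be timed. It is not our production code. It assumes the `google-genai` Python SDK, an API key in the environment, a WAV input, and a placeholder prompt; `time_calls` and `make_audio_request` are hypothetical helper names made up for this example.

```python
import statistics
import time
from typing import Callable, Dict, List


def time_calls(call: Callable[[], object], runs: int = 5) -> Dict[str, float]:
    """Invoke `call` `runs` times and summarize wall-clock latency in seconds."""
    if runs < 1:
        raise ValueError("runs must be >= 1")
    samples: List[float] = []
    for _ in range(runs):
        start = time.perf_counter()
        call()
        samples.append(time.perf_counter() - start)
    return {
        "min": min(samples),
        "median": statistics.median(samples),
        "max": max(samples),
    }


def make_audio_request(
    audio_path: str, model: str = "gemini-2.5-flash"
) -> Callable[[], object]:
    """Return a zero-argument callable that sends one non-streaming audio request.

    Assumption: the google-genai SDK is installed and an API key is set in the
    environment. The SDK is imported lazily so time_calls works without it.
    """
    from google import genai
    from google.genai import types

    client = genai.Client()
    with open(audio_path, "rb") as f:
        audio_bytes = f.read()

    # Thinking budget set to 0, matching the config described above.
    config = types.GenerateContentConfig(
        thinking_config=types.ThinkingConfig(thinking_budget=0),
    )

    def send() -> object:
        return client.models.generate_content(
            model=model,
            contents=[
                "Transcribe this audio.",  # placeholder prompt (assumption)
                types.Part.from_bytes(data=audio_bytes, mime_type="audio/wav"),
            ],
            config=config,
        )

    return send
```

Running something like `time_calls(make_audio_request("sample.wav"), runs=5)` (with `sample.wav` standing in for a real file) a few times a day would give min/median/max figures that are easier to compare than a single request.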

Can Google folks take a look? Thanks!


Hey @Richard_Wong,
Yes, we are experiencing the same behavior. When we send all the audio at once at end_of_speech, the response is noticeably delayed.

In this post I shared a small workshop where I explain the issues, pros, and cons of using audio streaming vs. end_of_speech, in case it helps anyone as an example or reference while debugging this behavior:

Hope it’s useful!

I’m experiencing exactly the same issue. Text input works perfectly; only audio inputs are affected.

The instability is so severe that I’m seriously considering switching to OpenAI’s API instead.

Hi @Richard_Wong, apologies for the delayed response.

My understanding is that you are trying to get an audio response for your query using the 2.5-flash model and are running into latency issues.

To understand the issue better, could you please elaborate on what you are trying to achieve and possibly share a snippet of code, so that we can take a look?

Thank you!