Slow response from Gemini 2.0 Flash Experimental

Vo_Tu_Duc · December 13, 2024, 1:20pm

It takes 190s to finish the task (image is for your reference). I personally think that is really slow. Note that I’ve changed the model from Gemini 1.5 Flash to Gemini 2.0 Flash Experimental. The conversation already contained my prompts and the model’s responses before I switched models.

If switching between models is the reason it is slow, then users should not be allowed to switch models in an existing conversation.

afirstenberg · December 13, 2024, 2:28pm

Welcome to the forums!

Switching, itself, is unlikely to be the problem.
However, the model is pretty popular, so it seems more likely that you’re just seeing load.

My tests so far have suggested that Gemini 2.0 is significantly faster.

Vo_Tu_Duc · December 13, 2024, 3:06pm

Thanks for your response.

Yes, my personal experience is that Gemini 2.0 Flash is faster than Gemini 1.5 Flash.

But I don’t know why it is so much slower when I switch from 1.5 to 2.0 in an existing conversation with prior prompts.

Tze_Yong_Tan · December 16, 2024, 8:34am

Hey, thanks for raising this. I’ve got the same issue here. Have you managed to sort it out? Is this happening because it is trending now and too many people are using it, causing the resources to be overloaded?

I agree that the response was faster (I used it when it first launched, and everything was faster than Gemini Flash 1.5), but today I realized it is much slower when calling the API in Python.

Meiji_Wong · December 16, 2024, 8:53am

I tested a ~60,000-token Chinese-context Q&A in Google AI Studio. Here are the times for the whole response to complete (not time to first token):

1.5-flash: 20s
gemini-1.5-pro: 50s
1.5-flash-8b: 11s
2.0-flash-exp: 130s
gemini-exp-1206: 103s

I am not sure, but it seems the exp models are not yet optimized for long context.
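For anyone who wants to reproduce this kind of comparison from Python, here is a minimal sketch that separates time to first chunk from time to the full response. It is not tied to any particular SDK: `time_stream` and `fake_stream` are hypothetical names, and `fake_stream` is only a stand-in for a real streaming model call, so swap in whatever streaming call your client library provides.

```python
import time
from typing import Callable, Iterable, Iterator, Tuple


def time_stream(start_stream: Callable[[], Iterable[str]]) -> Tuple[float, float, str]:
    """Start a stream and consume it.

    Returns (seconds to first chunk, seconds to full response, full text).
    The timer starts before start_stream() is called, so request setup
    time is included in both measurements.
    """
    start = time.perf_counter()
    first = None
    parts = []
    for chunk in start_stream():
        if first is None:
            first = time.perf_counter() - start
        parts.append(chunk)
    total = time.perf_counter() - start
    if first is None:
        # Empty stream: no chunk ever arrived, so report the total instead.
        first = total
    return first, total, "".join(parts)


def fake_stream(delay: float = 0.01, n: int = 3) -> Iterator[str]:
    """Hypothetical stand-in for a streaming model response."""
    for i in range(n):
        time.sleep(delay)
        yield f"chunk{i} "


if __name__ == "__main__":
    first, total, text = time_stream(lambda: fake_stream())
    print(f"first chunk: {first:.3f}s, full response: {total:.3f}s")
```

Reporting both numbers makes it easier to tell whether a slow run is waiting on the first token (queueing or long-context prefill) or on generating the rest of the output.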

SuesiTran · February 28, 2025, 4:56am

Same problem here: using gemini-2.0-pro-exp-02-05 takes 74 seconds, while gemini-2.0-flash takes 20 seconds.

Although Flash is much faster, I’m expecting it to respond within 3 seconds. Is there any way to improve the response time?

Steven_W · March 1, 2025, 1:39pm

Yes, once you start getting into larger token counts, things will slow down considerably. My current project is around 700,000 tokens and it is almost unusable at this point…
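If older turns are not all needed, one possible mitigation is to send only the most recent part of the conversation. The sketch below is an illustration under stated assumptions, not something anyone in this thread has tested against Gemini: `trim_history` and `rough_token_count` are hypothetical helpers, and the 4-characters-per-token estimate is a crude guess that will be off for many languages (Chinese in particular). A real token counter can be passed in as `count` instead.

```python
from typing import Callable, List


def rough_token_count(text: str) -> int:
    """Crude estimate, assuming about 4 characters per token.

    This is an assumption for illustration; real tokenizers differ.
    """
    return max(1, len(text) // 4)


def trim_history(
    turns: List[str],
    budget: int,
    count: Callable[[str], int] = rough_token_count,
) -> List[str]:
    """Keep the most recent turns whose combined estimated size fits the budget.

    Walks backwards from the newest turn and stops at the first turn
    that would push the running total over the budget.
    """
    kept: List[str] = []
    used = 0
    for turn in reversed(turns):
        cost = count(turn)
        if used + cost > budget:
            break
        kept.append(turn)
        used += cost
    kept.reverse()  # restore chronological order
    return kept


if __name__ == "__main__":
    history = ["a" * 40, "b" * 40, "c" * 40]  # about 10 estimated tokens each
    print(len(trim_history(history, budget=25)))
```

Whether dropping old turns is acceptable depends on the project; for a codebase or document that must be read in full, this will not help.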

Xavier_Morrison · March 1, 2025, 3:55pm

I was on my Xbox and it’s a little slow typing on there. The API key found on this site was hacked by the same developer group that hacked Microsoft and many other platforms, trying to get free shit and steal crypto.

Xavier_Morrison · March 1, 2025, 3:56pm

Gained access to it through my Xbox somehow.

OrangiaNebula · March 1, 2025, 4:10pm

Welcome to the forum.
You can (in fact, you should) disable the API key that is compromised, and then simply make a new one for yourself.

Xavier_Morrison · March 1, 2025, 4:18pm

I have been trying, but it gave me a ucurl error.

Xavier_Morrison · March 1, 2025, 4:20pm

I’m not an expert coder; I just have a few ideas I would like to get off the ground.
