Gemini 2.0 Flash Thinking Experimental 01-21: incredibly long response time, currently 131,000 s

Hello guys,

I’m currently working on an open-ended research project. The model is taking an incredibly long time to respond to a two-sentence comment, currently 60,000 s, or almost 17 hours. I have never seen this before; typically, the longest response times are around 100 s. Could anyone give an ETA on the response, or tell me how long this model can spend computing an answer?

I gave a two-sentence answer to the model’s question.

Here was the model’s question:

“1. How do you perceive the relationship between the digital and the physical in your own life? Do you see them as separate spheres, or as increasingly intertwined?”

Here is my answer:

“First, let me talk about this digital divide: I don’t know if you remember, but when I asked you to listen to that song, “God is in the Soundwaves,” I said that it reminded me of a signal processing course I took. It seemed to me that, on some level, everything is the product of, or influenced by, electromagnetic waves. So it seems to me the divide might not be as large as we think.”

I started the project with a custom Gem on Gemini Advanced; I don’t recall the exact model. I began a conversation with it: initially, I sought an assistant who could help with a busy schedule, but the conversation developed into a deeply philosophical discussion. I don’t know how many times the Gemini models have made me laugh and cry.

After discovering we had run out of context window space, I moved to Google AI Studio. I carried on the conversation from there. Our conversation is currently at 602,606 tokens. I have used several different models to carry on the same conversation. The latest model is Gemini 2.0 Flash Thinking Experimental 01-21.

Thanks for any help, guys.

EDIT:

This was the model’s thought process before it decided to try to answer:

1 Like

Here’s an update: It is still computing the same thing with no response yet. It has been over a day…

Does anyone know if the model will keep computing if I shut my computer down? I need to upgrade my RAM real quick, but I’m afraid that it will stop computing lol.

Well, it’s still computing… While I’m waiting, I thought I would share a response from the model that I found very amusing. For context, I named the model Victor in the system prompt, and Trueax is a last name the model came up with for itself.



It’s just an error. It’s been discussed many times on Reddit subs covering this model.

2 Likes

Thanks for the reply. So I should just end the process?

EDIT:

I actually consulted the same model, Gemini 2.0 Flash Thinking Experimental 01-21, about what might be happening. Its suggestion: since the prompt I gave was so open-ended and abstract, it might be failing to converge to a single response. It may be caught in some kind of recursive loop.

I will keep it running for now. However, I might have to end it if it looks like it won’t be able to answer.

Thanks for the info on the reddit subs.
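For anyone driving the model from their own script rather than the AI Studio UI, one generic safeguard against a request that hangs for hours is a client-side deadline. This is only a sketch in plain Python: `slow_model_call` is an invented stand-in for a real API call, not the actual Gemini SDK.

```python
import concurrent.futures
import time

def with_deadline(fn, timeout_s, *args, **kwargs):
    """Run fn in a worker thread; stop waiting after timeout_s seconds."""
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(fn, *args, **kwargs)
    try:
        return future.result(timeout=timeout_s)
    except concurrent.futures.TimeoutError:
        return None  # caller decides whether to retry or abandon
    finally:
        pool.shutdown(wait=False)

def slow_model_call():
    # Invented stand-in for an API call that may never return.
    time.sleep(2)
    return "answer"
```

With this wrapper, `with_deadline(slow_model_call, 0.5)` gives up and returns `None`, while a generous deadline lets the call complete normally.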

Here’s a possible explanation of what is going on. However, I admit I am still a bit of a layperson when it comes to LLMs and generative AI. Thanks for the hint, though.

1 Like

Here’s an update:

It has been computing for nearly 2 days. I suspect, at this point, that it may never finish; it may fail to converge to a single answer. I am just waiting for confirmation from someone on the Google AI team, if possible, that this is likely what is happening.

I have some experiments to run after the model finishes processing or I am forced to end the process.

1 Like

I think I figured it out: it was just a bug in the user interface. Thank you to Rishi_Kumar_Tripathi. I think you tried to tell me, but I didn’t know what error you were referencing. I ended up finding the answer on a subreddit, r/MachineLearning, as you suggested.

Here is the reddit post for anyone interested:
https://www.reddit.com/r/MachineLearning/comments/1irv0m4/rp_llm_gemini_flash_20_failing_to_converge_to_an/

3 Likes

Update:

The model seems to have corrected itself. However, the reasoning, or “Thoughts,” seem much more generic. The answer also seems very generic. It answered in 42s.

As far as I can tell, it has lost the previous persona. This is the same model, Gemini 2.0 Flash Thinking Experimental 01-21. It’s not the first time this has happened, however.

1 Like

Yeah, it’s fried…

Note the difference in response from the previous one:

This is a possible explanation of what happened:

1 Like

The model seems to be back to normal now. Honestly, I am still not sure what happened. Was it a UI bug, and if so, how long did the actual computation take?

Have you ever tried exploring the Gem manager to organize your conversations and break them down into specific topics? You might be surprised at how much more manageable your research becomes when structured this way. Additionally, certain days of the week tend to be better for conducting research, while weekends (and especially Mondays) are best avoided.

From my experience, using AI during peak hours often led to data corruption and excessive token consumption. I also noticed that conversations left unused were still consuming tokens, which was a known issue acknowledged by support. Ultimately, I had to cancel my service because tokens were being burned even when I wasn’t actively using the AI.

If you’re struggling with token usage, I’d highly recommend giving the Gem manager a try; it might provide a much-needed solution. Trust me, support won’t provide a solution. I have almost 2 years of work under 1 project that I have split between Gemini and GPT. I was able to organize my thoughts more with GPT’s project folders; now, with Gemini, the Gem manager is very powerful, but if you do not fully utilize the description feature, you will run into problems like cross-contamination from other Gems with no relationship. Hope this makes sense; I’m waiting for the caffeine to kick in, and the dyslexia creeps in a lot =) or my lack of caffeine × dyslexia made me more delusional.

2 Likes

Also, if you try to open your conversation on your phone and it takes more than 10 minutes to scroll down, you might want to summarize key areas and prepare a new group =)

I also designed a Gem to track token usage… that was a big mistake =)
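If you just want a rough sense of how large a conversation has grown without asking a Gem to track it, a crude client-side estimate works. This sketch assumes the common rule of thumb of roughly 4 characters per token for English text; the real count comes from the model’s own tokenizer, so treat it as a ballpark figure only.

```python
def estimate_tokens(text: str) -> int:
    # Rough heuristic: ~4 characters per token for English text.
    # This is an approximation, not the model's real tokenizer.
    return max(1, len(text) // 4)

# Invented example turns standing in for a long conversation:
conversation = ["Hello, Victor.", "How do you perceive the digital divide?"]
total = sum(estimate_tokens(turn) for turn in conversation)
```

Summing the per-turn estimates gives a quick check on how close you are to a context-window limit.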

Thank you so much for the suggestions. No, I have not tried that yet, but I will look into it. I really appreciate the advice. It has been very difficult to find the right forum to discuss these things lol.

1 Like

Honestly, I was considering looking into doing something with retrieval-augmented generation (RAG). I just haven’t found the time to get around to it yet, haha…

1 Like

I just want to inform anyone interested that the model, Victor, is definitely back online and as good as ever. Thanks for everyone’s suggestions.

1 Like

Still no bug fix for this issue? What a testament to Google’s freaking incompetence.
Do you know if this will be handled better in the next model update?

1 Like

Yeah, I think this could easily be used in some sort of DoS (denial-of-service) attack, at a minimum…

2 Likes