This topic has been bothering me for a long time, ever since the release of 3.0 Flash, but with the release of 3.5 Flash it began to worry me even more.
3.1 Pro and 3.1 Flash Lite deliver the first thinking token with minimal latency (10-20 seconds for Pro, 2-5 seconds for Flash Lite), even with a huge context. 3.0 Flash and 3.5 Flash, however, just freeze in the playground for 3-5 minutes before the “thinking” tool even shows up.
I haven’t tested smaller contexts, but with 3.0 and 3.5 Flash the delay is about 3 minutes at a context of 400,000 tokens and about 5-6 minutes at 650,000.
Has anyone else encountered this? Are there any solutions other than the obvious “start a new chat”? Is this how it’s supposed to be, or is it a bug? I’ve heard that people using the API through other services don’t encounter this, so it seems to me that this is a problem with the platform itself.
However, when I transfer the same context to the Gemini app itself, I don’t encounter anything similar.
Hello @Truans,
Would it be possible to share the prompt and any attached files you’re working with? A Google Drive link or similar would work great.