AI Studio crashing - millions of DOM <span> nodes

AI Studio crashes on very large prompts. The issue is that someone decided to wrap every single letter in its own <span>, and on top of that left an HTML comment node next to each one.

This generates MILLIONS of DOM nodes, which degrades the performance of AI Studio until the browser crashes. One can work around it by manually deleting the nodes for what has just been prompted in; the app then runs smoothly.

Please fix this, as it is an obvious rookie mistake by whoever decided it would be a good idea to render each letter of the prompt as a separate <span>. I understand it was done to “animate” the typing, but there are far better ways to do it. Alternatively, the end user could toggle a “performance” mode that disables the unnecessary animation and simply waits for the answer to complete.

7 Likes

People have been complaining that AI Studio gets progressively more sluggish, and enterprising developers have even located the source of the problem. @Logan_Kilpatrick

Hey @elkolorado, can you share an example of the type of prompt you are using? Is it a file upload, or are you copying and pasting text directly into AI Studio? I tried both with ~120,000 tokens and it keeps working for me without any crashes. If you have an example, that would help a ton!

2 Likes

I am using lore knowledge from the MMORPG Tibia. I start with all of the books inside the game, which is around 400k tokens, pasting them directly into the prompt as an array of strings without spaces.

Then, as the conversation goes on, it gradually goes from slow, to unacceptable, to eventually crashing the browser. Each time I manually write a question about the pasted content, and each time Gemini answers, every letter of that response is rendered as its own node, with an extra comment node in the DOM, as visible.

If, after inserting the prompt, I manually delete the DOM subtree that contains it (via dev tools), the browser won’t crash, since that essentially removes all of the mess.

This is what the DOM of another, 40k-token prompt looks like: 168k span elements inside the DOM.
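For anyone who wants to reproduce the count on their own session, a one-liner in the dev-tools console does it (a generic snippet, not the one from the screenshot):

```js
// Count every <span> currently in the document.
console.log(document.querySelectorAll('span').length);
```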

If you chat with Gemini for a long time, manually writing everything, each letter of each prompt will be wrapped in a <span> with a class.

Each time you update the DOM with a new node carrying a new class, all the CSS in the document has to be re-evaluated, causing tons of memory to be used.

Not to mention that it is unacceptable for an end user to end up with 168k span nodes for a few questions, in a product that advertises a 1kk (1M) token window. 168k span nodes were generated for just 40k tokens.

If I take all the books inside the game (400k tokens) and add half of the in-game NPC knowledge (totaling around 800k tokens), paste those 800k tokens and then ask one or two questions, it just crashes faster. In other words, the longer the pasted text, the faster the crash.

The solution is obvious: fix it so it doesn’t generate millions of span nodes. Keep it simple; there is no need to wrap each letter of a word in a span. Animating text in HTML can be done with just CSS:

 animation: typing 3s steps(30, end), blink 0.5s 1;
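For context, the declaration above is the standard CSS “typewriter” trick; a minimal self-contained sketch around it (the class name, 30ch width, and timings are illustrative, not AI Studio’s actual styles) looks like:

```css
.typewriter {
  width: 30ch;               /* final width, in characters */
  overflow: hidden;          /* hide the not-yet-"typed" text */
  white-space: nowrap;       /* keep the text on one line */
  border-right: 2px solid;   /* the caret */
  animation: typing 3s steps(30, end), blink 0.5s 1;
}

/* Reveal the text in 30 discrete steps, one per character. */
@keyframes typing {
  from { width: 0; }
  to   { width: 30ch; }
}

/* Blink the caret by hiding the border halfway through. */
@keyframes blink {
  50% { border-color: transparent; }
}
```

One element, two keyframe rules, zero extra DOM nodes per letter.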
1 Like

Flagged this to the team so we can take a look!

1 Like

Hi! I have a similar issue. My prompt input is 10 medical papers, around 150K tokens in total. Gemini 1.5 Pro didn’t crash, but it took more than a minute to respond to every subsequent question.

1 Like

I wanted to open another topic to air this. It’s a lot… it starts with a slow, glitchy browser and ends with the browser crashing. I’m on Linux and not badly off for resources… I pity a Windows user who pastes in long, token-heavy data. My browser wouldn’t stop crashing, and eventually I gave up on the task. The interface is REALLY buggy!

1 Like

Welcome to the forum. It’s been flagged. We got past the ‘it works for me’ part. Hopefully the right people will get to fix it and give @elkolorado a decent amount of brownie points for debugging it for them.

2 Likes

It’s been 8 days; has your team provided any initial feedback? I opened a thread on this before I knew about this one. I explained my use case, which is fairly straightforward and simple, and yet there is unbearable lag and slowness at low token counts. This has been happening to me for a month now, but it has become worse since the recent updates to AI Studio (Gemini Flash, etc.).

1 Like

Followed up with the team on this. Hang tight! Will update when I have more info.

1 Like

Hey @Tim_Wu, this is to be expected: time to first token, and response time more broadly, goes up as more tokens are added. It can also depend on server load, which varies by time of day.

1 Like

Closing the loop on this, we pushed some changes which should fix the span issue and a few other bugs. Please let us know if you run into any other issues.

4 Likes

Appreciate the quick response by the team!

Hey there, great to hear. I will check on both Windows and Mac and let you know.

thanks for the update - is this live now, or shall we wait for release?

I see improvement: spans aren’t generated for each letter anymore, although performance is still slow, and it still crashes on large prompts as mentioned above.

Furthermore, although the spans are gone, there is still an issue with comment nodes.

Somehow, your frontend generates empty comment nodes, and those affect browser performance and memory usage as well.

For a simple 10-question discussion, over 1.5k empty comment nodes are generated and left occupying the DOM (the comment nodes are visible in the original post). Below is an example walker that lists all the comment nodes in your frontend app.

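A sketch of such a walker, using the standard document.createTreeWalker API with NodeFilter.SHOW_COMMENT (the exact snippet in the screenshot may have differed):

```js
// Walk the whole document and count every comment node.
const walker = document.createTreeWalker(
  document.body,
  NodeFilter.SHOW_COMMENT, // visit comment nodes only
);

let count = 0;
let node;
while ((node = walker.nextNode())) {
  count++;
  // console.log(node); // uncomment to inspect each comment node
}
console.log(`${count} comment nodes in the DOM`);
```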

1 Like

Flagged to the team to look into this more, thanks for reporting!

1 Like

Hey there, just to confirm, there is still lag. I have tested on Windows 11 with Google Chrome: after 7,900 tokens and two responses from Gemini Pro, I get lag while typing. @elkolorado, can you uncheck the solution flag for this post, as it is not solved yet?

Sure thing. Later that day I tested again, and it was also lagging and crashing. I tried to investigate the issue with memory profiling, though I am not very experienced with it. It showed that whenever input is rendered, a single large task of 50-80 ms runs, and that is what generates the lag: it triggers layout, repaint, and more. 80 ms for a DOM update points to a severe issue. There must be some suboptimal logic in the DOM manipulation and in the prompt array that stores the data. It certainly means that each time the DOM is manipulated, multiple events fire, and probably not all of them should.
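One way to confirm such tasks without digging through the profiler is the browser’s Long Tasks API via PerformanceObserver (a minimal sketch; 50 ms is the threshold the browser itself uses to define a “long task”):

```js
// Log every task longer than 50 ms as it happens.
const observer = new PerformanceObserver((list) => {
  for (const entry of list.getEntries()) {
    console.log(`Long task: ${entry.duration.toFixed(0)} ms`, entry);
  }
});
observer.observe({ entryTypes: ['longtask'] });
```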

Think about it: if you built an MVP chat app from scratch, typing your prompt into an input/textarea, sending it, and then asynchronously writing the response back into the DOM would be smooth as butter.

Just to be on common ground: the backend request is obviously just a promise you await until it is done, and it has no connection to frontend performance.

Then you can implement clever virtualized rendering for dealing with 1kk-token prompts, e.g. rendering the already-answered Q&A only as you scroll the chat up/down, so you don’t generate an enormous number of DOM elements that cause severe lag on every repaint when new elements are added. I’ve seen this technique handle scrolling through 1kk HTML table rows, and it was smooth. A sketch of the idea follows below.
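A minimal sketch of that windowing idea (it assumes fixed-height rows for simplicity; a real chat UI would need measured, variable heights, and the function name and parameters here are illustrative):

```js
// Minimal list virtualization: only the rows visible in the viewport exist in
// the DOM; a tall spacer element preserves the scrollbar for everything else.
function virtualize(container, rows, rowHeight = 24) {
  const spacer = document.createElement('div');
  spacer.style.cssText = `height:${rows.length * rowHeight}px;position:relative`;
  container.style.overflowY = 'auto';
  container.appendChild(spacer);

  function render() {
    const first = Math.floor(container.scrollTop / rowHeight);
    const last = Math.min(
      first + Math.ceil(container.clientHeight / rowHeight) + 1,
      rows.length,
    );
    spacer.replaceChildren(); // drop all off-screen rows
    for (let i = first; i < last; i++) {
      const row = document.createElement('div');
      row.textContent = rows[i];
      row.style.cssText =
        `position:absolute;top:${i * rowHeight}px;height:${rowHeight}px`;
      spacer.appendChild(row);
    }
  }

  container.addEventListener('scroll', render);
  render();
}
```

Rebuilding the visible window on every scroll event is crude, but it keeps the node count bounded by the viewport height rather than by the transcript length.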

I assume the trouble arose because this is a really specific use case of JS, dealing with an enormous amount of text (and it certainly appears only once the total token count goes up, not for a simple chat question), so maybe the issue is entangled with the chosen frontend framework. If you really think about it, vanilla JS(/TS) would perform incomparably better in this case, since you have total control over it and don’t have to deal with framework-specific problems like how elements get mounted into the DOM.

bruh, you should be Google’s technical advisor. Hope you are doing something good wherever you are, because you are a hard-to-find talent.