Why do Gemini 2.0 Flash models struggle to parse information from an HTML page?

Cryp70 · February 11, 2025, 6:39am

I’ve been trying to get Gemini 2.0 Flash and Flash Thinking to read an HTML page and the URLs embedded in it, but it struggles to answer my questions about the information correctly, even though the info is on the page. It says it can see the URLs, and it also says it is clicking them to double-check, but it still gets wrong results. For example, I ask it to give me the number after “/editions/” in the following HTML, which contains a URL for “Dricus du Plessis”. It might get that one correct and return 429, but for others it gets it completely wrong and finds numbers that don’t even exist on the page. Is there a way of improving the output so there are no errors?

Cryp70 · February 11, 2025, 9:54am

I’ve never used a dumber AI, for real. I wasted hours getting the wrong results every time and having to spoon-feed it the correct answers.

Cryp70 · February 11, 2025, 10:27am

OK, so Gemini Pro even failed to read the HTML of the page correctly when I gave it the URL as well. It did work once I copied the full source code of the page and pasted that for the AI to use instead of just giving it the URL. So why can’t it read an HTML page correctly without my having to do that? It makes out like it can read the page, only to return hallucinated results.
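A minimal sketch of the workaround described above: fetch the page source yourself and put it in the prompt, rather than handing the model only a URL. The helper names `fetch_html` and `build_prompt`, the prompt wording, and the `<html_source>` wrapper are assumptions for illustration, not anything from the Gemini docs, and the actual model call is left out.

```python
import urllib.request


def fetch_html(url: str, timeout: float = 10.0) -> str:
    """Download the raw page source (hypothetical helper, stdlib only)."""
    request = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def build_prompt(html: str, question: str) -> str:
    """Embed the full source in the prompt so the model does not need to fetch anything."""
    return (
        "Answer using ONLY the HTML below. "
        "If the answer is not in the HTML, say so.\n\n"
        f"<html_source>\n{html}\n</html_source>\n\n"
        f"Question: {question}"
    )
```

The string returned by `build_prompt(fetch_html(url), question)` would then be sent to the model as plain text, which matches the copy-and-paste approach that worked here.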

Cryp70 · February 11, 2025, 10:40am

At least the AI owns its stuff-ups

…seriously though, if it is able to read part of a page when supplied a URL, it obviously has the capability of reading the entire HTML, period. It’s just making a mess of it, completely fabricating fake results unless I give it the entire source code of the page myself. What’s up with that? Either do it properly or say you can’t do it; don’t send a bunch of weird stuff that isn’t even on the page… wack

Jami_Bailey · February 11, 2025, 8:33pm

Not sure about your specific example, but I scrape pages all the time and have them processed by Gemini with incredible accuracy. I scrape them myself, though, and feed Gemini the data directly rather than having Gemini look at the page (please elaborate on specifically what you mean by this, as I believe this is more than likely where your problem is).
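For the specific task in the original question (the number after “/editions/” for a named link), the scrape-it-yourself approach can be sketched with the Python standard library alone, so the value is extracted deterministically before anything reaches the model. The class name, the helper `edition_numbers`, and the sample HTML are hypothetical; the real page's markup may differ.

```python
import re
from html.parser import HTMLParser


class EditionLinkParser(HTMLParser):
    """Collect (link text, number) for every <a> whose href contains /editions/<digits>."""

    def __init__(self):
        super().__init__()
        self.results = []
        self._number = None  # edition number of the <a> currently open, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href") or ""
            match = re.search(r"/editions/(\d+)", href)
            self._number = match.group(1) if match else None
            self._text = []

    def handle_data(self, data):
        if self._number is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._number is not None:
            self.results.append(("".join(self._text).strip(), int(self._number)))
            self._number = None


def edition_numbers(html):
    """Map link text to the number after /editions/ (hypothetical helper)."""
    parser = EditionLinkParser()
    parser.feed(html)
    parser.close()
    return dict(parser.results)


# Assumed markup for illustration only.
sample = '<ul><li><a href="/editions/429">Dricus du Plessis</a></li><li><a href="/about">About</a></li></ul>'
print(edition_numbers(sample))  # {'Dricus du Plessis': 429}
```

With the numbers pulled out this way, the model only needs to work on a small, already-verified table instead of the whole page.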

Jake_Carr · February 13, 2025, 5:27pm

I’ve been trying it with a complex prompt that provides some context formatted in XML, and it doesn’t do a good job handling that either. Otherwise I have been impressed with it.
