Hi everyone,
I’m developing an Android app prototype to recognize LEGO sets from a photo of the box or the assembled set.
The prototype works well, but mostly with older LEGO sets. With newer sets, recognition becomes inaccurate or fails entirely.
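For context, here's a simplified sketch of the kind of Gemini API call I mean — the photo is inlined as base64 in a `generateContent` request. The model name, prompt wording, and endpoint version below are placeholders rather than my exact code:

```python
import base64
import json

API_KEY = "YOUR_API_KEY"          # placeholder, not a real key
MODEL = "gemini-1.5-flash"        # placeholder model name

def build_request(image_bytes: bytes, prompt: str) -> dict:
    """Build a generateContent payload with the photo inlined as base64 JPEG."""
    return {
        "contents": [{
            "parts": [
                {"text": prompt},
                {"inline_data": {
                    "mime_type": "image/jpeg",
                    "data": base64.b64encode(image_bytes).decode("ascii"),
                }},
            ]
        }]
    }

url = (f"https://generativelanguage.googleapis.com/v1beta/"
       f"models/{MODEL}:generateContent?key={API_KEY}")
payload = build_request(
    b"<jpeg bytes>",  # in the app this is the camera photo
    "Which LEGO set is shown in this photo? Reply with the set number and name.",
)
# The app POSTs `payload` as JSON to `url` and parses the text candidate
# from the response.
print(json.dumps(payload)[:60])
```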
However, I’ve noticed that if I upload the same images to Google Search AI Mode in the browser, it recognizes the sets correctly.
My question is: shouldn’t the Gemini API use the same visual recognition capabilities as Google’s AI Mode?
Are there known differences between AI Mode in Search and the Gemini API?
Any suggestions on how to improve recognition accuracy, or which model would be better suited for visual-matching tasks like this?
Thank you very much!