So I made an app to judge your submission. It uses the images to judge your execution, so it doesn’t have to watch the entire video. Include the three best screenshots of your submission when you create one.
For creativity it’s also factoring in existing projects, both those submitted and those released in the market at large - although I’m still working on this.
The project I used was just one I really liked. If it’s yours, I can remove it. Also, the scores in the screenshot are stale, from earlier in development.
I adjusted the algorithm to follow the competition’s rubric, and to check whether there are images before deciding on the software practices score. Every project will get rejudged when a major update like this happens.
Hey man,
Great work! I had a question - does it factor in the other submissions as well and give a relative score of sorts? Or is it completely just rubric based in an objective manner?
Curiosity got the better of me: I registered, filled out the form, and paid $1.50 for a submission, but I cannot see my project - just “You have no submissions. Add one!” Can you check on it? This is my project; the only unusual thing is that I created three composite screenshots of the app (I screenshotted 9 frames from the YT video). Did the submission and the Stripe payment arrive at the back end?
I tried one more submission, this time from Chrome (the other was from Firefox) - same thing. The submission purchase opens a new window and the transaction goes through, but the original submission selection matrix window remains as it is after the transaction completes, and the Link page remains open as well. One thing to note is that my Stripe account uses my Outlook email, not the Gmail one (I have production apps - an investment projection web app - that use Stripe, so I have accounts), but that shouldn’t matter.
I have a retriever that fetches similar entries, though I’m still working on it. It first gives a score based on the rubric, then goes through and fine-tunes creativity and execution further. Creativity will factor in all similar entries.
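A minimal sketch of how that retrieve-then-adjust step could look, assuming cosine similarity over precomputed embeddings; the function names, the penalty factor, and the corpus shape are all hypothetical, not the actual implementation:

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieve_similar(query_emb, corpus, top_k=3):
    """Return the top_k prior entries most similar to the query embedding."""
    ranked = sorted(corpus, key=lambda e: cosine(query_emb, e["emb"]), reverse=True)
    return ranked[:top_k]

def adjust_creativity(base_score, similarities, penalty=5.0):
    """Dampen the creativity score when a very similar prior entry exists."""
    max_sim = max(similarities, default=0.0)
    return max(0.0, base_score - penalty * max_sim)
```

A real version would embed the descriptions with an embedding model and store them in a vector DB; here the embeddings are just plain lists to keep the sketch self-contained.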
I’m curious about the technical details: what modalities do you use for grading (is it only text + images, or is there a way to also factor in the video), and how do you prompt?
Good question - yes, text and images. I was also going to use the YouTube API to get the video transcript, but it really doesn’t offer more than the images and text do, since it is inherently just more text (if any - some videos have no sound). The juice just doesn’t seem worth the squeeze, and there is also more time involved if I build this video feature. The app is also not very popular right now; if more people don’t submit within the week, I’ll just turn it into a startup validator app and share it on other forums for people to use.
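For context, the transcript route would mostly reduce to merging caption segments into more prompt text. A sketch, assuming segments shaped like those returned by the youtube-transcript-api package (dicts with a "text" key); the fetch itself is left as a comment since it needs network access:

```python
# from youtube_transcript_api import YouTubeTranscriptApi
# segments = YouTubeTranscriptApi.get_transcript(video_id)  # network call

def transcript_to_text(segments):
    """Collapse caption segments into one plain-text string for the judge prompt."""
    return " ".join(s["text"].strip() for s in segments if s["text"].strip())
```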
So I’m looking at your submission, and I think it’s confused about what your UI is. To be honest, I also don’t know what is your actual app versus what is the hardware’s software. That may be where you lose on execution. I think Omni would have a better opportunity here, since they actually have a Mac app that can be seen - though it might get confused with another chat app, I’m not sure. It has a mind of its own, in a sense, and judges are often hated for not seeing what we participants see.
One more thing: your description weighs heavily initially, and in the fine-tune pass it takes the visuals into account alongside the description of how the app works. It’s very important to have a detailed description.
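To illustrate the two-pass idea (description-heavy first pass, visuals folded in during the fine-tune pass), here is a rough sketch; the prompt wording and function names are made up for illustration, not the app’s actual prompts:

```python
def initial_prompt(rubric, description):
    """First pass: score mainly from the written description."""
    return (
        f"Score this hackathon entry against the rubric:\n{rubric}\n\n"
        f"Project description:\n{description}\n\n"
        "Return a score per criterion."
    )

def finetune_prompt(rubric, description, draft_scores, image_count):
    """Second pass: refine creativity/execution using the attached screenshots."""
    return (
        f"Rubric:\n{rubric}\n\nDescription:\n{description}\n\n"
        f"Draft scores: {draft_scores}\n"
        f"{image_count} screenshot(s) are attached; adjust creativity and "
        "execution based on the visuals."
    )
```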
That’s super valuable feedback. If that’s the case, the actual judges may have the same problem as well. I thought it was somewhat clear that my submission is software, and that it’s hardware agnostic: it can run on a watch (currently a FAW - Full Android Watch - which can be bought on AliExpress for $50), a phone, or another mobile device (some packages I use may not cover desktop).
Could the grading have failed to realize that I montaged multiple screenshots onto one image?
I’m reading your answer again. The UI is very simplified because it’s optimized for form factors like a watch. I also cut the video short to save time. I think I can also lose points for not being original enough: I’m chasing AI pins, and during development - before the hackathon submission - Made by Google '24 happened, where Gemini Live was introduced. Gemini Live is somewhat of a competitor, although it doesn’t have an on-device vector DB and RAG. Their UI is obviously smoother and better designed. If I had some friends along, we’d have had more time for a better UI.
The reason my video is like that is that I wanted to demonstrate the multi-modal functionality of showing something to the watch’s camera and asking about it. That turned out to be quite challenging, given that I had to handle 1. the watch, 2. my phone for recording, and 3. the prop. Maybe this is confusing? I realize now I could have done a screen recording, which would be sharper, although then you might not have seen how I handle the props.
I wanted to venture out to a grocery store to ask about items on the shelves, or to navigate, but I didn’t have time for that. Originally I was thinking about custom watch hardware along the lines of the AI pins that inspired me - AI pins effectively run APKs. Then, in the process (when two GDEs didn’t have time to commit to joining), I realized that my own hardware would be too big a task, so I decided to focus on the APK only.
I still keep developing it, because I’ll use the app for my own amusement. I’ll look at Omni; I’ve heard about it in other threads. I saw in a YT list that there are an Omni 1 and an Omni 2?
I checked out your app, and it definitely has a solid UI. I’m not sure why the AI couldn’t recognize it, but overall, great work! I think it’s a well-made app. How long did it take you to develop it?
By the way, it’s Omni 1. Feel free to share any feedback with me here if you’d like:
Great job on the website! I haven’t tried it myself yet, but it looks really interesting. I can tell a lot of effort went into creating it; well done! You could consider using different LLMs and averaging their assessments for improved accuracy and less bias, though that might be a bit over the top.
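For what it’s worth, the averaging itself is simple once each judge model returns per-criterion scores. A sketch - the score-dictionary shape and model names are assumed, not from the actual app:

```python
from statistics import mean

def ensemble_scores(per_model_scores):
    """Average per-criterion scores across several LLM judges.

    per_model_scores: {"model_name": {"criterion": score, ...}, ...}
    Returns the mean score per criterion, smoothing any single model's bias.
    """
    criteria = set().union(*(s.keys() for s in per_model_scores.values()))
    return {
        c: mean(s[c] for s in per_model_scores.values() if c in s)
        for c in sorted(criteria)
    }
```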
Interesting. Wouldn’t it be something if Gemini had a bias towards itself.
I did this mock with Claude 3.5 Sonnet initially, which gave me extremely high scores, like 62/65. So maybe Gemini is less impressed because it knows its own capabilities.
If not many more people enter by this week, I’ll do the video transcript feature manually for each submitted project, as it’s way faster…
Oh, the UI is very rudimentary. I looked back at the repo (GitHub - CsabaConsulting/InspectorGadgetApp: Open Multi-Modal Personal Assistant), and it seems I started on 6/13 by feeling out my first VGV CLI-scaffolded project, and for a while I was rolling with Cubits. I certainly spent many weeks deciding on the right direction, which proved to be very valuable: I ended up deciding that my demo device would be a FAW (Full Android Watch) not long before the repo’s foundation. Otherwise I’d have drowned in developing my own hardware platform - which I might still do later, I have plans - but it was good to focus on the software only.
Seriously, it is a wow moment, man. Kudos to you! Please try to participate in some other competitions and hackathons so that you reach great heights.