Exploring Multi-Modal AI: The Hot Dog Test

I really liked @sps's post testing some of Gemini 1.5 Pro's logic and reasoning capabilities. So much so that I'd like to extend the practice and see what more we can assess!

I wanted a test that poked things just a tad further: something that shows how the model reasons with images and that brings more attention to the guardrails.

Inspired by this:

I realized you can make a pretty solid qualitative assessment of these multi-modal capabilities and their guardrails with the simplest game:

Is this a hot dog?

Below are my screenshots of this test. All I did was give the model images one at a time; it had to tell me whether each one is a hot dog, and if not, what it thinks the image is.
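For anyone who wants to reproduce the game outside of AI Studio, here's a rough sketch of what one turn looks like through the Python SDK. The model name, file path, and exact prompt wording are my own placeholders, not necessarily what I typed into the UI:

```python
# Rough sketch of one turn of the game via the Gemini API (Python SDK).
# Model name, file path, and prompt wording are placeholders.
import google.generativeai as genai
from PIL import Image

genai.configure(api_key="YOUR_API_KEY")
model = genai.GenerativeModel("gemini-1.5-pro-latest")

image = Image.open("mystery_food.jpg")  # one image per turn
prompt = (
    "Is this a hot dog? If it is not a hot dog, "
    "tell me what you think the image is."
)

response = model.generate_content([prompt, image])
print(response.text)
```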





Results

The model is quite good! In an interesting twist, this native multi-modality appears to make it easier to see how the model makes a classification choice, or what it associates with a given percept. I wasn't sure what I was expecting, but I wasn't expecting it to analyze things this well, or to show this kind of deductive reasoning.

Apparently, it views a bun as a marker that distinguishes a hot dog from other items, like a sausage. So, when it was presented with a hot dog without a bun, it went for a similar food item that is more commonly served without one.

I was also impressed that it could not only read an item's packaging, but also reason that "franks", a word that only appears in the image, is a synonym for hot dogs, and that the item is therefore likely a package of hot dogs.

The only quirk was that for some reason, the model felt the need to repeat the phrase “Is this a hot dog?” at the end of each response lol. I think it was anticipating the question maybe? I’m not sure what happened.

The elephant in the room, however, is this:

[Screenshot 2024-04-28 162616]

This flag occurred on the "I can practically smell the deliciousness!" response. Nearly every response hit a "medium" threshold, and a couple were rated "low". The probability seemed rather all over the place, tbh. As it stands now, it appears the guardrails do not like me talking about hot dogs. Hot dogs are not sexually explicit by default. You can give hot dogs to a child. I kinda figured this would happen, though, which is why I chose this food item for this test.

I don't know how the content flags are designed, but they've been rather…wonky since Bard, and I think it's a persistent issue that needs to be addressed. It was a major reason why I did not touch Bard often. The guardrails were so stringent that it was genuinely hard for me to identify what tripped them, and they often created too much friction against casual conversation about seemingly arbitrary things.
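For reference, those "low"/"medium" labels come from the per-category safety ratings attached to each response. If you're hitting the API directly rather than reading them in AI Studio, a minimal sketch of pulling them out (assuming the same Python SDK `response` object as in my earlier snippet) looks like this:

```python
# Minimal sketch: inspect the per-category safety ratings on a response.
# Assumes `response` came from model.generate_content(...) as above.
for rating in response.candidates[0].safety_ratings:
    # Each rating pairs a harm category with a probability bucket
    # (NEGLIGIBLE / LOW / MEDIUM / HIGH).
    print(rating.category, rating.probability)
```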

Either the safety stuff is getting overengineered to death, or it is looking at the wrong signals.

Overall:

Model Quality - Good (very Good)
Unsafe Content Identifier Quality - Bad

5 Likes

This is great feedback, thank you. We'll take a look at what's going on here.

5 Likes

Yaaay, thank you! :smiling_face_with_three_hearts: I was getting nervous that I've been a little spammy on here.

Oh, and regarding the emoji rendering: it appears the rendering issue occurs in the most recent response in the chat box. After another message is sent, the emojis from the previous message render properly (as the screenshots demonstrate). This signals to me some sort of rendering update/refresh issue…somewhere lol.

This is an interesting insight into the content filter's sensitivity, @Macha. Additionally, I noticed that your outputs weren't cut off when the unsafe content warning was triggered.

2 Likes

Maybe they just ban your account after enough unsafe content requests.

2 Likes

Oh, actually it did!

Thankfully, however, there's this little menu on the right side of the screen:

When you click the safety settings, you get this:

[Screenshot 2024-04-29 142748]

And then you can just turn those off, re-run your prompt, and voila, you get the content flags without the responses actually being blocked or cut off.

If I hadn't done this, I would not have been able to conduct this test, because the flags would have blocked Gemini's responses before the game even started.
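For anyone doing this through the API instead of the AI Studio menu: as far as I can tell, the equivalent is passing safety settings with every threshold set to block nothing. A sketch, again assuming the Python SDK and the `model`/`prompt`/`image` objects from my earlier snippet:

```python
# Sketch: API-side equivalent of switching the AI Studio safety toggles off.
# You still get the safety ratings back; the responses just aren't blocked.
from google.generativeai.types import HarmBlockThreshold, HarmCategory

safety_settings = {
    HarmCategory.HARM_CATEGORY_HARASSMENT: HarmBlockThreshold.BLOCK_NONE,
    HarmCategory.HARM_CATEGORY_HATE_SPEECH: HarmBlockThreshold.BLOCK_NONE,
    HarmCategory.HARM_CATEGORY_SEXUALLY_EXPLICIT: HarmBlockThreshold.BLOCK_NONE,
    HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT: HarmBlockThreshold.BLOCK_NONE,
}

response = model.generate_content([prompt, image], safety_settings=safety_settings)
```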

Honestly, because I’m able to turn the filters off, this has been my best experience with a Bard/Gemini model to date lol. It feels like I can finally assess the capabilities of these models with the freedom that’s necessary to get good data.

This is only the beginning too!

I kinda want to see what more I can experiment with. If I wanted to run more assessments/experiments, would it be easier to dump them into a topic on this forum or whip up a little blog that just puts stuff like this together in one place?

2 Likes

I think Google’s cool and hopefully won’t ban us from the API for testing if Gemini lets kitty, panda or pupper through a door.

My bad, I assumed that you were using default safety settings.

1 Like

I think it’s good that you mentioned this though!

Again, it helps to demonstrate what content the default filters are blocking, and why it’s a problem.

1 Like

One thing worth noting is that the filters are not supposed to act as model guardrails. Those safety settings are actually built for the developer, so that they don't have to add their own intent extraction / safety detection system to the conversation, which is awesome and is also why the option to block none is even offered :fire:
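To make that concrete, here's a minimal sketch of what leaning on the filters as a developer-side safety layer might look like. The thresholds, the placeholder input, and the fallback message are all example choices, not anything prescribed by the docs:

```python
# Sketch: using the adjustable safety settings as an app-level safety layer,
# so the app doesn't need its own safety detection system.
# Thresholds and the fallback message are example choices.
import google.generativeai as genai
from google.generativeai.types import HarmBlockThreshold, HarmCategory

genai.configure(api_key="YOUR_API_KEY")
model = genai.GenerativeModel("gemini-1.5-pro-latest")

app_safety = {
    HarmCategory.HARM_CATEGORY_SEXUALLY_EXPLICIT: HarmBlockThreshold.BLOCK_MEDIUM_AND_ABOVE,
    HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT: HarmBlockThreshold.BLOCK_LOW_AND_ABOVE,
}

user_message = "some end-user input"  # placeholder
response = model.generate_content(user_message, safety_settings=app_safety)

# If the API flagged and blocked the candidate, fall back to the app's own
# message instead of building a separate detection pipeline.
if not response.candidates or response.candidates[0].finish_reason.name == "SAFETY":
    print("Sorry, I can't help with that.")
else:
    print(response.text)
```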

3 Likes

Oh, interesting! And welcome in, btw.

So bear with me here, because now I’m asking out of ignorance:

What is the difference? We saw in the other exploration topic that it blocks content like a guardrail. Are you saying this is merely a way to flag things so we can build our own rails? Or would we define guardrails as more of a tuning/training layer that fundamentally prevents problematic outputs at the model's core?

I am aware of this safety trifecta, where all three parties share some responsibility for safety: Model provider, Developer, and End User.

This is basically giving us devs more equitable responsibility (which is good!), but it is beginning to confuse me. This is clearly a step up in evolution here, and I'm losing track of these layers, which seem to be multiplying quickly without much explanation of how they're distinct.

1 Like

Yes, that's what I'm saying :smile: You've probably noticed "I'm sorry. As a large language m…" showing up in its outputs before, or seen the block reason listed as Other.

Here's the thing: the docs just mention it instead of actually emphasizing it:

In addition to the adjustable safety filters, the Gemini API has built-in protections against core harms, such as content that endangers child safety. These types of harm are always blocked and cannot be adjusted.

It has internal filters from tuning, plus another safety "Is this okay by the internal rules?" layer that the user/developer doesn't see.

The docs also clarify that it’s for the developer to protect the end user:

These settings allow you, the developer, to determine what is appropriate for your use case. For example, if you’re building a video game dialogue, you may deem it acceptable to allow more content that’s rated as dangerous due to the nature of the game.

This page goes deeper into how the safety settings are not actually model guardrails, and how they're instead a feature for flagging and blocking undesirable responses.

5 Likes

Good to know. Thanks for that catch!

1 Like