Exploring Multi-Modal AI: Insights from Recent Tests on AI Studio

Hi,

Today, I tested multi-modal inputs on the model and have some observations and feedback to report.

The experiment involved testing AI Studio and the Gemini 1.5 Pro model's vision and logic capabilities.

Here’s how it looks:

The model works, but here are the things that immediately stood out:

  • The unsafe content warning appeared on every output with the default safety settings.
    [Screenshot: 2024-04-26, 11.44.51 AM]

  • Two out of three outputs were cut off mid-sentence as the model began to describe the animals. I couldn't find any visible setting for max output tokens, but the Get Code button reveals it's set to 8192.

  • In the last output, I initially thought some characters were incorrectly encoded, but it turned out the emojis simply aren't being rendered.

  • Towards the end, I also noticed that there's no way to add model messages, which means we can't try few-shot prompting.
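For reference, here's a minimal sketch of what the Get Code output corresponds to, written as a plain request body mirroring the Gemini `generateContent` REST API. The prompt text and the specific categories/thresholds shown are illustrative assumptions; only the 8192 token cap is taken from the Get Code button above.

```python
import json

# Hypothetical request body in the shape of the Gemini generateContent API.
# maxOutputTokens matches the 8192 value revealed by the Get Code button;
# relaxing the safety thresholds is one way to avoid the default warning.
request_body = {
    "contents": [
        {"role": "user", "parts": [{"text": "Describe the animals in this image."}]}
    ],
    "generationConfig": {
        "maxOutputTokens": 8192,  # cap on the response length (not user-adjustable in the UI)
    },
    "safetySettings": [
        # Illustrative: raise the block threshold for two categories.
        {"category": "HARM_CATEGORY_DANGEROUS_CONTENT", "threshold": "BLOCK_ONLY_HIGH"},
        {"category": "HARM_CATEGORY_HARASSMENT", "threshold": "BLOCK_ONLY_HIGH"},
    ],
}

print(json.dumps(request_body, indent=2))
```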
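To make the last point concrete, this is roughly what adding model messages would enable: a few-shot prompt is just an alternating list of `user` and `model` turns in the `contents` array, with the model's turns supplying the example answers. The animal/sound pairs here are made-up placeholders.

```python
# Sketch of a few-shot conversation in the Gemini API's "contents" format.
# The "model"-role turns are the piece AI Studio currently doesn't let us add.
few_shot_contents = [
    {"role": "user", "parts": [{"text": "Animal: cat"}]},
    {"role": "model", "parts": [{"text": "Sound: meow"}]},
    {"role": "user", "parts": [{"text": "Animal: dog"}]},
    {"role": "model", "parts": [{"text": "Sound: woof"}]},
    {"role": "user", "parts": [{"text": "Animal: cow"}]},  # the turn the model should complete
]

roles = [turn["role"] for turn in few_shot_contents]
print(roles)  # ['user', 'model', 'user', 'model', 'user']
```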


Flagged to the team. Thank you!
