Hi,
Today, I tested multi-modal inputs on the model and have some observations and feedback to report.
The experiment involved testing the vision and logic capabilities of the Gemini 1.5 Pro model in AI Studio.
Here’s how it looks:
The model works, but here are the things that immediately stood out:
- The unsafe content warning appeared on all outputs with the default safety settings (see the API sketch after this list).
- Two out of three outputs were cut off mid-sentence as the model began to describe the animals. I couldn't find any settings for max output tokens, but the "Get code" button reveals it's set to 8192.
- In the last output, I initially thought some incorrectly encoded characters were present, but it turned out that the emojis aren't being rendered.
- Towards the end, I also noticed that the ability to add model messages was absent. This means we can't try few-shot prompting (see the chat-history sketch after this list).
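
For what it's worth, the safety and token behaviour can be probed outside the UI. Here's a minimal sketch, assuming the google-generativeai Python SDK and a GEMINI_API_KEY environment variable, that relaxes two safety thresholds and sets max_output_tokens to the 8192 value the "Get code" button reports. The image filename and prompt are placeholders, and the lowered thresholds are only for reproducing the warning behaviour, not a recommendation:

```python
import os

import google.generativeai as genai
from google.generativeai.types import HarmBlockThreshold, HarmCategory

genai.configure(api_key=os.environ["GEMINI_API_KEY"])

model = genai.GenerativeModel(
    "gemini-1.5-pro",
    # Default thresholds triggered the unsafe content warning on every output;
    # lowering them here is just to compare behaviour, not a recommendation.
    safety_settings={
        HarmCategory.HARM_CATEGORY_HARASSMENT: HarmBlockThreshold.BLOCK_ONLY_HIGH,
        HarmCategory.HARM_CATEGORY_HATE_SPEECH: HarmBlockThreshold.BLOCK_ONLY_HIGH,
    },
    generation_config={
        "max_output_tokens": 8192,  # the value the "Get code" button reports
        "temperature": 1.0,
    },
)

# "animals.jpg" is a placeholder for the image used in the experiment.
image = genai.upload_file("animals.jpg")
response = model.generate_content([image, "Describe the animals in this image."])
print(response.text)
```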
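
And although AI Studio doesn't let us add model messages, the API does accept "model"-role turns in a chat history, so few-shot prompting is still testable there. A sketch under the same assumptions as above; the example descriptions and labels are invented:

```python
import os

import google.generativeai as genai

genai.configure(api_key=os.environ["GEMINI_API_KEY"])
model = genai.GenerativeModel("gemini-1.5-pro")

# Hand-written user/model pairs stand in for the few-shot examples
# that the UI currently has no way to express.
few_shot_history = [
    {"role": "user", "parts": ["Image description: a cat sleeping on a sofa."]},
    {"role": "model", "parts": ["Animal: cat. Activity: sleeping. Setting: indoors."]},
    {"role": "user", "parts": ["Image description: two dogs running on a beach."]},
    {"role": "model", "parts": ["Animals: dogs (2). Activity: running. Setting: beach."]},
]

chat = model.start_chat(history=few_shot_history)
response = chat.send_message("Image description: a parrot perched on a branch.")
print(response.text)
```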