Hi - I am exploring Gemini 1.5 Pro’s capabilities. I have noticed that it sometimes returns incomplete (invalid) JSON output. I have tried instructing it to drop any elements in the response that are deemed unsafe, but as far as I can tell it ignores instructions of this type.
Any clues?
Welcome to the forum.
Are you supplying the JSON schema using the method described here: Generate JSON output with the Gemini API | Google AI for Developers, or are you supplying it in the prompt? The first method is preferred for Gemini 1.5 Pro.
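For reference, a minimal sketch of that preferred method with the Python SDK (google-generativeai); the schema and prompt here are placeholders for illustration, not taken from your setup:

import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # placeholder

# Illustrative schema; replace with the one you actually need.
schema = {
    "type": "object",
    "properties": {
        "capabilities": {
            "type": "array",
            "items": {"type": "string"},
        },
    },
    "required": ["capabilities"],
}

model = genai.GenerativeModel(
    "gemini-1.5-pro",
    generation_config={
        "response_mime_type": "application/json",
        "response_schema": schema,
    },
)

response = model.generate_content("Your prompt here")
print(response.text)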
Hope that helps!
Hi - I am using the preferred method (i.e. not in the prompt).
Maybe if you post an example of the malformed JSON the model gave you, we might get some insight. Scrubbed of any sensitive information, obviously. The model should follow the instruction when it is provided in generation_config.
I am using (for now) AI Studio. Here is what I get for the prompt “RFID chip”:
{"capabilities": ["wirelessly transmit data", "track objects", "identify objects",
I have an idea. The safety filters might be acting up. That would explain the output abruptly stopping. Try it again, but move the safety settings to block few or block none.
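If you want to set that from code rather than the AI Studio UI, a rough sketch with the Python SDK (which categories you relax, and how far, is up to you):

import google.generativeai as genai
from google.generativeai.types import HarmCategory, HarmBlockThreshold

# BLOCK_ONLY_HIGH roughly corresponds to "Block few"; BLOCK_NONE to "Block none".
model = genai.GenerativeModel(
    "gemini-1.5-pro",
    safety_settings={
        HarmCategory.HARM_CATEGORY_HARASSMENT: HarmBlockThreshold.BLOCK_ONLY_HIGH,
        HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT: HarmBlockThreshold.BLOCK_ONLY_HIGH,
    },
)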
The prompt produced decent JSON when I tried it:
{
"capabilities": [
{
"category": "Identification",
"description": "Provides unique identification for objects or individuals.",
"examples": [
"Inventory tracking",
"Access control",
"Passport identification"
]
},
{
"category": "Data storage",
"description": "Stores a limited amount of data related to the tagged item.",
"examples": [
"Product information",
"Medical records",
"Asset maintenance history"
]
},
{
"category": "Tracking and location",
"description": "Enables real-time or periodic tracking of tagged items.",
"examples": [
"Supply chain management",
"Asset tracking",
"Patient monitoring"
]
},
{
"category": "Authentication and security",
"description": "Verifies the authenticity of tagged items and prevents counterfeiting.",
"examples": [
"Product authentication",
"Document verification",
"Secure access control"
]
},
{
"category": "Wireless communication",
"description": "Communicates wirelessly with RFID readers to exchange data.",
"examples": [
"Inventory updates",
"Data logging",
"Payment processing"
]
},
{
"category": "Sensor integration",
"description": "Can be integrated with sensors to collect and transmit environmental data.",
"examples": [
"Temperature monitoring",
"Humidity tracking",
"Motion detection"
]
}
]
}
Yes, I have played a bit with the safety settings, but I am not willing to compromise on those. I consider the model returning invalid JSON to be a bug: if it is not capable of filtering the unsafe content, it should at least return valid JSON, IMO.
If we agree that the root cause of the incomplete JSON output is that the model sometimes triggers its safety settings (which can be verified by observing the little triangle near the top of the model response), I would suggest (a) changing the topic heading to read “…when safety is triggered” instead of “…sometimes”, and (b) adding the bug tag.
This is a situation where the model is caught between two competing and contradictory requirements:
- Traverse the provided schema and fill in leaf nodes with content per user prompt; repeat 5-7 times
- Do not show content that violates one of the safety settings
It obviously prioritizes the second requirement, which results in content generation ending abruptly.
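If you are calling the API directly rather than using AI Studio, you can check for this by inspecting the response metadata; a small sketch, with field names from the google-generativeai Python SDK and a model object assumed to be configured as in the earlier sketch:

response = model.generate_content("RFID chip")

candidate = response.candidates[0]
# A finish_reason of SAFETY indicates generation was cut short by the safety
# filters, which is when the JSON comes back truncated.
print("finish_reason:", candidate.finish_reason)
print("safety_ratings:", candidate.safety_ratings)
# prompt_feedback covers the case where the prompt itself was blocked outright.
print("prompt_feedback:", response.prompt_feedback)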
When describing buggy behavior, it is a good idea for users to also state what they expect the model to do instead. For example, continue generating and substitute content that does not trigger the safety settings, until the generated output is syntactically correct. Or, trim the generated content to the last syntactically correct output that excludes the content that triggered safety. These are just ideas; you should specify what you want the model’s behavior to be.
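As a rough illustration of the second idea (trimming to the last syntactically correct output), here is a client-side sketch; it is a workaround on the caller’s side, not the behavior the model itself should have:

import json

def salvage_json(text):
    """Best-effort repair of a truncated JSON string: find the longest prefix
    that parses once any open brackets are closed. A workaround sketch only."""
    closers = {"{": "}", "[": "]"}
    for end in range(len(text), 0, -1):
        prefix = text[:end].rstrip().rstrip(",")
        stack, in_string, escaped = [], False, False
        for ch in prefix:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = not in_string
            elif not in_string and ch in closers:
                stack.append(closers[ch])
            elif not in_string and ch in ("}", "]"):
                if stack:
                    stack.pop()
        if in_string:
            continue  # prefix ends inside a string literal; keep trimming
        try:
            return json.loads(prefix + "".join(reversed(stack)))
        except json.JSONDecodeError:
            continue
    return None

# With the truncated output posted earlier in the thread:
truncated = '{"capabilities": ["wirelessly transmit data", "track objects", "identify objects",'
print(salvage_json(truncated))
# {'capabilities': ['wirelessly transmit data', 'track objects', 'identify objects']}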
Hope that helps!
After testing many other ways of instructing the model to remove unsafe content but continue generating safe content, I have found none that work. Is there a way to file a bug report on this?
In AI Studio, click the three dots in the upper right corner, then Send feedback.
Attempting to convince the model to not generate content that may trigger safety settings is futile. We all tried and failed.
Many thanks for this insight! I still believe we should at least be able to expect well-formed responses.
I’m using the API, not AI Studio, and have tried so many things. The problem I get is JSON with unescaped characters, so the JSON structure breaks. I’ve tried instructing Gemini not to use quotes and to escape characters, but it isn’t consistent and just breaks often. Basically, my responses break about 10% of the time due to this.
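For what it is worth, a common client-side guard for that failure mode is to validate and re-request; a rough sketch, assuming a model already configured with response_mime_type="application/json":

import json

def generate_json_with_retry(model, prompt, attempts=3):
    """Re-request when the response text does not parse as JSON."""
    last_error = None
    for _ in range(attempts):
        response = model.generate_content(prompt)
        try:
            return json.loads(response.text)
        except (json.JSONDecodeError, ValueError) as exc:
            last_error = exc
    raise ValueError(f"No valid JSON after {attempts} attempts") from last_error

It does not fix the root cause, but it caps the impact of the roughly 10% failure rate.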
Although probably not related, the Python code I get from the “Get Code” function in AI Studio contains malformed statements involving misuse of quotes.
E.g. required = "["capabilities", "input", "entity"]",
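Presumably the intended statement is a plain list without the extra surrounding quotes, something along the lines of:

required = ["capabilities", "input", "entity"],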
I’ve also been struggling with this.
I found that passing an example response helps to some extent. I also provided the shape (schema) of the JSON.
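A rough sketch of that approach (the schema shape and the example response here are made up for illustration):

import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # placeholder

prompt = """Return information about the entity as JSON matching this shape:
{"capabilities": [{"category": "...", "description": "...", "examples": ["..."]}]}

Example response for "barcode":
{"capabilities": [{"category": "Identification",
                   "description": "Encodes a product identifier.",
                   "examples": ["Retail checkout", "Inventory counts"]}]}

Entity: RFID chip
"""

model = genai.GenerativeModel("gemini-1.5-pro")
response = model.generate_content(prompt)
print(response.text)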