JSON Schema causes issues with Gemini Pro/Flash

I have confirmed an issue: when a JSON response schema is enabled, the underlying Pro/Flash model produces repeating tokens until the max output token limit is reached.

I’m using the Python SDK (0.8.3) and a simple schema:

from typing import TypedDict

class SingleMatchVerificationResult(TypedDict):
    conclusion: bool
    confidence: int
    reason: str
    confidence_reason: str

which is eventually passed directly to:

response = client.generate_content(
    prompt,
    generation_config=genai.types.GenerationConfig(
        candidate_count=1,
        max_output_tokens=512,
        temperature=0,
        response_mime_type="application/json" if json_mode else "text/plain",
        response_schema=json_schema if json_mode else None,
    ),
)

Here is a sample response that shows it repeating itself.

I can share plenty of prompts and provide code that will help reproduce this.

Let me know what is needed.


Thanks for flagging this! Would you be able to share a sample prompt so I can repro it on my end?


Hi Vishal, sure thing, I’ve created a gist for this here:

Thank you! I’ll take a look and file it with Eng.

Surprisingly, I get a proper response when I set json_mode to False, so this is definitely a JSON mode issue. Filed this with Eng!


Thanks, Vishal, I just tested it again today and noticed that the issue is still there.
Is there any update from Eng?

It’s still happening to me, too. It’s so persistent I had to switch models on my side.

Yeah, it’s a real issue.
Without schema validation, models like Flash and Flash-8B regularly respond with broken JSON, even with JSON mode enabled.
With schema validation, it breaks completely for all models.

We can’t run production workloads like this.
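As a stopgap for the broken-JSON symptom described above, a generic defensive parser can help keep a pipeline alive while the upstream bug is open. This is not from the thread, just a common workaround sketch: try a strict parse first, then peel off markdown fences or grab the first brace-delimited span before giving up.

```python
import json
import re


def parse_model_json(text: str):
    """Best-effort parse of model output that should be JSON.

    Returns (data, error) so callers can retry the request or
    fall back instead of crashing on a malformed response.
    """
    # 1. Strict parse: the happy path when JSON mode behaves.
    try:
        return json.loads(text), None
    except json.JSONDecodeError:
        pass

    # 2. Models sometimes wrap JSON in markdown code fences.
    fenced = re.search(r"```(?:json)?\s*(.*?)```", text, re.DOTALL)
    if fenced:
        try:
            return json.loads(fenced.group(1)), None
        except json.JSONDecodeError:
            pass

    # 3. Last resort: take the outermost {...} span in the text.
    brace = re.search(r"\{.*\}", text, re.DOTALL)
    if brace:
        try:
            return json.loads(brace.group(0)), None
        except json.JSONDecodeError as exc:
            return None, str(exc)

    return None, "no JSON object found"
```

This won't rescue a response that looped until the token limit, but it does distinguish "retryable garbage" from a clean result.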


Update: I’ve just switched to the new google-genai Python SDK, and everything worked immediately.

That’s sufficient for me, and I consider this issue closed.
The fix is to use the new SDK.
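For anyone landing here later, here is a minimal sketch of what the migrated call might look like with the google-genai SDK. The model name, client setup, and the choice to pass the TypedDict as the response schema are my assumptions, not details confirmed in this thread:

```python
import json
from typing import TypedDict


class SingleMatchVerificationResult(TypedDict):
    conclusion: bool
    confidence: int
    reason: str
    confidence_reason: str


def parse_result(raw: str) -> SingleMatchVerificationResult:
    """Parse the model's JSON text into the TypedDict shape."""
    data = json.loads(raw)
    return SingleMatchVerificationResult(
        conclusion=bool(data["conclusion"]),
        confidence=int(data["confidence"]),
        reason=str(data["reason"]),
        confidence_reason=str(data["confidence_reason"]),
    )


def verify(client, prompt: str) -> SingleMatchVerificationResult:
    """Run the verification prompt through the new SDK.

    `client` is assumed to be a google-genai client, e.g.:
        from google import genai
        client = genai.Client(api_key="...")
    """
    # Deferred import so parse_result stays usable without the SDK installed.
    from google.genai import types

    response = client.models.generate_content(
        model="gemini-1.5-flash",  # assumed model name, not from the thread
        contents=prompt,
        config=types.GenerateContentConfig(
            temperature=0,
            max_output_tokens=512,
            response_mime_type="application/json",
            response_schema=SingleMatchVerificationResult,
        ),
    )
    return parse_result(response.text)
```

The main structural change from the old SDK is that generation settings move into `types.GenerateContentConfig` and the call goes through `client.models.generate_content` with an explicit `model` argument.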