I have confirmed an issue: when a JSON response schema is enabled, the underlying model (Pro or Flash) produces repeating tokens until the max output token limit is reached.
I’m using the Python SDK (0.8.3) and a simple schema:
```python
from typing import TypedDict


class SingleMatchVerificationResult(TypedDict):
    conclusion: bool
    confidence: int
    reason: str
    confidence_reason: str
```
which is eventually passed directly to:
```python
response = client.generate_content(
    prompt,
    generation_config=genai.types.GenerationConfig(
        candidate_count=1,
        max_output_tokens=512,
        temperature=0,
        response_mime_type="application/json" if json_mode else "text/plain",
        response_schema=json_schema if json_mode else None,
    ),
)
```
Here is a sample response that shows it repeating itself.
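The runaway output always ends in a short chunk repeated back to back, so it is easy to flag client-side before attempting to parse. A rough sketch of the detection I use (the helper names are mine, not part of the SDK):

```python
import json


def has_runaway_repetition(text: str, min_repeats: int = 5) -> bool:
    """Heuristic: True if `text` ends in a short chunk repeated
    at least `min_repeats` times back to back."""
    for size in range(1, 64):
        chunk = text[-size:]
        if not chunk.strip():
            continue  # skip pure-whitespace chunks
        if text.endswith(chunk * min_repeats):
            return True
    return False


def parse_or_flag(raw: str) -> dict:
    """Parse model output as JSON, flagging the repetition bug first."""
    if has_runaway_repetition(raw):
        raise ValueError("output looks like the token-repetition bug")
    return json.loads(raw)
```

This lets the caller retry (or fall back to plain-text mode) instead of feeding a truncated, repeating string into `json.loads`.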
Yeah, it’s a real issue.
Without schema validation, models like Flash and Flash-8B regularly return broken JSON even with JSON mode enabled.
With schema validation, generation breaks completely for all models.
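Until this is fixed, the workaround I've settled on is plain JSON mode plus validating the parsed output against the TypedDict on the client. A minimal sketch (the `validate` helper is mine, not SDK behaviour):

```python
import json
from typing import TypedDict, get_type_hints


class SingleMatchVerificationResult(TypedDict):
    conclusion: bool
    confidence: int
    reason: str
    confidence_reason: str


def validate(raw: str, schema: type) -> dict:
    """Parse `raw` as JSON and check each field against the
    TypedDict's annotations; raises ValueError on mismatch."""
    data = json.loads(raw)
    for key, expected in get_type_hints(schema).items():
        if key not in data:
            raise ValueError(f"missing key: {key}")
        # bool is a subclass of int in Python, so reject it explicitly
        if expected is int and isinstance(data[key], bool):
            raise ValueError(f"{key}: expected int, got bool")
        if not isinstance(data[key], expected):
            raise ValueError(f"{key}: expected {expected.__name__}")
    return data
```

It only covers flat schemas of primitive types like the one above, but that's enough to reject the broken responses instead of silently accepting them.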