I have confirmed an issue where enabling a JSON schema causes the underlying model (Pro or Flash) to emit repeating tokens until the max output token limit is reached.
I’m using the Python SDK (0.8.3) and a simple schema:
class SingleMatchVerificationResult(TypedDict):
    conclusion: bool
    confidence: int
    reason: str
    confidence_reason: str
This schema is eventually passed directly to:
response = client.generate_content(
    prompt,
    generation_config=genai.types.GenerationConfig(
        candidate_count=1,
        max_output_tokens=512,
        temperature=0,
        response_mime_type="application/json" if json_mode else "text/plain",
        response_schema=json_schema if json_mode else None
    )
)
Here is a sample response that shows it repeating itself.
I can share plenty of prompts and provide code that will help reproduce this.
Let me know what is needed.
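
For anyone triaging this across many prompts, here is a minimal sketch of one way to flag affected responses automatically. It assumes the response text is available as a plain string (for example from response.text). The helper names is_valid_json, has_repeating_tail and looks_degenerate are hypothetical and not part of the SDK, and the thresholds are arbitrary starting points rather than tuned values.

```python
import json


def is_valid_json(text: str) -> bool:
    """Return True if the text parses as JSON."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


def has_repeating_tail(text: str, max_unit: int = 40, min_repeats: int = 5) -> bool:
    """Return True if the text ends with a unit of 1..max_unit characters
    repeated at least min_repeats times in a row."""
    for unit_len in range(1, max_unit + 1):
        needed = unit_len * min_repeats
        if len(text) < needed:
            break  # longer units cannot fit min_repeats times either
        unit = text[-unit_len:]
        if text[-needed:] == unit * min_repeats:
            return True
    return False


def looks_degenerate(text: str) -> bool:
    """Heuristic (an assumption, not an SDK feature): the output is not valid
    JSON and its tail is a run of the same repeated unit."""
    return not is_valid_json(text) and has_repeating_tail(text)
```

Running looks_degenerate over the text of each response separates truncated, repeating outputs from well-formed ones. It is only a heuristic: valid JSON is never flagged, even when a string value inside it repeats.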
