Problem
We are running some early tests with gemini-2.5-flash-lite-preview-06-17 and hitting some peculiar behavior when using structured output, even for very simple toy examples.
Reproducible Example
We defined a simple response model as follows:
from pydantic import BaseModel, Field

class GeminiResponse(BaseModel):
    response: str = Field(description="The response from the Gemini model")
The user prompt is similarly basic:
user_prompt = "What is the capital of france?"
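For completeness, this is roughly how we invoke the model; we are using PydanticAI as a wrapper over the Vertex AI Gemini endpoints. A minimal sketch (the output_type parameter was named result_type in older PydanticAI releases):

from pydantic_ai import Agent

agent = Agent(
    "google-vertex:gemini-2.5-flash-lite-preview-06-17",
    output_type=GeminiResponse,  # PydanticAI exposes this to the model as a final_result tool call
)
result = agent.run_sync(user_prompt)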
When we run the above with gemini-2.5-flash-lite-preview-06-17, the request hangs for roughly a minute or more before returning the following error, indicating a structured output generation failure (although PydanticAI sits in the middle here, the error appears to originate on the Gemini side):
UnexpectedModelBehavior: Content field missing from Gemini response, body:
candidates=[Candidate(content=Content(parts=None, role=None), citation_metadata=None, finish_message='Malformed function call: call:final_result{response:<ctrl46>The capital of France is', token_count=None, finish_reason=<FinishReason.MALFORMED_FUNCTION_CALL: 'MALFORMED_FUNCTION_CALL'>, url_context_metadata=None, avg_logprobs=None, grounding_metadata=None, index=None, logprobs_result=None, safety_ratings=None)] create_time=datetime.datetime(2025, 6, 20, 19, 1, 46, 250104, tzinfo=TzInfo(UTC)) response_id='GrBVaPihD-ijgLUPxIOokQs' model_version='gemini-2.5-flash-lite-preview-06-17' prompt_feedback=None usage_metadata=GenerateContentResponseUsageMetadata(cache_tokens_details=None, cached_content_token_count=None, candidates_token_count=None, candidates_tokens_details=None, prompt_token_count=27, prompt_tokens_details=[ModalityTokenCount(modality=<MediaModality.TEXT: 'TEXT'>, token_count=27)], thoughts_token_count=None, tool_use_prompt_token_count=None, tool_use_prompt_tokens_details=None, total_token_count=27, traffic_type=<TrafficType.ON_DEMAND: 'ON_DEMAND'>) automatic_function_calling_history=[] parsed=None
What is peculiar about this error is that it is consistently reproducible on my end, and notably disappears under any of the following conditions:
- Capitalizing the ‘h’ in ‘What’, changing the prompt to “WHat is the capital of france?”
- Switching to “gemini-2.0-flash”
- Removing structured output generation as a requirement of the answer (see the sketch after this list).
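For the last of these, dropping the structured output requirement looks roughly like this in our setup (a sketch; with no output type, PydanticAI defaults to returning plain text):

plain_agent = Agent("google-vertex:gemini-2.5-flash-lite-preview-06-17")
plain_result = plain_agent.run_sync(user_prompt)
print(plain_result.output)  # named .data in older PydanticAI releases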
Under any of these conditions we receive the expected structured output (shown as text):
'{"response":"The capital of France is Paris."}'
Question
In the meantime we are continuing to test and compare 2.5-flash-lite against 2.0-flash, but I was curious whether there is any guidance on why such errors might occur even in simple cases, what rules can be followed in more complex prompt design to avoid them, or whether this is simply a general bug?