Hi,
I am facing a critical issue where gemini-2.5-flash intermittently fails to identify or call tools and simply returns a "STOP" finish reason with zero output tokens.
The most frustrating part is the inconsistency: with the exact same query and system prompt, it works perfectly fine sometimes, but fails completely at other times.
```python
messages = [
    SystemMessage(content=self._get_system_prompt()),
    HumanMessage(content=query),
]
# self.llm_with_tools was created earlier via llm.bind_tools(tools)
llm_result = self.llm_with_tools.invoke(messages)
```
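As a temporary workaround I am considering wrapping the call in a retry helper that re-invokes the model whenever it comes back with no content and no tool calls. This is only a sketch of what I have in mind (the `invoke_with_retry` name and its parameters are my own, not part of LangChain):

```python
import time


def invoke_with_retry(invoke, is_empty, max_retries=3, backoff=1.0):
    """Re-invoke the LLM when it returns an empty response.

    invoke:   zero-argument callable, e.g. lambda: llm_with_tools.invoke(messages)
    is_empty: predicate that flags an empty / tool-less result
    """
    result = invoke()
    for attempt in range(max_retries):
        if not is_empty(result):
            return result
        # exponential backoff between retries to avoid hammering the API
        time.sleep(backoff * (2 ** attempt))
        result = invoke()
    return result  # may still be empty after max_retries
```

For my case the emptiness check would be something like `is_empty = lambda r: not r.content and not r.tool_calls`, but a retry obviously only papers over the problem rather than explaining it.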
Here is the full response I get back on a failing run:

```json
{
  "content": "",
  "additional_kwargs": {},
  "response_metadata": {
    "prompt_feedback": {"block_reason": 0, "safety_ratings": []},
    "finish_reason": "STOP",
    "model_name": "gemini-2.5-flash",
    "safety_ratings": []
  },
  "type": "ai",
  "name": null,
  "id": "run--[REDACTED]",
  "example": false,
  "tool_calls": [],
  "invalid_tool_calls": [],
  "usage_metadata": {
    "input_tokens": 1440,
    "output_tokens": 0,
    "total_tokens": 1440
  }
}
```
To be honest, I am a beginner and have been struggling with this issue for several days. It is exhausting because I cannot find any clear reason why it works one moment and fails the next, and I am stuck on understanding the cause. Any guidance or help would mean a lot. Thank you.