I got successful responses a couple of times before with the same prompt. However, it’s now returning empty responses, and these are the finish_reason and safety_ratings from the usage_metadata:
Finish Reason: STOP. Safety Ratings: N/A
Therefore, it doesn’t seem to be a problem with max tokens or with safety filtering.
I already implemented retry with exponential backoff, rotating among different prefixes in my prompt according to a solution suggested here, but nothing is working.
Would appreciate it if anyone has any guidance on this! If you need more snippets of my code, I can share them.
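For context, the retry wrapper currently looks roughly like this (a simplified sketch; call_gemini and the prefix list are stand-ins for my real code):

import random
import time

# Hypothetical prefixes, rotated on each retry per the suggestion linked above.
PREFIXES = ["", "Please answer the following. ", "Task: "]

def generate_with_retry(call_gemini, prompt, max_retries=5, base_delay=1.0):
    """Retry with exponential backoff, treating an empty body as a failure."""
    for attempt in range(max_retries):
        prefix = PREFIXES[attempt % len(PREFIXES)]
        text = call_gemini(prefix + prompt)
        if text and text.strip():
            return text
        # Jittered exponential backoff before the next attempt.
        time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.5))
    raise RuntimeError("Empty response from Gemini after all retries")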
I don’t think grounded search is the culprit. The issue has been increasing over the last few weeks, but today and yesterday it went insane. I run the same prompt every day, with variations, for testing. Other forums also report that today and yesterday were really bad, and none of those reports mention grounded search, which is off by default.
I have also started running into this as of three days ago, even when running low-token prompts directly with curl. A response from Gemini is hit-or-miss: it either responds as expected, returns a 500 error, or returns STOP without any content. Could it be overloaded?
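For reference, the low-token call is essentially just this (shown as a Python sketch equivalent to the curl command; the prompt and the GEMINI_API_KEY environment variable are placeholders):

import os
import requests

# Minimal reproduction of the raw REST call, assuming GEMINI_API_KEY is set.
url = ("https://generativelanguage.googleapis.com/v1beta/"
       "models/gemini-2.5-pro:generateContent")
payload = {"contents": [{"parts": [{"text": "Reply with the single word OK."}]}]}
resp = requests.post(
    url,
    headers={"x-goog-api-key": os.environ["GEMINI_API_KEY"]},
    json=payload,
    timeout=60,
)
print(resp.status_code)
print(resp.json())  # on the bad calls: finishReason STOP but no text parts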
Been using the same 2.5 Pro setup for weeks with no issues, but now I keep getting empty responses with finish_reason: STOP. My retry loop fails more often than not, even after adding exponential backoff. I tried adding unique identifiers (timestamps, request IDs, session IDs) and the unique-prefix fix mentioned above; neither worked. Still burning money on failed calls.
from abc import ABC, abstractmethod
from dataclasses import dataclass

import google.generativeai as genai


@dataclass
class LLMResponse:
    """Normalized response shape returned by every client."""
    text: str
    model_name: str
    input_tokens: int
    output_tokens: int
    finish_reason: str


class BaseLLMClient(ABC):
    """Abstract Base Class defining the interface for all LLM clients."""

    def __init__(self, model: str):
        self.model = model

    @abstractmethod
    def generate(
        self,
        system_prompt: str,
        user_prompt: str,
        max_output_tokens: int,
    ) -> LLMResponse:
        """Generates a response from the LLM."""
        pass
class GeminiClient(BaseLLMClient):
    """Gemini implementation of BaseLLMClient."""

    def generate(self, system_prompt, user_prompt, max_output_tokens) -> LLMResponse:
        # Retry logic you already had... (simplified here for clarity)
        try:
            # system_instruction is set on the model, not passed to generate_content()
            model = genai.GenerativeModel(self.model, system_instruction=system_prompt)
            response = model.generate_content(
                contents=[user_prompt],
                generation_config=genai.GenerationConfig(
                    max_output_tokens=max_output_tokens,
                ),
            )
            # Translate the Gemini-specific response into our standard format
            finish_reason = (
                response.candidates[0].finish_reason.name
                if response.candidates else "UNKNOWN"
            )
            # response.text raises if the candidate has no parts, so guard first
            output_text = response.text if response.parts else ""
            if not output_text:  # Empty-response handling
                raise LLMClientError(
                    f"Empty response from Gemini. Finish Reason: {finish_reason}")
            return LLMResponse(
                text=output_text,
                model_name=self.model,
                input_tokens=response.usage_metadata.prompt_token_count,
                output_tokens=response.usage_metadata.candidates_token_count,
                finish_reason=finish_reason,
            )
        except LLMClientError:
            raise
        except Exception as e:
            raise LLMClientError(f"Gemini API error: {e}") from e
def get_llm_client(model_name: str) -> BaseLLMClient:
    """
    Factory function to get the appropriate LLM client based on the model name.
    """
    if model_name.startswith("gemini"):
        return GeminiClient(model=model_name)
    elif model_name.startswith("gpt"):
        return OpenAIClient(model=model_name)
    else:
        raise ValueError(f"Unsupported model provider for model: {model_name}")


class LLMClientError(Exception):
    """Custom exception for all LLM client errors."""
    pass
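For completeness, a typical call through the factory looks like this (the model name and prompts are just what I test with):

# Minimal usage sketch, assuming the classes above; prompts are placeholders.
client = get_llm_client("gemini-2.5-pro")
try:
    result = client.generate(
        system_prompt="You are a helpful assistant.",
        user_prompt="Summarize the report in three bullet points.",
        max_output_tokens=1024,
    )
    print(result.finish_reason, result.output_tokens)
    print(result.text)
except LLMClientError as err:
    # This is where the "Empty response from Gemini. Finish Reason: STOP" cases land.
    print(f"Request failed: {err}")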
Thank you for reporting this. The engineering team is aware of the empty response issue and is actively working on a fix. We will keep you updated as the fix is identified and rolled out.
This issue has been happening for at least 2 weeks and it’s literally unacceptable. If this keeps happening, I’m honestly thinking about switching to other AI providers. I can’t keep all my production apps closed if Google is not going to do anything about it.
I don’t know about your use case, but for mine, Google’s model is the best; no one else provides accuracy at Google’s level. I have been running it for 50 days, and for 35 of those days there wasn’t a single issue from Google. It’s just these last 2 weeks where it went nuts.
I just found this post and a few others. It looks like I am having the same issue with 2.5 Pro. Good to know it’s not just me, bad to know it’s a Gemini issue. I have posted about what looks like the same problem here: Gemini 2.5 Pro - Empty Response - Status 200.
Same issue. The error is so frustrating. The first time it happened was yesterday, and now almost all my bots are returning empty responses almost instantly, after 1-7 prompts in JSON mode (Flash and Flash-mini); Pro is affected the least.
Error: "[gemini-2.5-pro] Model failed:" "Received empty response from the model."
Warning: "[SYSTEM] Model gemini-2.5-pro failed. Marking as dead and trying next."
Error: "[gemini-2.5-flash] Model failed:" "Received empty response from the model."
Warning: "[SYSTEM] Model gemini-2.5-flash failed. Marking as dead and trying next."
Error: "[FATAL] All models failed after exhausting fallbacks."
Error: "Polling error:" "All available models are failing."
Error (x2): "Polling error:" "Bad Request: message is not modified: specified new message content and reply markup are exactly the same as a current content and reply markup of the message"
Error (x4): "Polling error:" "Bad Request: query is too old and response timeout expired or query ID is invalid"