Gemini 2.5 Pro with empty response.text

Now the disaster continues…

One finding is that the problem appears, intensifies, or recovers at the rate-limit reset time, 12:00 PM Pacific Time. This holds for users in different regions; the trigger is always 12:00 PM Pacific Time.

I’m providing a crucial update to my previous posts regarding persistent issues with gemini-2.5-pro. My apologies for any confusion in my earlier descriptions; with new, precise logs, the situation is now unequivocally clear and represents a critical regression.

Original Problem (confirmed still active):
When attempting to use gemini-2.5-pro with a system_instruction parameter (even a very short one: the 126-character prompt loaded from “Ассистент.txt”, as shown in the log below), the model consistently returns a GenerateContentResponse object where:

  • finish_reason is "STOP".
  • The content for the candidate is present ("content": { "role": "model" }), but it contains no parts array with actual generated text.
    This is the original “empty response” bug that I initially reported.
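
For reference, the failing call reduces to roughly this shape (a minimal sketch using the google-generativeai SDK; the system prompt string is truncated here, and the printed values match the log below):

```python
import google.generativeai as genai

genai.configure(api_key="...")  # key redacted

# The 126-character prompt loaded from "Ассистент.txt", truncated here.
model = genai.GenerativeModel(
    "gemini-2.5-pro",
    system_instruction="Ты умный и честный ассистент. ...",
)
response = model.generate_content("Привет!")

candidate = response.candidates[0]
print(candidate.finish_reason)        # STOP
print(list(candidate.content.parts))  # [] -- empty, so response.text raises
```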

Workaround Failure (new critical issue):
My application previously employed a workaround for this bug: injecting the system prompt directly into the contents (message history) as a dummy user/model message pair, thus bypassing the system_instruction parameter. This workaround was effective for gemini-2.5-pro until recently.
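
In code, the workaround looked roughly like this (a sketch; the dummy acknowledgement text is illustrative, not my exact implementation):

```python
# No system_instruction: the persona is injected as a dummy
# user/model exchange at the head of the message history instead.
contents = [
    {"role": "user", "parts": ["Ты умный и честный ассистент. ..."]},
    {"role": "model", "parts": ["Понял."]},  # dummy acknowledgement
    {"role": "user", "parts": ["Привет!"]},
]
model = genai.GenerativeModel("gemini-2.5-pro")
response = model.generate_content(contents)
```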

However, this workaround now also results in the exact same “empty response” behavior: finish_reason: "STOP" with no content.parts in the GenerateContentResponse. This indicates that a change on Google’s side has effectively neutralized the workaround.

Summary of gemini-2.5-pro’s current state (for my setup):

  1. Direct use of system_instruction: Leads to an empty response (original bug, confirmed not fixed).
  2. Workaround (injecting system prompt into contents): Now also leads to an empty response (new regression / workaround neutralization).
  3. Result: There is currently no functional method to reliably provide gemini-2.5-pro with a system persona or initial context and receive a meaningful text response (a defensive-read sketch for detecting this failure follows this list). This renders gemini-2.5-pro completely unusable for conversational AI tasks requiring a system prompt.
  4. Model Specificity Confirmed: gemini-2.5-flash continues to work perfectly with identical application logic and prompt structures, confirming the issue is specific to gemini-2.5-pro.
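
Until this is resolved, the only safe way to read such a response is to check candidate.content.parts before touching response.text, which raises when the candidate has no parts. A minimal defensive-read sketch:

```python
def safe_text(response) -> str | None:
    """Return the generated text, or None for an empty STOP response."""
    if not response.candidates:
        return None
    candidate = response.candidates[0]
    if not candidate.content.parts:
        # The failure mode shown in the logs: finish_reason == STOP, no parts.
        return None
    return "".join(part.text for part in candidate.content.parts)
```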

This is a critical blocking regression for applications relying on gemini-2.5-pro with system prompts.

Could the Google AI team please urgently investigate this? We need a clear, reliable, and functional method to provide system instructions to gemini-2.5-pro. Is this behavior intentional, or is it a known service degradation? If intentional, what is the new recommended pattern for robust system prompting?

Here is a log from a run with gemini-2.5-pro using a short system prompt (“Ассистент.txt”, 126 characters) and with the workaround DISABLED, clearly showing the finish_reason: "STOP" with no generated parts:

--- DEBUG: Starting streaming response for Gemini model 'gemini-2.5-pro' ---
DEBUG: GeminiProvider.get_chat_response_stream: Config: {
  "api_key": "XXXXXXXXXXXXXXXXXXXX",
  "model": "gemini-2.5-pro",
  "temperature": 1.0,
  "top_p": 0.95,
  "selected_prompt": "\u0410\u0441\u0441\u0438\u0441\u0442\u0435\u043d\u0442.txt"
}
DEBUG: GeminiProvider.get_chat_response_stream: Raw messages count: 1
DEBUG: GeminiProvider.get_chat_response_stream: System Prompt: 'Ты умный и честный ассистент.
Отвечай чётко, подробно и по теме.
Отвечай только на основе фактов.
Не выдумывай ничего от себя....' (Length: 126)
DEBUG: GeminiProvider._prepare_messages: Preparing 1 raw messages for Gemini API.
DEBUG: GeminiProvider._prepare_messages: Message 0 part 0 is text.
DEBUG: GeminiProvider._prepare_messages: Finished. Prepared 1 messages for Gemini API.
DEBUG: GeminiProvider.get_chat_response_stream: Prepared messages count: 1. First message: {
  "role": "user",
  "parts": [
    "\u041f\u0440\u0438\u0432\u0435\u0442!"
  ]
}
DEBUG: GeminiProvider.get_chat_response_stream: Generation Config: GenerationConfig(candidate_count=None, stop_sequences=None, max_output_tokens=None, temperature=1.0, top_p=0.95, top_k=None, response_mime_type=None, response_schema=None, presence_penalty=None, frequency_penalty=None)
DEBUG: GeminiProvider.get_chat_response_stream: Safety Settings: {'HARM_CATEGORY_HARASSMENT': 'BLOCK_NONE', 'HARM_CATEGORY_HATE_SPEECH': 'BLOCK_NONE', 'HARM_CATEGORY_SEXUALLY_EXPLICIT': 'BLOCK_NONE', 'HARM_CATEGORY_DANGEROUS_CONTENT': 'BLOCK_NONE'}
DEBUG: GeminiProvider.get_chat_response_stream: Calling model.generate_content(stream=True)...
DEBUG: GeminiProvider.get_chat_response_stream: generate_content (streaming) call initiated. Iterating through chunks...
DEBUG: GeminiProvider.get_chat_response_stream: Received raw chunk 0: response:
GenerateContentResponse(
    done=True,
    iterator=None,
    result=protos.GenerateContentResponse({
      "candidates": [
        {
          "content": {
            "role": "model"
          },
          "finish_reason": "STOP",
          "index": 0
        }
      ],
      "usage_metadata": {
        "prompt_token_count": 46,
        "total_token_count": 46
      },
      "model_version": "gemini-2.5-pro"
    }),
)
DEBUG: GeminiProvider.get_chat_response_stream: Chunk 0 has no parts. Skipping. Full chunk: response:...
DEBUG: GeminiProvider.get_chat_response_stream: Chunk 0 candidate finish reason: STOP
DEBUG: GeminiProvider.get_chat_response_stream: Chunk 0 candidate safety ratings: []
--- GEMINI WARNING: The model gave no response. Finish reason: STOP (handled). ---
DEBUG: GeminiProvider.get_chat_response_stream: Streaming response iteration completed.
--- DEBUG: Finished streaming response for Gemini model 'gemini-2.5-pro' ---
Attempting to generate a title with model: gemini-2.5-flash-lite
DEBUG: GeminiProvider.__init__: Gemini API configured successfully.

--- DEBUG: Starting non-streaming response for Gemini model 'gemini-2.5-flash-lite' ---
DEBUG: GeminiProvider.get_chat_response: Config: {
  "model": "gemini-2.5-flash-lite",
  "temperature": 1.0,
  "top_p": 0.95,
  "max_output_tokens": 150
}
DEBUG: GeminiProvider.get_chat_response: Raw messages count: 1
DEBUG: GeminiProvider.get_chat_response: System Prompt: None
DEBUG: GeminiProvider._prepare_messages: Preparing 1 raw messages for Gemini API.
DEBUG: GeminiProvider._prepare_messages: Message 0 part 0 is text.
DEBUG: GeminiProvider._prepare_messages: Finished. Prepared 1 messages for Gemini API.
DEBUG: GeminiProvider.get_chat_response: Prepared messages count: 1. First message: {
  "role": "user",
  "parts": [
    "\u0421\u043e\u0437\u0434\u0430\u0439 \u043e\u0447\u0435\u043d\u044c \u043a\u043e\u0440\u043e\u0442\u043a\u043e\u0435, \u043b\u0430\u043a\u043e\u043d\u0438\u0447\u043d\u043e\u0435 \u043d\u0430\u0437\u0432\u0430\u043d\u0438\u0435 \u0434\u043b\u044f \u0447\u0430\u0442\u0430 (3-5 \u0441\u043b\u043e\u0432), \u043e\u0441\u043d\u043e\u0432\u0430\u043d\u043d\u043e\u0435 \u043d\u0430 \u0441\u043b\u0435\u0434\u0443\u044e\u0449\u0435\u043c \u0437\u0430\u043f\u0440\u043e\u0441\u0435 \u043f\u043e\u043b\u044c\u0437\u043e\u0432\u0430\u0442\u0435\u043b\u044f. \u041e\u0442\u0432\u0435\u0442\u044c \u0422\u041e\u041b\u042C\u041a\u041e \u043d\u0430\u0437\u0432\u0430\u043d\u0438\u0435\u043c, \u0431\u0435\u0437 \u043a\u0430\u0432\u044b\u0447\u0435\u043a \u0438 \u043b\u0438\u0448\u043d\u0438\u0445 \u0441\u043b\u043e\u0432.\n\n\u0417\u0410\u041f\u0420\u041e\u0421 \u041f\u041e\u041b\u042C\u0417\u041e\u0412\u0410\u0422\u0415\u041b\u042F: \"\u041f\u0440\u0438\u0432\u0435\u0442!\"\n\u0422\u0412\u041e\u0419 \u041e\u0422\u0412\u0415\u0422:"
  ]
}
DEBUG: GeminiProvider.get_chat_response: Generation Config: GenerationConfig(candidate_count=None, stop_sequences=None, max_output_tokens=None, temperature=1.0, top_p=0.95, top_k=None, response_mime_type=None, response_schema=None, presence_penalty=None, frequency_penalty=None)
DEBUG: GeminiProvider.get_chat_response: Safety Settings: {'HARM_CATEGORY_HARASSMENT': 'BLOCK_NONE', 'HARM_CATEGORY_HATE_SPEECH': 'BLOCK_NONE', 'HARM_CATEGORY_SEXUALLY_EXPLICIT': 'BLOCK_NONE', 'HARM_CATEGORY_DANGEROUS_CONTENT': 'BLOCK_NONE'}
DEBUG: GeminiProvider.get_chat_response: Request Options: {'timeout': 120}
DEBUG: GeminiProvider.get_chat_response: Calling model.generate_content(stream=False)...
DEBUG: GeminiProvider.get_chat_response: generate_content (non-streaming) call completed. Full response object: response:
GenerateContentResponse(
    done=True,
    iterator=None,
    result=protos.GenerateContentResponse({
      "candidates": [
        {
          "content": {
            "parts": [
              {
                "text": "\u041f\u0440\u0438\u0432\u0435\u0442\u0441\u0442\u0432\u0438\u0435"
              }
            ],
            "role": "model"
          },
          "finish_reason": "STOP",
          "index": 0
        }
      ],
      "usage_metadata": {
        "prompt_token_count": 70,
        "candidates_token_count": 2,
        "total_token_count": 72
      },
      "model_version": "gemini-2.5-flash-lite"
    }),
)
DEBUG: GeminiProvider.get_chat_response: Received non-empty response. Text length: 11. Full response text: 'Приветствие...'
--- DEBUG: Finished non-streaming response for Gemini model 'gemini-2.5-flash-lite' ---
Title generated successfully: 'Приветствие'
Chat saved to D:\DATA\FletChat Development V2 (New TTS)\chats\Приветствие.json

Here is the latest log, showing gemini-2.5-pro failing with an immediate StopIteration even when system_prompt is explicitly None:

--- DEBUG: Starting streaming response for Gemini model 'gemini-2.5-pro' ---
DEBUG: GeminiProvider.get_chat_response_stream: Config: {
  "api_key": "ХХХХХХХХХ",
  "model": "gemini-2.5-pro",
  "temperature": 1.0,
  "top_p": 0.95,
  "selected_prompt": "\u0414\u0430\u0448\u0430.txt"
}
DEBUG: GeminiProvider.get_chat_response_stream: Raw messages count: 1
DEBUG: GeminiProvider.get_chat_response_stream: System Prompt: None
DEBUG: GeminiProvider._prepare_messages: Preparing 1 raw messages for Gemini API.
DEBUG: GeminiProvider._prepare_messages: Message 0 part 0 is text.
DEBUG: GeminiProvider._prepare_messages: Finished. Prepared 1 messages for Gemini API.
DEBUG: GeminiProvider.get_chat_response_stream: Prepared messages count: 1. First message: {
  "role": "user",
  "parts": [
    "\u041f\u0440\u0438\u0432\u0435\u0442!"
  ]
}
DEBUG: GeminiProvider.get_chat_response_stream: Generation Config: GenerationConfig(candidate_count=None, stop_sequences=None, max_output_tokens=None, temperature=1.0, top_p=0.95, top_k=None, response_mime_type=None, response_schema=None, presence_penalty=None, frequency_penalty=None)
DEBUG: GeminiProvider.get_chat_response_stream: Safety Settings: {'HARM_CATEGORY_HARASSMENT': 'BLOCK_NONE', 'HARM_CATEGORY_HATE_SPEECH': 'BLOCK_NONE', 'HARM_CATEGORY_SEXUALLY_EXPLICIT': 'BLOCK_NONE', 'HARM_CATEGORY_DANGEROUS_CONTENT': 'BLOCK_NONE'}
DEBUG: GeminiProvider.get_chat_response_stream: Calling model.generate_content(stream=True)...
--- CRITICAL ERROR INSIDE GEMINI API CLIENT (streaming) ---
Traceback (most recent call last):
  File "D:\DATA\FletChat Development V2 (New TTS)\ai_client.py", line 130, in get_chat_response_stream
    response_stream = model.generate_content(
                    ^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\DATA\FletChat Development V2 (New TTS)\python_embeded\Lib\site-packages\google\generativeai\generative_models.py", line 329, in generate_content
    return generation_types.GenerateContentResponse.from_iterator(iterator)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\DATA\FletChat Development V2 (New TTS)\python_embeded\Lib\site-packages\google\generativeai\types\generation_types.py", line 634, in from_iterator
    response = next(iterator)
               ^^^^^^^^^^^^^^
  File "D:\DATA\FletChat Development V2 (New TTS)\python_embeded\Lib\site-packages\google\api_core\grpc_helpers.py", line 116, in __next__
    return next(self._wrapped)
           ^^^^^^^^^^^^^^^^^^^
  File "D:\DATA\FletChat Development V2 (New TTS)\python_embeded\Lib\site-packages\grpc\_channel.py", line 543, in __next__
    return self._next()
           ^^^^^^^^^^^^
  File "D:\DATA\FletChat Development V2 (New TTS)\python_embeded\Lib\site-packages\grpc\_channel.py", line 950, in _next
    raise StopIteration()
StopIteration
Error type: StopIteration, Message:
--- DEBUG: Finished streaming response for Gemini model 'gemini-2.5-pro' ---
Provider error: Gemini API error (streaming):

I kindly ask Google representatives to respond.

What, specifically, is causing the model to fail?
Should we expect improvements, or should we look for a replacement and revise our implementation plans?

We’re seeing this constantly too. I had to put a retry loop around every API call that checks whether the response is empty and resubmits the request until it isn’t. This is very clearly broken and needs more attention.
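
Roughly like this (a sketch; the attempt count and backoff here are illustrative):

```python
import time

def generate_with_retry(model, contents, max_attempts=5, backoff_s=2.0):
    """Retry generate_content until the response actually contains text."""
    for attempt in range(1, max_attempts + 1):
        response = model.generate_content(contents)
        candidate = response.candidates[0] if response.candidates else None
        if candidate and candidate.content.parts:
            return response
        time.sleep(backoff_s * attempt)  # simple linear backoff
    raise RuntimeError("model kept returning empty responses")
```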

I have a feeling they’re only doing this for free tier accounts.

The situation is exactly the same for projects linked to billing.

The problem seems completely random; everyone suggests a different fix, from trimming system prompts to disabling function calls, et cetera.

The only option guaranteed to work here is migrating to Vertex AI or a gateway like OpenRouter. We did, and we now get not only zero errors but also higher TPS on the model.

I've had this problem for about two months. Moving to Vertex AI fixed it, even though I had to rewrite my codebase, since Vertex AI doesn't support some features, such as the File API.
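
For anyone making the same move, the equivalent call on Vertex AI looks roughly like this (a minimal sketch with the google-cloud-aiplatform SDK; project and location are placeholders):

```python
import vertexai
from vertexai.generative_models import GenerativeModel

vertexai.init(project="your-project-id", location="us-central1")

model = GenerativeModel(
    "gemini-2.5-pro",
    system_instruction="Ты умный и честный ассистент. ...",
)
response = model.generate_content("Привет!")
print(response.text)
```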

Yep, I can confirm this too.

The gemini-2.5-pro model is returning None. This issue occurred before too, but the model would return output within 3 retries; today it returns None consistently, even after several retries (5-6 times).
model usage {'prompt_token_count': 17083, 'candidates_token_count': None, 'total_token_count': 17157}
[get_niosh_inputs] attempt 1: got non-string raw=None, retrying...
video mime type: video/mp4
model usage {'prompt_token_count': 17083, 'candidates_token_count': None, 'total_token_count': 17161}
attempt 2: got non-string raw=None, retrying...