finishReason STOP but parts is missing inside candidate

I’m experiencing a persistent issue with Gemini 2.5-Pro where the API returns HTTP 200 OK responses with finishReason: "STOP" but the content.parts array is completely missing, resulting in no usable output.

Problem Details:

  • Model: gemini-2.5-pro
  • SDK: @google/genai v1.7.0
  • Frequency: Occurs very frequently (70-80% of requests) for almost 2 weeks now
  • Context: Multi-modal requests with documents + text prompts

Example Response:

{
  "candidates": [
    {
      "content": {
        "role": "model"
      },
      "finishReason": "STOP",
      "index": 0,
      "safetyRatings": [
        { "category": "HARM_CATEGORY_SEXUALLY_EXPLICIT", "probability": "NEGLIGIBLE" },
        { "category": "HARM_CATEGORY_HATE_SPEECH", "probability": "NEGLIGIBLE" },
        { "category": "HARM_CATEGORY_HARASSMENT", "probability": "NEGLIGIBLE" },
        { "category": "HARM_CATEGORY_DANGEROUS_CONTENT", "probability": "NEGLIGIBLE" }
      ]
    }
  ],
  "modelVersion": "gemini-2.5-pro",
  "usageMetadata": {
    "promptTokenCount": 12875,
    "totalTokenCount": 12888,
    "promptTokensDetails": [
      { "modality": "TEXT", "tokenCount": 5651 },
      { "modality": "DOCUMENT", "tokenCount": 7224 }
    ],
    "thoughtsTokenCount": 13
  }
}

Observations:

  1. Safety ratings are all “NEGLIGIBLE” - so it’s not a content filtering issue
  2. finishReason is “STOP” - indicating normal completion, not truncation
  3. Token usage looks normal - around 12k tokens, well within limits
  4. thoughtsTokenCount present - model is “thinking” but not outputting, also it’s very low
  5. Same prompt works occasionally - suggesting intermittent issue, not prompt problem

What I’ve tried:

  • Disabled all safety settings (BLOCK_NONE)
  • Adjusted thinkingConfig with different thinkingBudget values (0, -1, 1000)
  • Modified generation parameters (temperature, topP, topK)
  • Set explicit maxOutputTokens
  • Tested the same prompts in Google AI Studio (they work inconsistently there too; sometimes I get “You’ve reached your rate limit. Please try again later.” even though I’m on Paid Tier 1)

Request Configuration:

temperature: 0.3,
topP: 0.95,
topK: 40,
candidateCount: 1,
safetySettings: [/* all set to BLOCK_NONE */]

This issue has been happening consistently for about 2 weeks across different types of content and different prompts. The same prompts work fine with gemini-2.5-flash, but we need the reasoning capabilities of the Pro model.

Is this a known issue with Gemini 2.5-Pro? Are there any recommended workarounds or configurations that might help ensure consistent content generation?
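In the meantime, a defensive check plus retry is how we limp along. A minimal sketch of it below; note that `hasTextParts` and `retryOnEmpty` are my own helpers, not part of @google/genai, and only the response shape comes from the API examples above:

```typescript
// Detect the "finishReason: STOP but no parts" responses and retry the call.
// The response shape mirrors the raw JSON the API returns; helpers are mine.

interface GenerateResponse {
  candidates?: Array<{
    content?: { role?: string; parts?: Array<{ text?: string }> };
    finishReason?: string;
  }>;
}

// True only when the first candidate actually carries at least one text part.
function hasTextParts(resp: GenerateResponse): boolean {
  const parts = resp.candidates?.[0]?.content?.parts;
  return Array.isArray(parts) && parts.some(p => typeof p.text === "string");
}

// Generic retry wrapper with exponential backoff; `call` wraps whatever
// SDK method you use (e.g. ai.models.generateContent under the hood).
async function retryOnEmpty(
  call: () => Promise<GenerateResponse>,
  maxAttempts = 3,
): Promise<GenerateResponse> {
  let last: GenerateResponse = {};
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    last = await call();
    if (hasTextParts(last)) return last;
    // Back off before retrying: 500ms, 1s, 2s, ...
    await new Promise(r => setTimeout(r, 500 * 2 ** attempt));
  }
  return last; // still empty after all attempts; caller must handle it
}
```

This obviously doesn’t fix the root cause, it just reduces how often the empty responses reach our users (and burns extra prompt tokens on every retry).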

UPDATE:
From the latest posts on this forum, it’s clear that this is a widespread issue—and one that keeps getting worse by the day. I want to stress that, like many others, we rely on production software that depends on AI models such as 2.5 Pro. It’s frankly unacceptable that Google’s most stable model has been inconsistent for weeks, when issues of this kind should be resolved within hours at most. The situation is directly affecting our paying customers, who are unable to access the services they’ve purchased.


Same here. No changes to our software or any related settings. It just stopped returning “parts” all of a sudden.


My solution was to migrate everything to the Vertex API. So far, it appears to be functioning correctly. The migration was quite annoying, though, as Vertex doesn’t support the File API, so I implemented Google Cloud Storage instead.
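For anyone doing the same migration, roughly what my request-building side looks like. The helper names, bucket, and project values are placeholders of mine, not the exact SDK surface; the resulting object is what I pass to `ai.models.generateContent()` after constructing the client with `new GoogleGenAI({ vertexai: true, project, location })`:

```typescript
// Sketch: reference a document already uploaded to Google Cloud Storage by
// gs:// URI, instead of using the File API (which Vertex doesn't support).

// Hypothetical helper: a fileData part pointing at a GCS object.
function gcsDocPart(fileUri: string, mimeType: string) {
  if (!fileUri.startsWith("gs://")) {
    throw new Error(`expected a gs:// URI, got ${fileUri}`);
  }
  return { fileData: { fileUri, mimeType } };
}

// Build a multi-modal request: one document part plus one text prompt.
function buildDocRequest(prompt: string, gcsUri: string) {
  return {
    model: "gemini-2.5-pro",
    contents: [
      {
        role: "user",
        parts: [gcsDocPart(gcsUri, "application/pdf"), { text: prompt }],
      },
    ],
  };
}
```

The upload itself is a separate step (I use the Cloud Storage client library); double-check the part shape against your SDK version before relying on this.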

I have been trying to migrate for about two days now; it’s very frustrating.

Hi @Mohamed_Amine @Kirill_Tiutiunnyk @frmusso,

Apologies for the delay in my response.

Just wanted to check if you all are still facing this issue?

I migrated to Vertex AI and everything’s been working fine since then. I’m not sure if the problem still persists on the Gemini APIs.

Thank you


Hello,

I’m still encountering this issue. A few days ago, I started noticing empty content, and it’s been increasing in frequency. (Only on gemini-2.5-x)

I suspect that the likelihood of encountering empty content is higher when the input content is “large” (still not even half of the model context window).

Here’s an example of a response I receive 50% of the time (as I retry when it happens, see cachedContent).

{
  "candidates": [
    {
      "content": {
        "role": "model"
      },
      "finishReason": "STOP"
    }
  ],
  "usageMetadata": {
    "promptTokenCount": 28611,
    "totalTokenCount": 28611,
    "cachedContentTokenCount": 28271,
    "promptTokensDetails": [
      {
        "modality": "TEXT",
        "tokenCount": 28611
      }
    ],
    "cacheTokensDetails": [
      {
        "modality": "TEXT",
        "tokenCount": 28271
      }
    ]
  },
  "turnToken": "v1_Chd***"
}

I’m also having this issue. Here is an example raw response:
{
  "candidates": [
    {
      "index": 0,
      "content": {
        "parts": [
          {
            "text": "{\n  \"questions\": [\n    \"question_number\"\n  ]\n}"
          }
        ],
        "role": "model"
      },
      "finishReason": "STOP"
    }
  ],
  "usageMetadata": {
    "candidatesTokenCount": 19,
    "totalTokenCount": 13524,
    "promptTokensDetails": [
      { "tokenCount": 2655, "modality": "TEXT" },
      { "modality": "DOCUMENT", "tokenCount": 6192 }
    ],
    "thoughtsTokenCount": 4658,
    "promptTokenCount": 8847
  },
  "responseId": "ynhJaey0BaHVz7IPyPWCeA",
  "modelVersion": "gemini-2.5-pro"
}

Note that candidatesTokenCount is 19 - that is the length of the response in tokens, and a previous failed response had exactly the same length. It’s suspicious that it stops at 19 tokens in both cases…
Switching to VertexAI isn’t an option for me, so this is not a valid solution.

Same here; we’ve had thousands of responses come back with no parts over the past few days, on gemini-2.5-flash.

One thing we noticed is that all the failed requests seem to have triggered a grounding search, while the prompts that don’t trigger grounding seem to be working fine.

so it seems like grounding for gemini 2.5 flash has been broken for a few days, and nobody at google noticed??

Hey folks, sorry about the issues here.

@stephanevdj - would you be able to share a grounding query that’s reproducing this issue? I ran a few on my end and wasn’t able to reproduce this.

@Ian_Dunning - are you using structured outputs or function calling in your request? If so, would you be able to share a sample request?

You said this happens only on Gemini 2.5; so it’s not present on the 2.0 and 3.0 models?

Hello,

At that time, the 3.0 models had not been released yet.

And I did not observe a similar issue with the 2.0 models.

I’m a bit surprised that this is still an issue, because I actually managed to get an answer about it from the web version of Gemini.

I accidentally sent this ‘STOP empty Response’ error to the web version (I was actually asking about something else at the time :sweat_smile: ), and Gemini explained that this is a normal occurrence for thinking models. The explanation given was that the reasoning process took too long and was interrupted.

At the same time, it provided me with a solution: when encountering a situation where there are thoughtTokens but no response, you simply need to provide a user input to ‘push’ the bot to generate the final answer (by instructing the bot to ‘continue’ or ‘generate response based on thoughts’).

After I modified my API workflow according to this method, I never had my process interrupted by this issue again.
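For anyone who wants to automate that nudge, here’s a rough sketch of what my workflow does now. The helper names are my own and the trigger condition (`thoughtsTokenCount > 0` with no parts) is just my heuristic based on the responses shown earlier in this thread:

```typescript
// Workaround sketch: when a turn comes back with thought tokens but no
// visible output, append a user message asking the model to continue.

interface Turn { role: "user" | "model"; parts: Array<{ text: string }> }
interface Resp {
  candidates?: Array<{ content?: { parts?: Array<{ text?: string }> } }>;
  usageMetadata?: { thoughtsTokenCount?: number };
}

// Did the model "think" but produce no visible parts?
function isThoughtOnly(resp: Resp): boolean {
  const parts = resp.candidates?.[0]?.content?.parts ?? [];
  return parts.length === 0 && (resp.usageMetadata?.thoughtsTokenCount ?? 0) > 0;
}

// Build the follow-up history that pushes the model to emit its answer.
function withContinueNudge(history: Turn[]): Turn[] {
  return [
    ...history,
    {
      role: "user",
      parts: [{ text: "Continue: generate the final response based on your previous reasoning." }],
    },
  ];
}
```

In my loop, whenever `isThoughtOnly` is true I resend the conversation with the nudge appended instead of retrying the identical request, and so far that has unblocked every stuck turn.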