Gemini 2.5 API bug: missing finishReason when max token limit is reached

To replicate:

curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-pro-exp-03-25:generateContent?key=MY_API_KEY" \
  -H 'Content-Type: application/json' \
  -X POST \
  -d '{
    "contents": [
      {
        "parts": [
          {
            "text": "could"
          }
        ]
      }
    ],
    "generationConfig": {
      "temperature": 0.1,
      "maxOutputTokens": 411
    }
  }'

Response:

{
  "usageMetadata": {
    "promptTokenCount": 1,
    "totalTokenCount": 1,
    "promptTokensDetails": [
      {
        "modality": "TEXT",
        "tokenCount": 1
      }
    ]
  },
  "modelVersion": "gemini-2.5-pro-exp-03-25"
}

Note that there is no candidates array at all, and no finishReason anywhere in the response. This is not the behavior you get from Gemini 2.0 Flash with the same request:

{
  "candidates": [
    {
      "content": {
        "parts": [
          {
            "text": "Could you please provide more context? I need to know what you want me to do with the word \"could\". For example, are you asking me to:\n\n* **Define it?** (e.g., \"Could you define 'could'?\")\n* **Use it in a sentence?"
          }
        ],
        "role": "model"
      },
      "finishReason": "MAX_TOKENS",
      "avgLogprobs": -0.13195424001724992
    }
  ],
  "usageMetadata": {
    "promptTokenCount": 1,
    "candidatesTokenCount": 61,
    "totalTokenCount": 62,
    "promptTokensDetails": [
      {
        "modality": "TEXT",
        "tokenCount": 1
      }
    ],
    "candidatesTokensDetails": [
      {
        "modality": "TEXT",
        "tokenCount": 61
      }
    ]
  },
  "modelVersion": "gemini-2.0-flash"
}

With 2.0 Flash, we get a candidate with "finishReason": "MAX_TOKENS", which tells us why the response was truncated or missing. This is easy to parse.
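
Until this is fixed, client code has to defend against the missing array. A minimal check with jq (jq is just a convenient tool here, not part of the API; this assumes the response body was saved to response.json):

finish=$(jq -r '.candidates[0].finishReason // "ABSENT"' response.json)
if [ "$finish" = "ABSENT" ]; then
  echo "No candidate in response - likely hit maxOutputTokens while thinking"
fi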

If I increase the maxOutputTokens value a little bit more, Gemini 2.5 Pro does return a candidate, but the text is truncated after just a few tokens. I’m guessing it spends a few hundred tokens on CoT / reasoning before generating any response tokens, and if the model hits the token limit during this thinking phase, it fails to generate a candidate at all and returns the response above, with nothing for a client to parse.
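
To illustrate, here is a rough probe (a sketch, assuming MY_API_KEY is exported in the environment and jq is installed; the limit values are arbitrary) that sweeps maxOutputTokens to find where a candidate first appears:

for limit in 100 200 400 800 1600; do
  finish=$(curl -s "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-pro-exp-03-25:generateContent?key=$MY_API_KEY" \
    -H 'Content-Type: application/json' \
    -X POST \
    -d "{\"contents\": [{\"parts\": [{\"text\": \"could\"}]}], \"generationConfig\": {\"temperature\": 0.1, \"maxOutputTokens\": $limit}}" \
    | jq -r '.candidates[0].finishReason // "ABSENT"')
  echo "maxOutputTokens=$limit -> finishReason: $finish"
done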

Expected behavior would be something like a candidate with an empty content object and a finishReason of MAX_TOKENS.
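
Something like this (a sketch of the shape I'd expect, not actual API output) would keep the response parseable with the same code path that handles 2.0 Flash:

{
  "candidates": [
    {
      "content": {},
      "finishReason": "MAX_TOKENS"
    }
  ],
  "usageMetadata": { ... },
  "modelVersion": "gemini-2.5-pro-exp-03-25"
}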