Bug: HTTP 400 "The final block in an assistant message cannot be thinking" – Claude models fail every time thinking ends (quota completely wasted)

I have been encountering a critical and consistent bug when using Claude models (including claude-opus-4-6-thinking) through your service.

Issue description
Every single time the assistant finishes its thinking process, the request fails with:

HTTP 400 Bad Request
Error message:

messages.XX: The final block in an assistant message cannot be `thinking`.

(type: invalid_request_error)

This error has appeared repeatedly on different message indices (e.g. messages.19, messages.37, etc.). The backend is incorrectly constructing the assistant message by ending it with a thinking block, which violates Anthropic’s API rules.

Impact

  • I have received zero successful outputs for any task.

  • I have completely wasted an entire week’s quota/credits.

  • The service is currently unusable for any work that involves thinking mode.

Reproduction steps

  1. Start a new conversation with a Claude model that supports thinking.

  2. Send any prompt that triggers the assistant to think.

  3. As soon as thinking completes, the request fails with the above 400 error.

  4. This happens 100% of the time.

Additional details

  • Error type: INVALID_ARGUMENT

  • Server: ESF (Google)

  • Example TraceIDs (for your internal debugging):

    • 0x298db6

    • 0x93c686

This looks like a backend message formatting bug in how Google wraps or passes the thinking block to the Claude API. It should be a quick fix on your side (just ensure the final block of any assistant message is never thinking).

Could you please investigate and deploy a fix as soon as possible? This is completely blocking my work.

Thank you!