I have been encountering a critical and consistent bug when using Claude models (including claude-opus-4-6-thinking) through your service.
Issue description
Every single time the assistant finishes its thinking process, the request fails with:
HTTP 400 Bad Request
Error message:
messages.XX: The final block in an assistant message cannot be `thinking`.
(type: invalid_request_error)
This error has appeared repeatedly at different message indices (e.g. messages.19 and messages.37). The backend appears to be constructing the assistant message so that it ends with a thinking block, which violates Anthropic’s API rules.
Impact
- I have received zero successful outputs for any task.
- I have completely wasted an entire week’s quota/credits.
- The service is currently unusable for any work that involves thinking mode.
Reproduction steps
- Start a new conversation with a Claude model that supports thinking.
- Send any prompt that triggers the assistant to think.
- As soon as thinking completes, the request fails with the above 400 error.
- This happens 100% of the time.
Additional details
- Error type: INVALID_ARGUMENT
- Server: ESF (Google)
- Example TraceIDs (for your internal debugging):
  - 0x298db6
  - 0x93c686
This looks like a backend message formatting bug in how Google wraps or passes the thinking block to the Claude API. It should be a quick fix on your side (just ensure the final block of any assistant message is never thinking).
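To illustrate the kind of check I mean, here is a minimal Python sketch. I have no visibility into your backend, so everything here is an assumption: the function names `find_trailing_thinking` and `strip_trailing_thinking` are hypothetical, and I am assuming messages are plain dicts with a `role` and a list of content blocks that each carry a `type` field. Dropping an assistant message that contains nothing but thinking is also just one possible policy, and it may not be the right one for your pipeline.

```python
from typing import Any

Message = dict[str, Any]


def find_trailing_thinking(messages: list[Message]) -> list[int]:
    """Return indices of assistant messages whose final block is `thinking`."""
    bad = []
    for index, message in enumerate(messages):
        content = message.get("content")
        if message.get("role") != "assistant" or not isinstance(content, list):
            continue  # string content or non-assistant roles cannot trigger the error
        if content and content[-1].get("type") == "thinking":
            bad.append(index)
    return bad


def strip_trailing_thinking(messages: list[Message]) -> list[Message]:
    """Return a copy in which no assistant message ends with a `thinking` block."""
    cleaned = []
    for message in messages:
        content = message.get("content")
        if message.get("role") != "assistant" or not isinstance(content, list):
            cleaned.append(message)
            continue
        blocks = list(content)  # copy so the caller's history is not mutated
        while blocks and blocks[-1].get("type") == "thinking":
            blocks.pop()
        if blocks:
            cleaned.append({**message, "content": blocks})
        # Assumption: a message left with no blocks is dropped entirely.
    return cleaned
```

Running `find_trailing_thinking` on the outgoing history before each request should return the same indices that show up in the error (the XX in messages.XX), which might help confirm where the malformed message is produced.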
Could you please investigate and deploy a fix as soon as possible? This is completely blocking my work.
Thank you!
