Bug: Trajectory permanently poisoned by thinking-terminated assistant turn after stream timeout — every subsequent Claude request fails with HTTP 400

Tags: bug, api, models, claude, thinking

Severity

P1 — Blocker. A single interrupted generation permanently breaks the trajectory. The user cannot continue without manually reverting history, and quota is consumed on every failed retry.

Summary

When a Claude (extended-thinking) generation in Antigravity is interrupted by the upstream Vertex AI stream timeout — consistently observed around the 120-second mark, during the final output phase — the assistant turn is persisted to the trajectory with a thinking content block as its final block. On the next user message, the malformed history is replayed to Anthropic’s API, which rejects it per its public message-format contract:

messages.NN: The final block in an assistant message cannot be `thinking`.

This 400 is not transient. Every subsequent turn in the same trajectory fails identically. Switching models, rewording the follow-up prompt, and clicking Retry all reproduce the failure because the poisoned assistant turn remains in history.
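
Schematically, the replayed history that triggers the rejection ends with an assistant turn whose final content block is `thinking`. A simplified sketch of that shape (all field values here are illustrative placeholders, not data from the actual trajectory):

```python
# Simplified sketch of the poisoned history as replayed to the Messages API.
# Block contents and the signature field are illustrative placeholders.
poisoned_history = [
    {"role": "user", "content": [{"type": "text", "text": "Refactor these files..."}]},
    {
        "role": "assistant",
        "content": [
            # The stream was cut after thinking completed but before the
            # trailing text block started, so thinking is the final block.
            {"type": "thinking",
             "thinking": "Plan: rewrite module A, then module B...",
             "signature": "..."},
        ],
    },
    {"role": "user", "content": [{"type": "text", "text": "Continue."}]},
]

last_block = poisoned_history[1]["content"][-1]
# This is exactly the shape the API rejects with:
# "messages.NN: The final block in an assistant message cannot be `thinking`."
print(last_block["type"])  # thinking
```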

Reproduction (100% reliable)

  1. Open a new trajectory with any thinking-enabled Claude model (e.g. claude-sonnet-4-6-thinking, claude-opus-4-6-thinking).
  2. Send a prompt that triggers a long generation (large file write, multi-file refactor) such that the assistant runs longer than ~120 seconds.
  3. Observe the stream drop during the final output phase, after thinking has completed.
  4. Send any follow-up message in the same trajectory.
  5. Request fails with HTTP 400 / INVALID_ARGUMENT. Every further turn fails the same way.
  6. Note that switching models or editing the follow-up prompt does not recover the trajectory; only reverting history past the interrupted turn does.

Sample failure

  • Trajectory ID: `b3185721-c22f-49a0-bcc3-72ffb0bc`
  • Trace ID: `0x8c48d7a167d61`
  • Anthropic request ID: `req_vrtx_011CasDFkGkYSMVhWAh5iNhN`
  • Timestamp: Sat, 09 May 2026 15:13:31 GMT
  • Failing message index: messages.33
  • Server: ESF (Google)
{
  "error": {
    "code": 400,
    "message": "{\"type\":\"error\",\"error\":{\"type\":\"invalid_request_error\",\"message\":\"messages.33: The final block in an assistant message cannot be `thinking`.\"},\"request_id\":\"req_vrtx_011CasDFkGkYSMVhWAh5iNhN\"}",
    "status": "INVALID_ARGUMENT"
  }
}

Root cause (hypothesis)

Antigravity persists assistant turns to the trajectory store before validating the message-block invariant required by the Anthropic Messages API: the terminal content block of every assistant message must be `text` or `tool_use`, and is explicitly forbidden from being `thinking`.

When the upstream Vertex AI stream is cut mid-generation — almost always between the `thinking` block and the trailing `text` block, since that gap is where most wall-clock time is spent on long outputs — the partial assistant turn is committed as-is, with `thinking` as its last block. On the next request, the full history is replayed to Anthropic, which correctly rejects it.

The bug is in Antigravity’s serialization/persistence layer, not in the model, the Anthropic API, or Vertex AI’s transport. Once the malformed turn is in the trajectory, the trajectory is dead.
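
If this hypothesis is right, the missing check is small. A minimal sketch of the invariant, assuming a message is a dict with a `content` list of typed blocks (the function name and this representation are hypothetical, not Antigravity internals):

```python
# Terminal block types the Anthropic Messages API accepts on assistant turns.
VALID_TERMINAL_TYPES = {"text", "tool_use"}

def assistant_turn_satisfies_invariant(message: dict) -> bool:
    """Check that an assistant message's final content block is text or
    tool_use, never thinking. Non-assistant messages pass trivially."""
    if message.get("role") != "assistant":
        return True
    blocks = message.get("content") or []
    if not blocks:
        return False  # an empty assistant turn is also malformed
    return blocks[-1].get("type") in VALID_TERMINAL_TYPES

# An interrupted turn that ends in a thinking block fails the check:
interrupted = {"role": "assistant",
               "content": [{"type": "thinking", "thinking": "..."}]}
print(assistant_turn_satisfies_invariant(interrupted))  # False
```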

Suggested fix

On stream interruption, before committing the assistant turn to the trajectory store, take one of:

  1. Strip any trailing `thinking` block from the turn.
  2. Append a synthetic placeholder `text` block (`{"type":"text","text":"[generation interrupted]"}`) to satisfy the invariant.
  3. Mark the turn as failed and exclude it from the next outbound request payload.
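
Options 1 and 2 combine naturally into a single repair step at commit time. A sketch, assuming a message is a dict with a `content` list of typed blocks (the function name and this representation are hypothetical, not Antigravity internals):

```python
def repair_interrupted_turn(message: dict) -> dict:
    """Make an interrupted assistant turn satisfy the Messages API invariant
    before it is committed: drop trailing thinking blocks, and if nothing
    remains, substitute a placeholder text block so the turn is never empty."""
    blocks = list(message.get("content") or [])
    # Option 1: strip thinking blocks left dangling by the cut stream.
    while blocks and blocks[-1].get("type") == "thinking":
        blocks.pop()
    # Option 2: if stripping emptied the turn, append a placeholder text block.
    if not blocks:
        blocks.append({"type": "text", "text": "[generation interrupted]"})
    return {**message, "content": blocks}

interrupted = {"role": "assistant",
               "content": [{"type": "thinking", "thinking": "..."}]}
repaired = repair_interrupted_turn(interrupted)
print(repaired["content"][-1]["type"])  # text
```

Running this at store-ingest time rather than at request-build time means the persisted trajectory itself is always valid, not just the outbound payload.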

Validating message structure on the way into the trajectory store (rather than on the way out to Anthropic) eliminates the entire class of failures, including related ones listed below.

Related reports

Current workaround

Manually revert the trajectory to a checkpoint before the interrupted turn, or abandon the trajectory and start a new one. Both options discard accumulated context and waste quota already spent on the broken turn.

Impact

  • Every long-running task on a thinking-enabled Claude model is at risk of permanently poisoning its trajectory.
  • Once poisoned, the trajectory consumes quota on every retry without producing output.
  • The bug is deterministic above ~120 s of generation time, making Antigravity unusable for the workloads thinking models are most useful for (large refactors, long-form generation, multi-step agent runs).