Observed Behavior
- Gemini 3 Pro: Unsafe prompts are blocked at the API level
(finishReason: SAFETY), and no tokens are consumed.
- Gemini 3.1 Pro: Unsafe prompts are NOT blocked. Instead, a hidden
safety prompt is injected into the context. The model runs full
inference (including thinking) and generates a natural-language
refusal. All tokens (input + injected safety prompt + thinking +
output) are billed to the user (see the probe sketch below).
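For anyone reproducing this, here is a minimal probe sketch using the
google-genai Python SDK. The model ID strings are assumptions inferred
from the model names above (replace them with the identifiers you
actually tested), and the prompt is a deliberate placeholder.

```python
from google import genai

client = genai.Client()  # reads the API key from the environment

def probe(model_id: str, prompt: str) -> None:
    """Send one prompt; report the finish reason and billed token counts."""
    response = client.models.generate_content(model=model_id, contents=prompt)
    if not response.candidates:
        # Hard-blocked at the API level: no candidate was generated.
        print(f"{model_id}: blocked, reason={response.prompt_feedback.block_reason}")
        return
    usage = response.usage_metadata
    print(f"{model_id}: finish_reason={response.candidates[0].finish_reason}")
    print(f"  prompt tokens:   {usage.prompt_token_count}")
    print(f"  thinking tokens: {usage.thoughts_token_count or 0}")
    print(f"  output tokens:   {usage.candidates_token_count}")
    print(f"  total billed:    {usage.total_token_count}")

# Assumed model IDs, per the report above.
for model_id in ("gemini-3-pro", "gemini-3.1-pro"):
    probe(model_id, "<prompt that trips the safety filter>")
```

On 3 Pro the expected result is finishReason: SAFETY with near-zero
usage; on 3.1 Pro, a normal STOP finish with full prompt, thinking, and
output counts.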
Why This Is a Problem
- Cost: Users pay for inference that produces zero value (a rough
cost sketch follows this list).
- Transparency: The injected safety prompt is invisible to users.
There is no metadata indicating the request was safety-filtered.
- Fairness: With the 3 Pro model, blocked requests cost nothing.
The 3.1 Pro approach silently shifts safety infrastructure costs
to the user's token budget.
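To make the Cost point concrete, a rough sketch of the wasted spend per
safety-redirected request. The per-1M-token rates are placeholders, and
it assumes thinking tokens are billed at the output rate (true for
current Gemini thinking models, but verify for the models above).

```python
INPUT_RATE = 1.25    # placeholder: USD per 1M input tokens
OUTPUT_RATE = 10.00  # placeholder: USD per 1M output tokens

def wasted_usd(usage) -> float:
    """Dollars billed for a redirected (zero-value) response.

    `usage` is a GenerateContentResponse.usage_metadata object; thinking
    tokens are counted at the output rate (assumption, see above).
    """
    thinking = usage.thoughts_token_count or 0
    output = (usage.candidates_token_count or 0) + thinking
    return (usage.prompt_token_count * INPUT_RATE + output * OUTPUT_RATE) / 1_000_000
```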
Suggested Fix
- Option A: Restore hard-blocking for clearly unsafe prompts
(no token consumption).
- Option B: If soft-steering is preferred, add a response field like
safety_prompt_injected: true and DO NOT bill tokens for
safety-redirected responses.
- Option C: At minimum, make the injected safety prompt visible in
the API response metadata so users can audit their costs. A sketch
of Options B and C follows.
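What Options B and C could look like from the client side. Every
safety_* field below is hypothetical, proposed by this report; none of
them exist in the current API.

```python
from google import genai

client = genai.Client()
response = client.models.generate_content(
    model="gemini-3.1-pro",  # assumed model ID, as above
    contents="<prompt that trips the safety filter>",
)

# Option B: an explicit flag, with redirected responses exempt from billing.
if getattr(response, "safety_prompt_injected", False):  # hypothetical field
    # Per the proposal, usage here would be reported but not billed.
    print("redirected; unbilled tokens:", response.usage_metadata.total_token_count)

# Option C: expose the injected text so its token cost can be audited.
injected = getattr(response, "injected_safety_prompt", None)  # hypothetical field
if injected is not None:
    print("injected safety prompt:", injected)
```

Either field would also let billing dashboards and client libraries
separate safety-redirected spend from ordinary usage.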