I’m seeing a reliability regression when using Gemini 2.5 Flash on Vertex AI with batching for audio transcription via the `google-genai` package.
Over the last 1–1.5 months, the model has begun repeatedly emitting the literal token `[unclear]` for unclear or noisy portions of audio. This repetition consumes the full `max_output_tokens` budget before the transcription completes, which results in:

- Truncated or malformed JSON output
- Failure to satisfy the configured `response_schema`
- High overall transcription error rates
This same pipeline was noticeably more stable prior to this timeframe, with significantly fewer `[unclear]` repetitions and successful completion of structured JSON responses.
Setup details:

- Platform: Vertex AI
- Model: Gemini 2.5 Flash
- Task: Audio transcription
- Output format: `application/json`
- Response schema: Enabled (JSON Schema)
- Thinking mode: Enabled
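For reference, a minimal sketch of how each batch item is generated. The project ID, bucket path, schema, and token/thinking budgets below are placeholder values for illustration, not my exact configuration:

```python
from google import genai
from google.genai import types

# Placeholder identifiers for illustration only.
PROJECT = "my-project"
AUDIO_URI = "gs://my-bucket/audio.wav"

# Mirrors the setup above: JSON output, a response schema,
# thinking enabled, and a capped output-token budget.
config = types.GenerateContentConfig(
    response_mime_type="application/json",
    response_schema={
        "type": "OBJECT",
        "properties": {"transcript": {"type": "STRING"}},
        "required": ["transcript"],
    },
    max_output_tokens=8192,
    thinking_config=types.ThinkingConfig(thinking_budget=1024),
)

client = genai.Client(vertexai=True, project=PROJECT, location="us-central1")
response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents=[
        types.Part.from_uri(file_uri=AUDIO_URI, mime_type="audio/wav"),
        "Transcribe this audio.",
    ],
    config=config,
)
print(response.text)  # often truncated mid-JSON when [unclear] repeats
```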
Observed behavior:

- `[unclear]` is emitted repeatedly instead of being minimized or aggregated
- Output tokens are exhausted before the transcription completes
- The JSON response is cut off mid-generation

Increasing `max_output_tokens` reduces but does not fully resolve the issue.
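As a partial workaround I post-process the outputs that do parse, collapsing runs of the marker; a minimal sketch (the exact marker string and spacing are assumptions about the output format, and this only helps when the JSON survives intact):

```python
import re

def collapse_unclear(text: str) -> str:
    # Collapse consecutive "[unclear]" markers, and any whitespace
    # between them, into a single marker.
    return re.sub(r"(?:\[unclear\]\s*){2,}", "[unclear] ", text).strip()

collapse_unclear("word [unclear] [unclear] [unclear] word")
# -> "word [unclear] word"
```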
Expected behavior:

- `[unclear]` output should be limited or aggregated
- The model should prioritize completing a valid JSON response
- Transcription should complete reliably within the token budget
Questions for the community / Google team:

- Has there been a recent change or regression in Gemini 2.5 Flash transcription behavior?
- Is repeated `[unclear]` output expected for this model?
- Are there recommended mitigations (prompting, schema changes, chunking, model choice)?
- Is Gemini 2.5 Flash currently recommended for transcription workloads on Vertex AI?
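On the chunking question: for context, this is the kind of fixed-length split I would try, using only the stdlib `wave` module. The 60-second chunk length is an arbitrary choice, and a production split would probably want overlap so words aren't cut at chunk boundaries:

```python
import wave

def chunk_wav_frames(path: str, chunk_seconds: int = 60) -> list[bytes]:
    """Split a WAV file into fixed-length runs of raw PCM frames."""
    chunks = []
    with wave.open(path, "rb") as wav:
        frames_per_chunk = wav.getframerate() * chunk_seconds
        while True:
            frames = wav.readframes(frames_per_chunk)
            if not frames:
                break
            chunks.append(frames)
    return chunks
```

Each chunk would then be submitted as its own batch item, which keeps any single response well inside the token budget even when `[unclear]` repetition occurs.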