Gemini-2.5-flash generates infinite token sequences

For quite some time, I’ve noticed that the Gemini 2.5 Flash series (both Flash and Flash Lite) tends to generate unusually long token sequences — sometimes hitting the maximum output token limit, other times seemingly running forever.

  • LangSmith trace:

  • My parameters:

    from langchain_google_genai import ChatGoogleGenerativeAI

    llm = ChatGoogleGenerativeAI(
        model=model.model_name,
        response_mime_type="application/json",  # structured JSON output
        thinking_budget=0,  # disable thinking for fast replies
    )
    
    • thinking_budget=0 for fast, straightforward replies
    • temperature left to default (around 0.7, I assume)

Hi @codeonym, thanks for reaching out!

Could you please let us know what you are trying to achieve?

Is this always happening, or only when you send specific prompts?

If possible, can you provide steps to reproduce so I can try on my end?

Hi there @Srikanta_K_N, sure thing. Here is a link to the LangSmith trace for debugging: LangSmith (I’ve included only the relevant part)

Q:

Could you please let us know what you are trying to achieve?

A:

It’s a data refinement workflow (MD/HTML artifact) using structured output.

Q:

Is this always happening, or only when you send specific prompts?

A:

Yes, almost every time when refining the HTML/MD artifact.

@Srikanta_K_N here is another trace for debugging (max tokens reached): LangSmith

I’m also having this issue. I’ve found that 2.5 Flash starts generating an endless run of \n or \t characters when trying to produce non-English characters in a structured output. My understanding is that this was verified, reproduced, and fixed in gemini-2.5-flash-preview-09-2025. I’ve tried it and it seems to work fine.
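For anyone hitting the same failure mode, a minimal heuristic to flag affected responses might look like this (the 50-character threshold is my own assumption, not a documented limit):

```python
import re

# Heuristic for the runaway-whitespace failure mode described above:
# a long uninterrupted run of "\n" or "\t" in the model output.
RUNAWAY_RE = re.compile(r"[\n\t]{50,}")

def looks_runaway(text: str) -> bool:
    """Return True if the output contains a suspiciously long \\n/\\t run."""
    return bool(RUNAWAY_RE.search(text))
```

This lets you reject and retry the affected responses instead of storing corrupted artifacts.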

However, gemini-2.5-flash-preview-09-2025 does not respect thinking_budget=0 in combination with structured output. This has also been reported by several users.

The only idea I have left on how to handle this is to add stop sequences for the most common cases, and try to rerun these requests at a later point when this is fixed.
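As a sketch of that fallback: wrap the model call, check for the runaway pattern, and retry a bounded number of times. Here `invoke_with_guard`, the retry count, and the 20-character run threshold are all hypothetical choices; `call` would be e.g. a lambda around `llm.invoke(messages, stop=[...])` with your stop sequences bound:

```python
import time
from typing import Callable

def invoke_with_guard(call: Callable[[], str],
                      max_retries: int = 2,
                      delay_s: float = 0.0) -> str:
    """Retry a model call whose output looks like the runaway-whitespace bug.

    `call` is any zero-arg function returning the model's text output.
    The retry policy and the run-length heuristic are assumptions.
    """
    for _ in range(max_retries + 1):
        text = call()
        # Treat a long run of \n or \t as the failure mode and retry.
        if "\n" * 20 not in text and "\t" * 20 not in text:
            return text
        time.sleep(delay_s)
    raise RuntimeError("model kept producing runaway whitespace")
```

Requests that still fail after the retries can then be queued and rerun later, once the underlying model bug is fixed.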


I’ve switched to the preview model and I haven’t encountered that error yet. Thanks for pointing that out!