Hey Logan and team, I don’t think this is actually fully resolved. We’re still seeing a variant of this problem in production with Gemini 3 Flash as of March 2026.
We run an AI automation platform that uses the Vercel AI SDK v6 with Gemini 3 Flash via the AI Gateway. We have a browser automation agent that uses multi-step tool calling (up to 99 steps for complex tasks like scraping ticket info from websites). The agent calls tools like navigate, click, screenshot, eval, etc.
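For reference, the agent wiring is roughly the following. This is a stubbed, self-contained sketch (the real code uses `streamText`, `stepCountIs`, and `tool` from the SDK; the model id string is just how we address the model through the Gateway, and the stop-condition shape here is a stand-in):

```typescript
// Stubbed sketch of our agent setup; in production these shapes come from
// the Vercel AI SDK (streamText / stepCountIs / tool), not local types.
type Tool = { description: string; execute: (input: any) => Promise<unknown> };

const tools: Record<string, Tool> = {
  navigate: {
    description: 'Navigate the browser to a URL',
    execute: async ({ url }: { url: string }) => ({ ok: true, url }),
  },
  screenshot: {
    description: 'Capture the current page state',
    execute: async () => ({ image: '<base64>' }),
  },
  // click, eval, get_text ... omitted for brevity
};

const agentConfig = {
  model: 'google/gemini-3-flash', // AI Gateway model id as we configure it
  stopWhen: { maxSteps: 99 },     // we cap complex scraping tasks at 99 steps
  tools,
};
```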
What we’re seeing is that Gemini 3 Flash dumps its entire internal reasoning process as visible text output between tool calls. After taking a screenshot, instead of just making the next tool call, the model writes things like:
“The screenshot shows a large Filters pane on the right that’s blocking the view. I need to close the Filters pane or click View 47 Listings again. Wait, looking at the center: Showing 6 of 47 and a Show more button. Let’s try to extract from the whole page again with a more robust script. Actually, let’s just use get_text on the main container…”
This goes on for paragraphs. It also writes JavaScript code as visible text before putting the same code into an eval tool call. And sometimes it gets stuck in loops, repeating “wait, I’ll click E41, actually I just run the URL” over and over.
We checked the AI SDK source and confirmed that sendReasoning defaults to true and that reasoning parts are streamed as separate reasoning-start/delta/end events. The UI handles those properly with a collapsible component. But this reasoning text is NOT coming through as reasoning parts; it's coming through as regular text-delta events and gets stored as type: "text" in the database.
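To verify this, we bucket the incoming stream parts by type. A minimal version of our diagnostic (part shapes are simplified from the AI SDK UI message stream protocol as we understand it):

```typescript
// Minimal stream-part bucketing we use to diagnose where the monologue lands.
// Part type names follow the AI SDK UI stream protocol
// (reasoning-start/-delta/-end vs. text-start/-delta/-end).
type StreamPart =
  | { type: 'text-delta'; delta: string }
  | { type: 'reasoning-delta'; delta: string }
  | { type: 'tool-call'; toolName: string };

function bucketParts(parts: StreamPart[]) {
  const buckets = { text: '', reasoning: '', toolCalls: 0 };
  for (const p of parts) {
    if (p.type === 'text-delta') buckets.text += p.delta;
    else if (p.type === 'reasoning-delta') buckets.reasoning += p.delta;
    else buckets.toolCalls += 1;
  }
  return buckets;
}

// What we observe in production: the internal monologue arrives as
// text-delta, so it lands in the "text" bucket and is persisted as
// type: "text", while the reasoning bucket stays empty.
const observed = bucketParts([
  { type: 'text-delta', delta: 'The screenshot shows a Filters pane...' },
  { type: 'tool-call', toolName: 'click' },
]);
// observed.reasoning === '' even though the content is clearly reasoning
```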
We also looked into setting thinkingConfig: { includeThoughts: true }, but from what we can tell from the forum threads, that flag is purely observational and doesn't change the model's behavior: it would still emit reasoning as plain text regardless.
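For completeness, this is the shape we tried, passed through providerOptions as we understand the Google provider config (and which, per the forums, only surfaces thought summaries rather than suppressing the leaked text):

```typescript
// Provider options we experimented with. Per the forum discussion,
// includeThoughts only exposes thought summaries as separate parts;
// it does not stop the model from emitting reasoning as plain text
// between tool calls, which is the bug we're reporting.
const providerOptions = {
  google: {
    thinkingConfig: {
      includeThoughts: true,
      // thinkingBudget: 1024, // optionally caps thinking tokens
    },
  },
};
```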
This is really painful because in a 55-step browser session, the user’s chat gets flooded with the model’s internal monologue. We are not setting functionCallingConfig explicitly since we go through the AI Gateway, so the patch from @AmrAbuElyazid shouldn’t apply here.
Is there any update on the model separating its reasoning from its text output properly during multi-step function calling? This feels like the same fundamental issue, just a different symptom.