Gemini-2.0-flash intelligence regression (and parallel tools?)

Over the last 48 hours or so we’ve noticed a significant regression in our app’s “decision making” when using gemini-2.0-flash.

As with all things AI this is hard to quantify, but here is specifically what we have observed:

  1. Prior to 2025-05-03, we had never seen gemini-2.0-flash select multiple tools. After that date, and with no code changes on our end, it began selecting multiple tools (it’s possible this was occurring over the weekend too and we missed it).
  2. At the same time, the model became less attuned to its system prompts.
  3. Its decision making for tool calls is significantly degraded, often selecting conflicting tools or violating its system prompts when performing the selection.
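
For anyone who wants to check their own logs for the multi-tool behavior in item 1, here is a minimal sketch. It assumes responses are stored as the raw REST JSON, where each tool request appears as a `functionCall` entry under `candidates[].content.parts[]`. The helper names and the tool names in the sample are hypothetical, not taken from our app.

```python
from typing import Any


def tool_call_names(response: dict[str, Any], candidate_index: int = 0) -> list[str]:
    """Return the names of all tools requested by one candidate of a response.

    Assumes the REST JSON layout: candidates[i].content.parts[j].functionCall.
    """
    candidates = response.get("candidates") or []
    if candidate_index >= len(candidates):
        return []
    content = candidates[candidate_index].get("content") or {}
    names = []
    for part in content.get("parts") or []:
        call = part.get("functionCall")
        if call:  # text-only parts have no functionCall key
            names.append(call.get("name", "<unnamed>"))
    return names


def has_parallel_calls(response: dict[str, Any]) -> bool:
    """True when the first candidate requests more than one tool in a single turn."""
    return len(tool_call_names(response)) > 1


# Hypothetical logged response with two tool requests in one turn.
sample = {
    "candidates": [
        {
            "content": {
                "parts": [
                    {"functionCall": {"name": "search_orders", "args": {}}},
                    {"functionCall": {"name": "cancel_order", "args": {}}},
                ]
            }
        }
    ]
}

print(tool_call_names(sample), has_parallel_calls(sample))
```

Running something like this over logged responses grouped by date would give a rough count of how often multiple tools are selected before and after 2025-05-03.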

The regression is significant enough that we’ve been forced to switch the production app to OpenAI’s models. We’re looking for clarity because we’d prefer to use 2.0-flash if possible.

Is there any way for us to select a stable sub-version of the model to prevent these regressions in production apps? I see that Vertex AI offers one, but it doesn’t look like a new sub-version was published there either.