Overview
I have observed instability in the Gemini API coinciding with the release of new model versions by Google. These disruptions directly impact production applications, especially those relying on features like function calling and low-latency responses.
Issue After Gemini 2.5 Pro Release
After Google released Gemini 2.5 Pro, the function-calling feature in Gemini 2.0 Flash started failing intermittently for around three days. This happened without any changes to the code or app, causing inconsistent behavior that is unacceptable for production environments.
Similar Problem After 2.0 Flash Launch
A similar issue occurred when Gemini 2.0 Flash was introduced:
- Apps using Gemini 1.5 Pro went from millisecond response times to 15+ seconds for the same input.
- This lasted about two days and resolved on its own, again without code changes.
This pattern suggests that new model rollouts are impacting older models, even if they’re still actively in use.
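For anyone who wants to check whether they are seeing the same slowdown, here is a minimal sketch of how the latency change could be measured. It is not tied to any SDK: `timed_call` and `is_latency_regression` are hypothetical helper names, `fn` stands in for whatever function your app uses to call the model, and the 10x factor is an arbitrary assumption you would tune yourself.

```python
import time
from statistics import median


def timed_call(fn, *args, **kwargs):
    """Call fn (a stand-in for your own API call) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


def is_latency_regression(baseline_s, recent_s, factor=10.0):
    """Return True if the median recent latency exceeds factor times the median baseline.

    baseline_s and recent_s are lists of latencies in seconds for the same input.
    The factor of 10 is an assumed threshold, not a recommended value.
    """
    return median(recent_s) > factor * median(baseline_s)
```

Logging the elapsed time for a fixed test prompt every few minutes would give timestamps that can be lined up against model release dates when reporting the problem.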
Why It Matters
- Unreliable performance during model transitions
- Sudden latency spikes
- No changes from the user side
- Unexpected behavior in production
Such instability makes it difficult to depend on Gemini for critical use cases.
Community Feedback
This matches what others in the community have reported, as discussed in this Google AI forum thread, where developers also described major slowdowns.
Final Note
Correct me if I’m wrong, but I’ve seen this happen multiple times, specifically around new model releases, and each time the problem resolved on its own without any change on my side. That kind of behavior makes it hard to trust the API for stable production use.
Can anyone help, please?