The app we built for this competition (recime.ai) uses “gemini-1.5-flash” as the Gemini model.
It seems that over the last week Google quietly repointed that alias from the previous 001 model to the newer 002 model, which started causing errors in our app.
This newer 002 model seems dumber than the 001 model (maybe it's cheaper for Google to run and/or has fewer parameters, so they've made 002 the default over 001).
Has anyone else using the Google Gemini Flash API noticed something similar this past week?
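For anyone hitting the same thing: the plain "gemini-1.5-flash" name is a floating alias that Google can repoint to a newer revision, while the suffixed names stay pinned. A minimal sketch with the google-generativeai Python SDK (the API key and prompt are placeholders):

```python
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")

# "gemini-1.5-flash" floats to whichever revision Google designates as current;
# the "-001" suffix requests that exact revision, so it won't change underneath you.
model = genai.GenerativeModel("gemini-1.5-flash-001")
response = model.generate_content("Extract the ingredients from this recipe: ...")
print(response.text)
```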
I've also experienced "model overloaded" errors, which I think are due to the dynamic shared quota. It throws a 429 error, particularly in the afternoons, in the europe-west1 region. Switching to another region or retrying a while later solves the problem.
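Roughly what that workaround looks like, as a sketch with the Vertex AI Python SDK (the project ID, region list, and retry counts are illustrative):

```python
import time

import vertexai
from google.api_core.exceptions import ResourceExhausted
from vertexai.generative_models import GenerativeModel

def generate_with_fallback(prompt, project="my-project",
                           regions=("europe-west1", "us-central1"),
                           retries=3):
    """Retry on 429 (ResourceExhausted) with backoff, then move to the next region."""
    for region in regions:
        vertexai.init(project=project, location=region)
        model = GenerativeModel("gemini-1.5-flash-002")
        for attempt in range(retries):
            try:
                return model.generate_content(prompt).text
            except ResourceExhausted:
                time.sleep(2 ** attempt)  # back off before retrying
    raise RuntimeError("quota exhausted in every region tried")
```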
What I'm more concerned about is an abrupt change I noticed this week in audio sentiment analysis. I'm analyzing call-center calls, and the last time Gemini reported a call with negative sentiment was on November 15. I've changed nothing in the prompt, and the model version is pinned to gemini-1.5-flash-002. Calls that would usually be marked as negative sentiment now all come back as neutral. Has anyone else seen this?
lol “dumber”. Yes, I noticed it too, and switched to competitor models.
I wrote all my Gemini code behind a thin abstraction so I can swap models quickly. This way, I can easily switch between GPT, Claude, or Gemini (rough sketch at the end of this comment).
Yes, this requires more code maintenance but it gives me peace of mind that I can switch to different models when one isn’t working as expected.
A month or two ago, for example, Gemini was failing for users hitting the us-central region servers. It was failing for all users, paid and unpaid. That's when I decided to do this.
You can't 100% rely on ANY model to work perfectly at all times, BUT it seems like the Gemini team experiments WAY too much with their models and, as a result, makes them unreliable for production apps far too often.
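For anyone curious, the abstraction is nothing fancy; roughly this shape in Python (the SDK calls are the standard ones, but the model names and the env-var switch are just illustrative):

```python
import os

import anthropic
import google.generativeai as genai
from openai import OpenAI

def ask_gemini(prompt: str) -> str:
    genai.configure(api_key=os.environ["GOOGLE_API_KEY"])
    # Pinned revision, so Google can't swap the model underneath us.
    model = genai.GenerativeModel("gemini-1.5-flash-001")
    return model.generate_content(prompt).text

def ask_openai(prompt: str) -> str:
    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

def ask_claude(prompt: str) -> str:
    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY
    resp = client.messages.create(
        model="claude-3-5-sonnet-20241022",
        max_tokens=1024,
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.content[0].text

# One switch point: change a config value / env var and every call site follows.
PROVIDERS = {"gemini": ask_gemini, "openai": ask_openai, "claude": ask_claude}

def ask(prompt: str) -> str:
    return PROVIDERS[os.environ.get("LLM_PROVIDER", "gemini")](prompt)
```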