For those using Flash - noticed recent errors?

The app we built for this competition (recime.ai) uses “gemini-1.5-flash” as the Gemini model.

It seems that over the past week Google quietly changed this alias to resolve to the newer 002 model instead of the previous 001 model, which caused errors in our app.

This newer 002 model seems dumber than the 001 model (maybe it’s cheaper for Google to run and/or uses fewer parameters, so they’ve opted for 002 over 001 as the default).
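If you want to avoid silent alias changes like this, you can pin the full version string instead of the bare alias. A minimal sketch, assuming the google-generativeai Python SDK (the API key and prompt are placeholders, and 001 only works for as long as Google keeps that revision available):

```python
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # placeholder

# "gemini-1.5-flash" is an alias Google can repoint to newer revisions;
# pinning the full version string avoids silent upgrades like 001 -> 002.
model = genai.GenerativeModel("gemini-1.5-flash-001")

response = model.generate_content("Summarize this recipe: ...")  # placeholder prompt
print(response.text)
```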

Anyone else using Google Gemini Flash API and noticed something similar this past week?


I get model overloaded errors. Sometimes, I get unfinished responses.


I’m glad that’s not just me then.
I get model-overloaded errors.

How can Google provide an API and not allocate the necessary resources for worldwide usage? That’s weird.

I thought that changing to the PRO versions would make it better, but I may not even try that after seeing this topic.

Does anyone know if they have an API status page?

When I googled it, I found the status page for a random company named Gemini haha.

Weird that they don’t even provide a basic status page or an easy way to report these bugs.


I get model overloaded a lot too, especially last weekend. I suspect it’s because the free tier for dev work attracts a lot of usage.

I’ve also experienced model-overloaded errors. I think it’s due to the dynamic shared quota. It throws a 429 error, particularly in the afternoons, in the europe-west1 region. Changing to another region or retrying a while later solves the problem.
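For the 429s, a simple retry with exponential backoff has worked for me. A rough sketch, assuming the google-generativeai SDK, which (as far as I know) surfaces 429s as google.api_core.exceptions.ResourceExhausted:

```python
import random
import time

import google.generativeai as genai
from google.api_core import exceptions as gax

genai.configure(api_key="YOUR_API_KEY")  # placeholder
model = genai.GenerativeModel("gemini-1.5-flash-002")

def generate_with_retry(prompt: str, max_attempts: int = 5):
    """Retry generate_content on 429s with exponential backoff plus jitter."""
    for attempt in range(max_attempts):
        try:
            return model.generate_content(prompt)
        except gax.ResourceExhausted:  # HTTP 429: shared quota exhausted
            if attempt == max_attempts - 1:
                raise
            # Sleep 2s, 4s, 8s, ... plus jitter so retries don't all line up.
            time.sleep(2 ** (attempt + 1) + random.random())
```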

What I’m more concerned about is that I noticed an abrupt change this week in audio sentiment analysis. I’m analyzing calls from a call center, and the last time Gemini reported a call with negative sentiment was on November 15. I’ve changed nothing in the prompt, and the model version is pinned to gemini-1.5-flash-002. Calls that would usually be marked with negative sentiment are now all neutral. Has anyone else seen this?


lol “dumber”. Yes, I noticed it too, and switched to competitor models.

I wrote all my Gemini code in a way that allowed me to switch to different models quickly. This way, I can easily switch between GPT, Claude, or Gemini.

Yes, this requires more code maintenance but it gives me peace of mind that I can switch to different models when one isn’t working as expected.
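Roughly the shape of it; a sketch, not my actual code (the class names and default model strings are just illustrative):

```python
from typing import Protocol


class ChatModel(Protocol):
    """Minimal interface that every provider adapter implements."""
    def complete(self, prompt: str) -> str: ...


class GeminiModel:
    def __init__(self, model_name: str = "gemini-1.5-flash-001"):
        import google.generativeai as genai
        self._model = genai.GenerativeModel(model_name)

    def complete(self, prompt: str) -> str:
        return self._model.generate_content(prompt).text


class OpenAIModel:
    def __init__(self, model_name: str = "gpt-4o-mini"):
        from openai import OpenAI  # reads OPENAI_API_KEY from the environment
        self._client = OpenAI()
        self._name = model_name

    def complete(self, prompt: str) -> str:
        resp = self._client.chat.completions.create(
            model=self._name,
            messages=[{"role": "user", "content": prompt}],
        )
        return resp.choices[0].message.content


def get_model(provider: str) -> ChatModel:
    """Swapping providers is then a one-line config change."""
    return {"gemini": GeminiModel, "openai": OpenAIModel}[provider]()
```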

A month or two ago, for example, Gemini was failing for users hitting the us-central servers. It was failing for all users, paid and unpaid. That’s when I decided to do this.

You can’t 100% rely on ANY model to work perfectly at all times BUT it seems like the Gemini team experiments WAY too much with their models and subsequently makes it unreliable for apps in production far too often.
