The app we built for this competition (recime.ai) uses “gemini-1.5-flash” as the Gemini model.
It seems that over the last week Google quietly repointed that alias from the previous 001 model to the newer 002 model, which started causing errors in our app.
This newer 002 model seems dumber than the 001 model (maybe it's cheaper for Google to run and/or has fewer parameters, so they've made 002 the default over 001).
Has anyone else using the Google Gemini Flash API noticed something similar this past week?
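For anyone hitting the same thing: the plain "gemini-1.5-flash" name is a floating alias that Google can repoint to a newer revision, while the suffixed names stay pinned. A minimal sketch with the google-generativeai Python SDK (the API key and prompt are placeholders):

```python
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")

# "gemini-1.5-flash" floats to whichever revision Google designates as current;
# the "-001" suffix requests that exact revision, so it won't change underneath you.
model = genai.GenerativeModel("gemini-1.5-flash-001")
response = model.generate_content("Extract the ingredients from this recipe: ...")
print(response.text)
```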
I've also experienced "model overloaded" errors, which I think are due to the dynamic shared quota. It throws a 429 error, particularly in the afternoons, in the europe-west1 region. Switching to another region or retrying a while later solves the problem.
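Roughly what that workaround looks like, as a sketch with the Vertex AI Python SDK (the project ID, region list, and retry counts are illustrative):

```python
import time

import vertexai
from google.api_core.exceptions import ResourceExhausted
from vertexai.generative_models import GenerativeModel

def generate_with_fallback(prompt, project="my-project",
                           regions=("europe-west1", "us-central1"),
                           retries=3):
    """Retry on 429 (ResourceExhausted) with backoff, then move to the next region."""
    for region in regions:
        vertexai.init(project=project, location=region)
        model = GenerativeModel("gemini-1.5-flash-002")
        for attempt in range(retries):
            try:
                return model.generate_content(prompt).text
            except ResourceExhausted:
                time.sleep(2 ** attempt)  # back off before retrying
    raise RuntimeError("quota exhausted in every region tried")
```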
What I'm more concerned about is an abrupt change I noticed this week in audio sentiment analysis. I'm analyzing call-center calls, and the last time Gemini reported a call with negative sentiment was on November 15. I've changed nothing in the prompt, and the model version is pinned to gemini-1.5-flash-002. Calls that would usually be marked as negative sentiment now all come back as neutral. Has anyone else seen this?
lol “dumber”. Yes, I noticed it too, and switched to competitor models.
I wrote all my Gemini code behind a thin abstraction so I can swap models quickly. This way, I can easily switch between GPT, Claude, or Gemini (rough sketch at the end of this comment).
Yes, this requires more code maintenance but it gives me peace of mind that I can switch to different models when one isn’t working as expected.
A month or two ago, for example, Gemini was failing for users hitting the us-central region servers. It was failing for all users, paid and unpaid. That's when I decided to do this.
You can't 100% rely on ANY model to work perfectly at all times, BUT it seems like the Gemini team experiments WAY too much with their models and, as a result, makes them unreliable for production apps far too often.
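For anyone curious, the abstraction is nothing fancy; roughly this shape in Python (the SDK calls are the standard ones, but the model names and the env-var switch are just illustrative):

```python
import os

import anthropic
import google.generativeai as genai
from openai import OpenAI

def ask_gemini(prompt: str) -> str:
    genai.configure(api_key=os.environ["GOOGLE_API_KEY"])
    # Pinned revision, so Google can't swap the model underneath us.
    model = genai.GenerativeModel("gemini-1.5-flash-001")
    return model.generate_content(prompt).text

def ask_openai(prompt: str) -> str:
    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

def ask_claude(prompt: str) -> str:
    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY
    resp = client.messages.create(
        model="claude-3-5-sonnet-20241022",
        max_tokens=1024,
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.content[0].text

# One switch point: change a config value / env var and every call site follows.
PROVIDERS = {"gemini": ask_gemini, "openai": ask_openai, "claude": ask_claude}

def ask(prompt: str) -> str:
    return PROVIDERS[os.environ.get("LLM_PROVIDER", "gemini")](prompt)
```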