Daily "503 The model is overloaded" errors causing major service disruption

For an extended period, we have been consistently receiving the “503 The model is overloaded. Please try again later.” error. This is not an intermittent issue; it is a daily occurrence that has brought our service to a complete halt on multiple occasions.

Our application depends on the stability of the Gemini API. These daily interruptions force us to apologize to our customers repeatedly, and the lack of reliability is undermining our business and eroding customer trust.

We understand that these issues can occur, but the frequency and severity are unacceptable for a service we are building a business upon. Could you please provide an update on when we can expect the API to become stable? We need a clear timeline to make informed decisions for our service.

Hi @blomi,

Welcome to the Google AI Forum!

Can you let me know which tier you are on?

The message “503 The model is overloaded. Please try again later.” reappeared when calling the Gemini 2.5 Flash API, and my service has been switched to maintenance mode. I don’t know when it will be restored. Do I just have to wait? I’m using a Tier-1 paid API.

Right now, the “503 The model is overloaded” errors are occurring again, and my service has been switched to maintenance mode once more. The 503 errors are occurring on both 2.5 Flash and 2.0 Flash Lite.
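
For anyone hitting the same error, one common client-side mitigation while waiting for a fix is to retry 503 responses with exponential backoff and then fall back to a second model. The sketch below is only an illustration of that idea, not an official recommendation: `OverloadedError`, `call_with_backoff`, and the `call` parameter are hypothetical names, and `call` stands in for whatever function in your code sends the request and raises on a 503. It will not help when every model is overloaded at once, as reported above.

```python
import random
import time


class OverloadedError(Exception):
    """Hypothetical stand-in for an HTTP 503 'model is overloaded' response."""


def call_with_backoff(call, models, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Try each model in order, retrying 503-style failures with backoff.

    `call` is any function that takes a model name and returns a response;
    it is assumed to raise OverloadedError when the API answers 503.
    Returns a (model, response) pair for the first call that succeeds.
    """
    if not models:
        raise ValueError("at least one model name is required")

    last_error = None
    for model in models:
        for attempt in range(max_attempts):
            try:
                return model, call(model)
            except OverloadedError as err:
                last_error = err
                # Wait before retrying the same model; skip the wait after
                # the final attempt and move straight on to the next model.
                if attempt < max_attempts - 1:
                    # Delay doubles each attempt, plus up to 1s of random jitter.
                    sleep(base_delay * 2 ** attempt + random.uniform(0, 1))

    # Every model failed on every attempt: surface the last 503 to the caller.
    raise last_error
```

You would pass your own request function as `call` and an ordered list of model names as `models`, for example the primary model first and a lighter one second. Catching the final `OverloadedError` is where a service could switch itself into maintenance mode automatically.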