Error: The model is overloaded

My service is down. I have been getting this for every request since last night: [GoogleGenerativeAI Error]: Error fetching from https://generativelanguage.googleapis.com/v1beta/models/gemini-1.5-flash:generateContent: [503 Service Unavailable] The model is overloaded. Please try again later.

Same here. Most requests are getting this 503 error. I tried switching to the VertexAI (@google-cloud/vertexai) package to see if that made a difference since the @google/generative-ai package seems less actively developed. That seemed to work, but I had limited success fully testing since the API limits are unfortunately defaulted to 5 RPM. I filed a quota increase request to match the 1000 RPM you get with the paid tier using the generative-ai package, and will see if switching to that makes a difference whenever that quota request goes through.

All in all, extremely frustrating though. No indication anywhere from Google that there’s an issue, and multiple different poorly documented client SDKs that seemingly have different behaviors.

A few other things I tried with no success:

  • switching where I was calling the SDK to a different region (this worked to fix a similar temporary issue that happened 6 months ago, no luck this time though)
  • switching from “gemini-1.5-pro-latest” to “gemini-1.5-pro-002”
  • switching the API version from the default “v1beta” to “v1” (this failed for me because JSON support doesn’t seem to exist in v1, and that’s pretty critical for using this programmatically)

Switching down to gemini-1.5-pro-001 seems to be working for me for now, but given that 002 is in theory a stable model that’s a pretty poor outcome.
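
If you want to automate that kind of model switch, a rough Python sketch could look like the following. It assumes you already have some function that sends a request for a given model name; `generate`, `generate_with_fallback`, and `OverloadedError` are hypothetical names for illustration, not SDK APIs.

```python
class OverloadedError(Exception):
    """Hypothetical stand-in for whatever your SDK raises on a 503."""


def generate_with_fallback(generate, models):
    """Try each model name in order; return (model, result) for the first success."""
    last_error = None
    for model in models:
        try:
            return model, generate(model)
        except OverloadedError as err:
            last_error = err  # remember the failure and move on to the next model
    if last_error is None:
        raise ValueError("models must not be empty")
    raise last_error  # every model was overloaded
```

You would call it with your preferred model first, for example `generate_with_fallback(my_generate, ["gemini-1.5-pro-002", "gemini-1.5-pro-001"])`, where `my_generate` is your existing request code. Output quality may differ between models, so it is worth logging which one actually answered.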

How did you resolve it? :sob:

If you are facing the same error, the problem is not with your codebase. It is happening because of high demand for the model, so you have to wait for some time or change the model (an OpenAI model for $5 is worth it).
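
For anyone who wants to automate the "wait and try again" part, here is a minimal Python sketch of retrying with exponential backoff and jitter. `OverloadedError` and `call_with_backoff` are made-up names for illustration, not part of any Google SDK; in real code you would catch whatever exception your client library raises for a 503. The attempt count and delays are arbitrary starting points.

```python
import random
import time


class OverloadedError(Exception):
    """Hypothetical stand-in for whatever your SDK raises on a 503."""


def call_with_backoff(call, max_attempts=5, base_delay=1.0, max_delay=30.0,
                      sleep=time.sleep):
    """Run call(); on OverloadedError wait and retry, doubling the wait cap each time."""
    for attempt in range(max_attempts):
        try:
            return call()
        except OverloadedError:
            if attempt == max_attempts - 1:
                raise  # out of attempts, surface the error to the caller
            # Exponential backoff capped at max_delay, with full jitter so that
            # many clients do not all retry at the same instant.
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, delay))
```

You would wrap your own request, for example `call_with_backoff(lambda: my_request(prompt))`, where `my_request` is your existing code. Backoff only helps with short overload spikes; it will not paper over an outage that lasts for hours.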

In my case, I solved it by cleaning up all the files that I had uploaded to the File API.

There is a 20 GB hard limit on storage for uploaded files.
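
If you want to try the same cleanup, something like this Python sketch might do it. It assumes the `google-genai` Python SDK, where to my knowledge `client.files.list()` and `client.files.delete(name=...)` are the relevant calls; please check that against the current docs. `delete_uploaded_files` is just a name I made up. Note that it deletes every uploaded file, and that this is one poster's reported fix rather than a confirmed cause of the 503.

```python
def delete_uploaded_files(client):
    """Delete every file returned by client.files.list(); return the bytes freed."""
    freed = 0
    # Materialize the listing first so deleting cannot disturb pagination.
    for f in list(client.files.list()):
        client.files.delete(name=f.name)
        # size_bytes is assumed to be present on each file; treat missing as 0.
        freed += getattr(f, "size_bytes", None) or 0
    return freed


# Assumed usage (needs the google-genai package and an API key in the environment):
# from google import genai
# client = genai.Client()
# print(delete_uploaded_files(client), "bytes freed")
```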

Getting the same issue:

503 UNAVAILABLE. {'error': {'code': 503, 'message': 'The model is overloaded. Please try again later.', 'status': 'UNAVAILABLE'}}

Hello guys,

Just getting the same error message randomly.

Service unavailable - try again later or consider setting this node to retry automatically (in the node settings)

[GoogleGenerativeAI Error]: Error fetching from https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent: [503 Service Unavailable] The model is overloaded. Please try again later.

Any insights about this issue? It has been happening a lot lately.

This is happening about 70% of the time. It is such a shame: I paid over 250€ for advertising for my app on the Play Store, and almost half of the users just deleted the application.


This is happening about 70% of the time. It is such a shame: I want to launch my app, but just before release I noticed the error happening very often. Fix it!!! For now I have to look for other model providers.

Having the same error as well, no matter which model I choose. I’m on the free tier and using ADK.

execution: 503 UNAVAILABLE. {'error': {'code': 503, 'message': 'The model is overloaded. Please try again later.', 'status': 'UNAVAILABLE'}}

Getting the same errors here (model overload). I changed the env. variable for location from us-central1 to us-east4. That seemed to work for a bit, and then the same errors came back. I switched models from 2.5 pro to 2.5 flash. Same issues.

I’ve found the Gemini output to be the best, but even though I’m spending my API money with them, I’m getting a lot of failures due to this. Yes, I could run it again, but it’s not something I would like to push to production, and I may need to consider another model from another company for a while. Gemini is great, but this reliability is concerning.

The model is currently overloaded. Please try again later

Again again and again

It’s unbelievable and unacceptable that a company like Google is not large enough or does not have the desire to solve this chronic problem. Move it!

Hi all,

As we have so many models, can you report using the following structure:

1. Which model is being used that caused the error?

2. On which platform are you using the model?

3. Which region are you calling from?

4. Which tier are you on?

5. Paste the exact error message.

Hello, I am replying to say that for more than a week now everything has fortunately been working, at least as far as I am concerned. I don’t know about the general situation, but for me the prompt responds correctly and at the right speed. I’m very happy that the problem is gone, whether the developers fixed something or it resolved itself in some other way. I just hope that it doesn’t recur in the future. Best regards and have a good day :waving_hand:

I don’t know if this helps anyone, but I had the same issue when making parallel or sequential requests in bulk. It turned out the error happened because I was also opening a new client with each of those requests, which caused the 503 error, so it doesn’t always mean there is a problem with Google’s infra.
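
To illustrate the idea of not opening a client per request, here is a small Python sketch that creates one client lazily, shares it across a bulk run, and caps how many requests are in flight. `get_client`, `run_bulk`, `factory`, and `send` are hypothetical names; `factory` stands for however you construct your SDK client and `send` for your existing request code. The worker count of 4 is an arbitrary choice.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

_client = None
_client_lock = threading.Lock()


def get_client(factory):
    """Return one shared client, creating it with factory() on first use only."""
    global _client
    with _client_lock:
        if _client is None:
            _client = factory()
        return _client


def run_bulk(prompts, factory, send, max_workers=4):
    """Send every prompt through the same client, at most max_workers at a time."""
    client = get_client(factory)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map keeps results in the same order as the prompts.
        return list(pool.map(lambda p: send(client, p), prompts))
```

Whether sharing a client across threads is safe depends on the SDK you use, so treat that as something to verify rather than a given.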

Hey there, I’m also getting the same error. I’m planning to integrate AI into my editor for “Ask AI” and “Continue Writing” features, and I was using a free-tier Google Gemini API key. The problem occurs when I keep sending requests: I get a message from the server saying the model is overloaded and to try again. If I try again after some time, I get the output. The real problem is that I need to give a demo to my client this evening. If there are any options to resolve this, please share them.

Here we are again. I would have gladly done without it, and things seemed to be back to normal, but today, right now, I had to stop because the usual problems came back: the usual message “the model is currently overloaded” and the like. You never completely solved the problem, and clearly those who have the power to do so don’t care about doing anything about it. I have the free plan and I saw that you lowered the limit from 250 to 20, which is your legitimate choice, but if I can’t even use those 20 because every prompt takes repeated failed attempts before a single one succeeds, then 20 is nothing; they run out in less than a minute.
Fortunately, I’m in the process of closing, so I won’t need this stuff for much longer, but the way you allow it to be used is simply unacceptable. It might as well not exist. It makes me not want to have anything to do with it. I really hope that one day the difficulties will be such that those in charge will decide to reprogram everything and do a reset, perhaps solving all the underlying problems. As things stand, there is no desire to improve things, and I am very sorry about that. I am really sorry that things are like this :disappointed_face:

Also getting the same error just now, using it in Apps Script to feed a Google Doc to Gemini and have it generate content.

Hi @Ivan_Saric , @Gaetano_Sciara @Marudhu_Pandiyan @James_McCabe @Kapenge_Lenco @Ronny_Rodriguez
A 503 error typically indicates a temporary server-side issue, particularly on the free tier, which often resolves automatically after a short period. However, if you continue to experience this issue, please provide the following details to help us understand the context:

  • Billing Tier

  • Model name

  • Region

  • Platform (AI Studio/Gemini API SDK)

  • Complete Error Message

Additionally, it would be beneficial if you could briefly describe the task you are using the API for, or provide the exact prompts used.