Error: The model is overloaded

Same here. Most requests are getting this 503 error. I tried switching to the Vertex AI package (@google-cloud/vertexai) to see if that made a difference, since the @google/generative-ai package seems less actively developed. That seemed to work, but I couldn't test it fully because the default quota is unfortunately only 5 RPM. I filed a quota increase request to match the 1000 RPM you get on the paid tier with the generative-ai package, and will see whether switching to Vertex AI makes a difference once that request goes through.

All in all, extremely frustrating though. No indication anywhere from Google that there’s an issue, and multiple different poorly documented client SDKs that seemingly have different behaviors.

A few other things I tried with no success:

  • switching where I was calling the SDK to a different region (this worked to fix a similar temporary issue that happened 6 months ago, no luck this time though)
  • switching from “gemini-1.5-pro-latest” to “gemini-1.5-pro-002”
  • switching the API version from the default “v1beta” to “v1” (this failed for me because JSON support doesn’t seem to exist in v1, and that’s pretty critical for using this programmatically)

Switching down to gemini-1.5-pro-001 seems to be working for me for now, but given that 002 is in theory a stable model, that’s a pretty poor outcome.
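
In case it helps anyone else, here is a rough sketch of how the "fall back to an older model" workaround could be automated. It retries a model a couple of times on a 503 with exponential backoff, then moves on to the next model in the list. Everything here (`isOverloaded`, `backoffMs`, `generateWithFallback`, the `Generate` type) is a hypothetical helper I made up for illustration, not part of either Google SDK, and the error check assumes the thrown error exposes a `status` of 503 or mentions "503" or "overloaded" in its message, which I have not verified for every SDK version.

```typescript
// Hypothetical helpers: none of these names come from either Google SDK.
type Generate = (model: string) => Promise<string>;

// Assumption: an overloaded error carries `status === 503`, or its message
// contains "503" or "overloaded". Adjust to whatever your SDK actually throws.
function isOverloaded(err: unknown): boolean {
  const e = err as { status?: number; message?: string } | null;
  return e?.status === 503 || /\b503\b|overloaded/i.test(e?.message ?? "");
}

// Exponential backoff: 500 ms, 1 s, 2 s, ... capped at 8 s.
function backoffMs(attempt: number, baseMs = 500, capMs = 8000): number {
  return Math.min(capMs, baseMs * Math.pow(2, attempt));
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Try each model in order. A model is retried up to `retries` times on a 503
// before falling through to the next one; any other error is rethrown at once.
async function generateWithFallback(
  generate: Generate,
  models: string[],
  retries = 2,
  wait: (ms: number) => Promise<void> = sleep,
): Promise<{ model: string; text: string }> {
  let lastError: unknown = new Error("no models given");
  for (const model of models) {
    for (let attempt = 0; attempt <= retries; attempt++) {
      try {
        return { model, text: await generate(model) };
      } catch (err) {
        lastError = err;
        if (!isOverloaded(err)) throw err;
        if (attempt < retries) await wait(backoffMs(attempt));
      }
    }
  }
  throw lastError;
}
```

With @google/generative-ai, the `generate` callback would wrap something like `genAI.getGenerativeModel({ model }).generateContent(prompt)` and return `result.response.text()`, called with a model list such as `["gemini-1.5-pro-002", "gemini-1.5-pro-001"]`. This only papers over the problem, of course. It does nothing about the 503s themselves.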
