500 Internal Server Error when calling the API

They don’t care at all. These past weeks, they kept rolling out new updates and adding features instead of fixing what’s broken.

I may have resolved my own issue by enabling the Vertex API: Google Cloud console

Thanks for the tip. I’m a paying customer, but I’ve always used the AI Studio API key. Now I’ve set up gcloud and Vertex, and the problems have disappeared.

It’s really not their fault. People keep posting the same “500 error” response over and over again, a report that contains no useful information at all. Even after the mod specifically asked for the payload and parameters of the request TWICE, the replies are still just the same vague error message. I don’t know how the developers are supposed to pinpoint the issue under those conditions.

@flowring_luyiourwong It failed for me in the most basic curl API test. The issue is that the Gemini API returns a useless error message when something is wrong upstream (in my case the Vertex API issue). It is probably not that developers are withholding context, because there isn’t any more context to share.
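For anyone putting together a report, here is a minimal sketch of how the payload, status code, and raw response body could be captured in one place. It is only an illustration: the endpoint URL, the model name, and the `x-goog-api-key` header are assumptions about a typical AI Studio setup, and `build_report` and `call_and_report` are hypothetical helper names, not part of any SDK.

```python
import json
import urllib.error
import urllib.request

# Assumed endpoint and model name; replace with whatever you are actually calling.
URL = "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-pro:generateContent"


def build_report(payload, status, body, headers):
    """Collect the details needed to reproduce a failure, without the API key."""
    safe_headers = {k: v for k, v in headers.items() if k.lower() != "x-goog-api-key"}
    return {
        "url": URL,
        "request_payload": payload,
        "request_headers": safe_headers,
        "status": status,
        "response_body": body,
    }


def call_and_report(api_key, payload):
    """Send one request and return a shareable report, whether it succeeds or fails."""
    headers = {"Content-Type": "application/json", "x-goog-api-key": api_key}
    req = urllib.request.Request(
        URL, data=json.dumps(payload).encode("utf-8"), headers=headers, method="POST"
    )
    try:
        with urllib.request.urlopen(req, timeout=60) as resp:
            status, body = resp.status, resp.read().decode("utf-8", "replace")
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses land here; the body may be the only detail available.
        status, body = err.code, err.read().decode("utf-8", "replace")
    return build_report(payload, status, body, headers)
```

Calling `call_and_report(key, {"contents": [{"parts": [{"text": "Hello"}]}]})` and pasting the resulting dictionary into a reply gives the team the exact request and response, even when the response itself says very little.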

I have been using the Gemini 2.5 Pro API in Firebase Studio AI, but for the past few days I have been constantly running into blocking errors, namely:

  1. “The model is overloaded. Please try again later.”
  2. “An internal error has occurred. Please retry or report in https://developers.generativeai.google/guide/troubleshooting”

These issues have completely paralyzed my work.
I am also attaching a screenshot of the errors for your reference.

I kindly ask for your assistance in resolving this problem as soon as possible.

Thank you in advance!
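Both messages quoted above ask the caller to retry, so while the underlying problem is being fixed, a client-side workaround could look like the sketch below: retry with exponential backoff and a little jitter. This is only an illustration under assumptions: `TransientError` is a hypothetical exception that your own request code would raise for retryable statuses (presumably 500 and 503), and the delays are arbitrary defaults.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical wrapper for a retryable failure (e.g. HTTP 500 or 503)."""

    def __init__(self, status):
        super().__init__(f"transient HTTP {status}")
        self.status = status


def retry_with_backoff(call, max_attempts=5, base_delay=1.0, max_delay=30.0, sleep=time.sleep):
    """Run call(); on TransientError wait 1s, 2s, 4s, ... (plus jitter) and try again."""
    for attempt in range(max_attempts):
        try:
            return call()
        except TransientError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error to the caller
            delay = min(max_delay, base_delay * (2 ** attempt))
            # Jitter keeps many clients from retrying at the same instant.
            sleep(delay + random.uniform(0, delay / 2))
```

Backoff will not help if the failure is persistent (for example, a misconfigured project), so it is worth capping the attempts rather than looping forever.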

Hello Everyone,

Thank you for reporting. The team is aware of the 500s issue and is actively working on a fix.

Google fixed this issue already, but now it feels like they just put gemini-2.0-flash behind the 2.5-pro API call. It responds exactly like 2.0-flash. I tested it with some old prompts that 2.5 Pro took about a minute and a half to answer back then, and now it answers in about 30 seconds, exactly like 2.0-flash did back then. Could someone explain this to me?