Error: The model is overloaded

I am getting this error, and I’m not sure what it means. It’s a 503, suggesting it is a server-side error. I’m using the free tier. I’m within the rate limits of 15 calls/minute and 1 million tokens/minute. I’m using the JavaScript/TypeScript SDK.

error: [GoogleGenerativeAI Error]: Error fetching from https://generativelanguage.googleapis.com/v1beta/models/gemini-1.5-flash-002:generateContent: [503 Service Unavailable] The model is overloaded. Please try again later.

Does anyone know what this error means? Is it maybe the case that the free tier is only available when paid tier usage is low?

Welcome to the forums!

As the 5xx status code indicates, this is an error on Google’s side. It usually means that something went wrong in how they’re handling the request internally. Sometimes the failure occurs in a call to a service that runs alongside Gemini, rather than in Gemini itself.
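
Since the message itself says "Please try again later", one common workaround for transient 503s is to retry with exponential backoff and jitter. Here is a minimal sketch, not an official recipe. The names `withRetry`, `isOverloaded`, and `RetryOptions` are made up for illustration and are not part of the SDK, and the check assumes the thrown error either carries a numeric `status` of 503 or includes "[503" in its message, as the error text quoted above does.

```typescript
// Hypothetical options for the retry helper; not part of any SDK.
interface RetryOptions {
  maxAttempts: number; // total tries, including the first one
  baseDelayMs: number; // backoff ceiling before the first retry
  maxDelayMs: number;  // upper bound on any single delay
}

// Treat an error as transient if it looks like an HTTP 503.
// Assumption: the error has a numeric `status` field or a message
// containing "[503", like the error quoted in this thread.
function isOverloaded(err: unknown): boolean {
  if (typeof err !== "object" || err === null) return false;
  const e = err as { status?: unknown; message?: unknown };
  if (e.status === 503) return true;
  return typeof e.message === "string" && e.message.includes("[503");
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Run `fn`, retrying only on 503-like errors with exponential backoff
// and full jitter. Any other error is rethrown immediately.
async function withRetry<T>(
  fn: () => Promise<T>,
  opts: RetryOptions = { maxAttempts: 5, baseDelayMs: 1000, maxDelayMs: 30000 },
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (!isOverloaded(err) || attempt >= opts.maxAttempts) throw err;
      // Ceiling doubles each attempt: base, 2x base, 4x base, ... capped at maxDelayMs.
      const ceiling = Math.min(opts.maxDelayMs, opts.baseDelayMs * 2 ** (attempt - 1));
      // Full jitter: wait a random time between 0 and the ceiling.
      await sleep(Math.random() * ceiling);
    }
  }
}
```

You would wrap the existing call, something like `await withRetry(() => model.generateContent(prompt))`, where `model` and `prompt` are whatever you already have. Keep in mind that retries still count against the per-minute rate limits, so a small `maxAttempts` is probably wise on the free tier.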

I’m seeing more and more reports of something like this. Can you provide more details or a concrete example of the code you’re calling that is triggering it? Are you using any tools? Large prompts? Media?

In addition to a user text prompt, I’m using cached context with a large PDF and system instructions. I’m trying to accurately extract information from research studies.