I keep getting a 429 error after more than 3–4 requests per minute, even though I should be allowed thousands more


I’m building an application to extract data from files using Instructor. However, as soon as I upload two PDF files with 2 pages each, I immediately get a 429 error, even though I’m on a paid tier. I estimate my usage to be around 10,000 tokens per minute at most, and only 5–6 requests per minute.

I can upload at most one single-page file before hitting the 429 error. This should be well below even the free tier limits. I’ve only noticed this issue when working with file or image inputs. For comparison, I’ve previously sent over 1,000 text-converted PDFs in the same way without ever getting a 429.

How can I resolve this?


I had the same issue in the past. They are shipping new models, and this often happens around a new model release. It should resolve itself after some time.

I’ve had this problem for at least a month.

We are seeing this often as well, along with 503 errors. Today we had error rates of 50% across thousands of queries for 6+ hours, even though we’re well under our paid quota. We try to handle it with lots of retries and backoffs, but with failure rates that high it becomes challenging to use in OCR pipelines. I’m especially surprised because this was flash 2.0, which is GA.
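For anyone wiring up the retries and backoffs mentioned above, here is a minimal sketch of exponential backoff with full jitter. It is not tied to any particular SDK: `TransientAPIError` and `call_with_backoff` are hypothetical names made up for illustration, and you would need to map your client library's actual exception type and status code onto them. The delays and attempt count are guesses, not recommended values.

```python
import random
import time


class TransientAPIError(Exception):
    """Hypothetical error carrying an HTTP status code (not a real SDK class)."""

    def __init__(self, status_code):
        super().__init__(f"HTTP {status_code}")
        self.status_code = status_code


def call_with_backoff(fn, max_attempts=6, base_delay=1.0, max_delay=60.0,
                      retry_on=(429, 503), sleep=time.sleep):
    """Call fn(), retrying on the given status codes with jittered backoff."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except TransientAPIError as err:
            is_last_attempt = attempt == max_attempts - 1
            # Give up on non-retryable errors or when attempts are exhausted.
            if err.status_code not in retry_on or is_last_attempt:
                raise
            # Full jitter: sleep a random time up to the capped exponential delay.
            cap = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, cap))
```

Usage would look like `call_with_backoff(lambda: send_request(pdf))`, where `send_request` is your own function that raises `TransientAPIError` on a 429 or 503. With sustained failure rates around 50% this only smooths things over; it does not fix a quota that is being enforced incorrectly.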