I was using the Gemini API and, as usual during rush hours, it returned a lot of 503 (unavailable) responses. Then it suddenly started giving me quota-limit-reached errors, even though I had only received around seven successful responses.
This could be a temporary issue occurring during peak usage periods, especially with long-context requests. We recommend temporarily directing traffic to an alternative model to see if the capacity limitation is specific to this model.
For more detailed information, please refer to this document.
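As a rough illustration of the suggestion to route traffic to an alternative model, here is a minimal Python sketch that retries with exponential backoff and then falls back to the next model in a list. It does not use the real Gemini SDK: `send`, `TransientError`, and the model names are hypothetical placeholders, so you would need to wrap your own client call and map its 503 and quota errors onto `TransientError`.

```python
import time
from typing import Callable, Optional, Sequence, Tuple


class TransientError(Exception):
    """Hypothetical stand-in for a 503 or quota-style failure from your client."""


def call_with_fallback(
    send: Callable[[str, str], str],
    models: Sequence[str],
    prompt: str,
    retries_per_model: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[str, str]:
    """Try each model in order; return (model_used, response_text).

    `send(model, prompt)` is assumed to wrap your actual API call and to
    raise TransientError on retryable failures.
    """
    last_error: Optional[Exception] = None
    for model in models:
        for attempt in range(retries_per_model):
            try:
                return model, send(model, prompt)
            except TransientError as exc:
                last_error = exc
                # Back off (1s, 2s, 4s, ...) before retrying the same model;
                # skip the wait after the final attempt and move on instead.
                if attempt < retries_per_model - 1:
                    sleep(base_delay * 2 ** attempt)
    raise RuntimeError("all models failed") from last_error
```

You would call it as `call_with_fallback(my_send, ["primary-model", "fallback-model"], "Hello")`, where both model names are placeholders. If the fallback succeeds while the primary keeps failing, that suggests the capacity limitation is specific to the primary model.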