Penalty for reaching quota in pay-as-you-go with a fine-tuned model?

I have a fine-tuned model based on gemini-1.5-flash that I want to call through the API. Thinking that the RPM limit was 2000, I set my code to make about 1500 requests per minute. Long story short, I got a lot of RESOURCE_EXHAUSTED errors and soon only RESOURCE_EXHAUSTED errors, even with a much lower RPM and very patient retry logic. I then gave it a 12-hour rest and made a single request: RESOURCE_EXHAUSTED.

I have since learned that the RPM limit is very likely 360 for this model. Fine, I can work with that. But when can I resume using the model at the lower rate? Am I being penalised for some amount of time for reaching the quota, or is there some other limitation in place? For example, maybe I have also hit some unknown daily request cap?

Can anyone shed some light on this?

There shouldn’t be a request-per-day limit on the paid tier.
But if there is, it resets at midnight US West Coast time (i.e., the time in Mountain View, CA).
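
If a daily cap does turn out to apply, a small helper like the one below could tell you how long to pause before trying again. This is only a sketch: it assumes the reset really is at midnight in the America/Los_Angeles time zone, and `seconds_until_pacific_midnight` is a name made up for this example, not part of any Google library.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

# Assumption: the quota resets at midnight Pacific time.
PACIFIC = ZoneInfo("America/Los_Angeles")


def seconds_until_pacific_midnight(now=None):
    """Return the seconds from `now` until the next midnight Pacific time.

    `now` must be a timezone-aware datetime; it defaults to the current time.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    local_now = now.astimezone(PACIFIC)
    next_day = local_now.date() + timedelta(days=1)
    next_midnight = datetime(
        next_day.year, next_day.month, next_day.day, tzinfo=PACIFIC
    )
    # Subtract in UTC so that daylight-saving changes are counted correctly.
    delta = next_midnight.astimezone(timezone.utc) - now.astimezone(timezone.utc)
    return delta.total_seconds()
```

For example, you could call `time.sleep(seconds_until_pacific_midnight())` once the errors stop clearing up on their own, then resume at the lower rate.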

Update: I gave this another go 12 hours later and stayed well within the quota, at about 300 RPM. This worked for a bit, but eventually all I got was RESOURCE_EXHAUSTED. My only conclusion is that there is some limitation other than RPM. I just want to know what it is so I can make plans!
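
For reference, the throttling and retry approach described here looks roughly like the sketch below. It is not my exact code: `send_request` is a stand-in for whatever function calls the model, and `QuotaExceededError` is a placeholder for whatever exception your client raises on RESOURCE_EXHAUSTED. The 300 RPM figure and the backoff delays are just the values I would try, not documented limits.

```python
import random
import time


class QuotaExceededError(Exception):
    """Placeholder for the error your client raises on RESOURCE_EXHAUSTED."""


class Throttle:
    """Spaces calls at least 60 / max_rpm seconds apart."""

    def __init__(self, max_rpm, clock=time.monotonic, sleep=time.sleep):
        self.interval = 60.0 / max_rpm
        self.clock = clock
        self.sleep = sleep
        self.next_slot = clock()

    def wait(self):
        now = self.clock()
        if now < self.next_slot:
            # Too early: sleep until the next free slot.
            self.sleep(self.next_slot - now)
        # Reserve the following slot one interval after this call starts.
        self.next_slot = max(now, self.next_slot) + self.interval


def call_with_backoff(send_request, throttle, max_attempts=6,
                      base_delay=1.0, max_delay=60.0, sleep=time.sleep):
    """Call send_request(), retrying quota errors with jittered backoff."""
    for attempt in range(max_attempts):
        throttle.wait()
        try:
            return send_request()
        except QuotaExceededError:
            if attempt == max_attempts - 1:
                raise  # Out of attempts: let the caller see the error.
            # Exponential backoff with full jitter, capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, delay))
```

Usage would be something like `throttle = Throttle(max_rpm=300)` followed by `call_with_backoff(my_request_fn, throttle)` for each request. Note that this only limits requests per minute from a single process, so it would not help if the limit being hit is something other than RPM.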