2.0-flash requests counted as 2.5-flash in quotas

Since this morning, our gemini-2.0-flash requests have been failing with a RESOURCE_EXHAUSTED response, which appears to come from a PerDayPerProjectPerModel quota limit violation.

After reviewing the “Usage and Billing” tab, it appears all requests were logged as gemini-2.5-flash requests, and the dashboard says we reached its limit, even though the 429 response clearly states we were using 2.0.

The weird part is that even with gemini-2.5-flash supposedly shown as capped for today, we’re still able to get successful responses when testing it, and its RPD (requests per day) counter keeps incrementing past the limit.

Response:

{
  "quotaMetric": "generativelanguage.googleapis.com/generate_content_free_tier_requests",
  "quotaId": "GenerateRequestsPerDayPerProjectPerModel-FreeTier",
  "quotaDimensions": {
    "location": "global",
    "model": "gemini-2.0-flash"
  }
}
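
For anyone trying to check which model a 429 was actually charged against, here is a minimal Python sketch that pulls the quotaId and model out of an error body. The violation object is the one quoted above. The surrounding envelope (`error`, `details`, `violations`) in `SAMPLE_BODY` is only an assumption for illustration, which is why the helper walks the whole JSON tree instead of relying on a fixed path. The function names are hypothetical and not part of any SDK.

```python
import json
from typing import Any, Iterator, List, Optional, Tuple


def iter_quota_violations(node: Any) -> Iterator[dict]:
    """Yield every JSON object that has a "quotaId" key, at any depth."""
    if isinstance(node, dict):
        if "quotaId" in node:
            yield node
        for value in node.values():
            yield from iter_quota_violations(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_quota_violations(item)


def models_in_error(body: str) -> List[Tuple[str, Optional[str]]]:
    """Return (quotaId, model) pairs found in a JSON error body."""
    payload = json.loads(body)
    pairs = []
    for violation in iter_quota_violations(payload):
        dimensions = violation.get("quotaDimensions") or {}
        pairs.append((violation["quotaId"], dimensions.get("model")))
    return pairs


# Assumed envelope around the violation object from the post above.
SAMPLE_BODY = json.dumps({
    "error": {
        "code": 429,
        "status": "RESOURCE_EXHAUSTED",
        "details": [{
            "violations": [{
                "quotaMetric": "generativelanguage.googleapis.com/generate_content_free_tier_requests",
                "quotaId": "GenerateRequestsPerDayPerProjectPerModel-FreeTier",
                "quotaDimensions": {"location": "global", "model": "gemini-2.0-flash"},
            }]
        }],
    }
})

if __name__ == "__main__":
    for quota_id, model in models_in_error(SAMPLE_BODY):
        print(f"{quota_id}: {model}")
```

Logging these pairs next to the model name sent in each request makes it easy to compare what the API reports against what the dashboard shows.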

Hi @Guillaume_Volumiq,
Welcome to the Google AI Forum!

Thank you for bringing this to our attention.
Apologies for the delayed response. Could you please confirm if you are still facing the same problem?