Gemini-2.0-Flash returns "429" even though peak usage is below 1%

Hi team,
Over the past week, we have repeatedly seen this issue when calling the method google.ai.generativelanguage.v1beta.GenerativeService.StreamGenerateContent. Could you please check why we are getting 429 errors for the model Gemini-2.0-flash-lite?

In prod we are seeing the error 429 RESOURCE_EXHAUSTED with this message: {'error': {'code': 429, 'message': 'Resource has been exhausted (e.g. check quota)', 'status': 'RESOURCE_EXHAUSTED'}}
It occurs when running this code:

    response = genai_client.models.generate_content_stream(
        model=model,
        contents=history,
        config=config,
    )
    return self._handle_generate_stream_response(model, credentials, response, prompt_messages)

This is affecting our prod environment, yet the “API/Service Details” page shows peak usage below 1%, so the 429 makes no sense to us.

Hi @Xiaoyu_Shawn_Que,

We recently updated the error message for 429 errors so that it specifies which rate limit is being exceeded. Could you please retry and verify on your end?

If you still face any issues, please feel free to post the entire error message so we can help you better.
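
For capturing the full 429 message while retrying, here is a minimal sketch, not an official recommendation. The helper names `is_rate_limited` and `call_with_backoff` are hypothetical, and the check assumes the raised exception either exposes a `code` attribute equal to 429 or contains RESOURCE_EXHAUSTED in its text. It retries with exponential backoff plus jitter and prints the whole error on each failed attempt so it can be posted here.

```python
import random
import time


def is_rate_limited(exc):
    """Heuristic (assumption): the error has code 429 or mentions RESOURCE_EXHAUSTED."""
    return getattr(exc, "code", None) == 429 or "RESOURCE_EXHAUSTED" in str(exc)


def call_with_backoff(fn, max_attempts=5, base_delay=1.0, max_delay=30.0, sleep=time.sleep):
    """Call fn(); on a 429-like error, print it in full and retry with backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception as exc:
            # Re-raise anything that is not a rate-limit error, or the final failure.
            if not is_rate_limited(exc) or attempt == max_attempts:
                raise
            # Print the complete message; it may name the limit that was exceeded.
            print(f"429 on attempt {attempt}/{max_attempts}: {exc!r}")
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(delay + random.uniform(0, delay / 2))  # add jitter


# Hypothetical usage with the call from the original post. The stream is
# consumed inside the wrapped function because the error may only surface
# while iterating over the response:
#
# chunks = call_with_backoff(
#     lambda: list(genai_client.models.generate_content_stream(
#         model=model, contents=history, config=config))
# )
```

Note that buffering the chunks with `list(...)` gives up incremental streaming; if chunks must be forwarded as they arrive, a retry is only safe before the first chunk has been passed on.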

Thank you!