With the OpenAI API, for example, the rate limit info is returned in the response headers:
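    def parse_rate_limits(headers):  # hypothetical helper name
        # headers: the HTTP response headers from an OpenAI API call
        return {
            "rpm": int(headers["x-ratelimit-limit-requests"]),
            "tpm": int(headers["x-ratelimit-limit-tokens"]),
        }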
Is there an equivalent, ideally through the google.generativeai Python SDK?
Context: My account lists a much higher RPM than I can actually reach before hitting the resource exhausted exception.
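
To make the gap concrete, here is a minimal sketch of how the effective limit could be measured empirically: count how many calls succeed inside one window before the quota error is raised. It does not call any real API. `count_requests_until_limit`, `send_request`, `FakeQuotaError`, and `make_fake_api` are hypothetical names used only for illustration; with the Google SDK one would presumably pass a function that makes one real request, plus the exception class the SDK raises when the quota is exhausted (I assume `google.api_core.exceptions.ResourceExhausted`).

```python
import time
from typing import Callable, Type


def count_requests_until_limit(
    send_request: Callable[[], object],
    limit_error: Type[Exception],
    window_seconds: float = 60.0,
    max_requests: int = 10_000,
) -> int:
    """Count calls that succeed within one window before limit_error is raised."""
    start = time.monotonic()
    succeeded = 0
    while succeeded < max_requests and time.monotonic() - start < window_seconds:
        try:
            send_request()
        except limit_error:
            # The quota was hit: stop and report what got through.
            break
        succeeded += 1
    return succeeded


# Stand-ins for a real client, so the sketch runs offline (hypothetical).
class FakeQuotaError(Exception):
    pass


def make_fake_api(limit: int) -> Callable[[], None]:
    """Return a function that raises FakeQuotaError after `limit` calls."""
    calls = {"n": 0}

    def send() -> None:
        calls["n"] += 1
        if calls["n"] > limit:
            raise FakeQuotaError("quota exceeded")

    return send


effective_rpm = count_requests_until_limit(make_fake_api(5), FakeQuotaError)
```

Comparing the returned count with the RPM shown on the account page would give a number to quote when reporting the discrepancy.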