Handling concurrent requests

I’m working on a web app for audio transcription using the Gemini API. It uploads audio files for analysis and gets text files in return. But I’m running into issues with concurrent uploads. When I use multiple API keys, I get a 403 error (permission denied). And if I send everything through one key, I get a 503 (service unavailable). How do you usually handle concurrent API requests in your app? Any tips?

Welcome to the forum.
The API is stateless, so concurrent requests cannot conflict with one another; each request is atomic. Client applications therefore don’t need to take special precautions.

The overall traffic from a project is, of course, subject to that project’s rate limits. To deal with them, the recommended approach is to check for HTTP 429 responses and retry with exponential backoff, which is the usual congestion-control approach.
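For what it's worth, here is a minimal sketch of retry with exponential backoff in Python. It does not use any particular client library: `HttpError` and `call_with_backoff` are hypothetical names made up for this example, and `request` stands in for whatever function performs your upload. The delay values are assumptions to tune against your own quota, and the random jitter is my addition so that parallel workers don't all retry at the same instant.

```python
import random
import time


class HttpError(Exception):
    """Hypothetical error type carrying an HTTP status code."""

    def __init__(self, status):
        super().__init__(f"HTTP {status}")
        self.status = status


def call_with_backoff(request, max_attempts=5, base_delay=1.0,
                      max_delay=32.0, retry_statuses=(429,),
                      sleep=time.sleep):
    """Call request() and retry retryable statuses with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return request()
        except HttpError as err:
            out_of_attempts = attempt == max_attempts - 1
            # Errors such as 403 are not fixed by waiting, so re-raise them.
            if err.status not in retry_statuses or out_of_attempts:
                raise
            # The upper bound doubles on each attempt and is capped at max_delay.
            upper = min(max_delay, base_delay * 2 ** attempt)
            # Sleep a random time in [0, upper] (assumed "full jitter" variant).
            sleep(random.uniform(0, upper))
```

You would wrap each upload in it, for example `call_with_backoff(lambda: upload(path))`, where `upload` is your own function that raises `HttpError` on a failed response. Only 429 is retried by default; whether a 503 should also be treated as transient is a judgment call, and passing `retry_statuses=(429, 503)` would do that.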

Hope that helps.