I am currently using the Vertex AI API and have synchronous scripts set up and working well. However, I would like to switch them to the asynchronous endpoints, and doing so gives me repeated 429 errors even on my very first call, which makes me think the models are down or the endpoints are different. My example code is the following:
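from google import genai
from google.genai.types import HttpOptions

# Vertex AI client; the async API is exposed under client.aio
client = genai.Client(
    vertexai=True,
    http_options=HttpOptions(api_version="v1"),
    location="us-east5",
    project="proj-name",
)

# Called from inside an async function
response = await client.aio.models.generate_content(
    model="gemini-2.0-flash",
    ...
)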
The exact same call made with client.models.generate_content() on the same API account runs smoothly, so I’m confused about why I’m getting a 429 RESOURCE_EXHAUSTED error even on my first run, especially since I’m also using a semaphore to limit the concurrent async calls.
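For context, the semaphore-limited pattern mentioned above, with a retry on 429 added, might look roughly like the sketch below. It is only an illustration under stated assumptions, not the exact code in use: `ResourceExhausted`, `call_with_backoff`, and `make_call` are hypothetical names, and `ResourceExhausted` stands in for whatever exception the SDK actually raises for a 429 response.

```python
import asyncio
import random


class ResourceExhausted(Exception):
    """Hypothetical stand-in for the SDK error raised on a 429 response."""


async def call_with_backoff(make_call, semaphore, max_attempts=5, base_delay=1.0):
    """Await make_call() under the semaphore, retrying on ResourceExhausted.

    make_call is a zero-argument callable returning a fresh coroutine, e.g.
    a lambda wrapping client.aio.models.generate_content(...).
    """
    for attempt in range(max_attempts):
        # Hold a semaphore slot only while the request is in flight.
        async with semaphore:
            try:
                return await make_call()
            except ResourceExhausted:
                if attempt == max_attempts - 1:
                    raise  # out of attempts, surface the error
        # Back off outside the semaphore so other tasks can use the slot.
        delay = base_delay * 2 ** attempt + random.uniform(0, base_delay)
        await asyncio.sleep(delay)
```

Usage would be along the lines of `await call_with_backoff(lambda: client.aio.models.generate_content(...), semaphore)`, with the semaphore created inside the running event loop. A retry loop like this only helps with transient quota errors; it would not explain a 429 that appears on the very first request.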