Gemini 2.0 Async Endpoint Returns 429, but Sync Doesn't

I am currently using the Vertex AI API and have synchronous scripts set up and working well. However, when I switch them to the asynchronous endpoints, I get repeated 429 errors even on my very first call, which makes me think the models are down or the endpoints are different. My example code is the following.

from google import genai
from google.genai.types import HttpOptions

client = genai.Client(
    vertexai=True,
    http_options=HttpOptions(api_version="v1"),
    location="us-east5",
    project="proj-name",
)

# Called from inside an async function
response = await client.aio.models.generate_content(
    model="gemini-2.0-flash",
    ...
)

The exact same code with client.models.generate_content() and the same API account runs smoothly, so I’m confused about why I’m getting a 429 RESOURCE_EXHAUSTED error even on my first run, especially since I’m also using a semaphore to limit the async calls.

Hi @Tyler_Zhu

Welcome to the forum.

Apologies, I’m confused by your source code.
Why do you specify the http_options parameter? Using Vertex AI implicitly uses the v1 version of the API. However, the use of client.aio suggests that you’re experimenting with the latest v1alpha version. Furthermore, AFAIK, streaming is based on the method generate_content_stream.

The HTTP 429 response indicates an issue with the quota. Try using a different model first, then if the issue persists, request an increase of quota on GCP.
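For transient rate limiting (as opposed to a quota that is simply too low for the model or region), retrying with exponential backoff is a common mitigation. The following is only a minimal sketch: `call_with_backoff` and `looks_rate_limited` are hypothetical helper names, and the check on an exception's `.code` attribute is an assumption about how the SDK surfaces the HTTP status, so adjust it to whatever exception you actually see.

```python
import asyncio
import random


def looks_rate_limited(exc):
    # Assumption: the raised exception carries the HTTP status as `.code`.
    return getattr(exc, "code", None) == 429


async def call_with_backoff(make_call, max_attempts=5, base_delay=1.0,
                            is_retryable=looks_rate_limited,
                            sleep=asyncio.sleep):
    """Await make_call(), retrying with exponential backoff plus jitter.

    make_call is a zero-argument function returning a fresh awaitable,
    so every attempt issues a new request.
    """
    for attempt in range(max_attempts):
        try:
            return await make_call()
        except Exception as exc:
            # Give up on non-retryable errors or after the last attempt.
            if not is_retryable(exc) or attempt == max_attempts - 1:
                raise
            # Delay doubles each attempt; jitter spreads out concurrent retries.
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            await sleep(delay)
```

A call could then be wrapped as `await call_with_backoff(lambda: client.aio.models.generate_content(...))`. If the very first request already fails and retries never succeed, backoff will not help and the quota itself is the thing to check.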

Cheers.

Hi @jkirstaetter,

Thank you for the response, and sorry for the late reply. I was just supplying http_options based on tutorials. I’m not actually trying to do streaming; I’m simply trying to issue many requests at the same time, similar to how GPT lets us do async (which is distinct from streaming).

I found that importing vertexai actually does what I want, i.e.

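from vertexai.generative_models import GenerativeModel

async def generate(prompt):  # illustrative wrapper name
    model = GenerativeModel("gemini-2.0-flash")
    response = await model.generate_content_async(prompt)
    return response.text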

You suggest elsewhere, however, that this is the old way of doing it. What is the genai equivalent?
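For reference, here is a minimal sketch of what issuing many requests concurrently might look like with the genai package, using the same client.aio.models.generate_content call as in the first post. The helper name `generate_many`, the `max_concurrency` parameter, and the example prompts are hypothetical; the project and location are the placeholders from the post above. It is one possible shape under those assumptions, not a confirmed fix for the 429.

```python
import asyncio


async def generate_many(client, prompts, model="gemini-2.0-flash",
                        max_concurrency=4):
    """Send one request per prompt, with at most max_concurrency in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def one(prompt):
        # The semaphore caps how many requests run at the same time.
        async with semaphore:
            response = await client.aio.models.generate_content(
                model=model, contents=prompt
            )
            return response.text

    # gather returns results in the same order as the prompts.
    return await asyncio.gather(*(one(p) for p in prompts))


async def main():
    # Imported here so the helper above can be used without the SDK installed.
    from google import genai

    client = genai.Client(vertexai=True, project="proj-name",
                          location="us-east5")
    texts = await generate_many(client, ["Say hello.", "Name a prime number."])
    for text in texts:
        print(text)


# Requires the google-genai package and Vertex AI credentials:
# asyncio.run(main())
```

Creating the semaphore inside `generate_many` ties it to the running event loop and keeps the limit local to one batch; a lower `max_concurrency` is the first thing to try if requests are being throttled.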

Hi @Tyler_Zhu

Maybe start by looking at and running (on Colab) the following Jupyter Notebook to get started with Gemini:

It covers all aspects of the new genai package and uses Vertex AI as the API endpoints.

Cheers
