Nano Banana Pro Vertex AI API unreliable

I developed an app on the Nano Banana Pro API served via Vertex AI, but the service is so unreliable that my app is unusable.

Yesterday images were being generated at lightning-fast speeds, and today it’s a constant RESOURCE_EXHAUSTED error. I understand the feature is in preview, but can this level of instability be justified?

INFO:backoff:Backing off _attempt_image_generation(…) for 0.5s (google.genai.errors.ClientError: 429 RESOURCE_EXHAUSTED. {'error': {'code': 429, 'message': 'Resource exhausted. Please try again later. Please refer to Error code 429  |  Generative AI on Vertex AI  |  Google Cloud Documentation for more details.', 'status': 'RESOURCE_EXHAUSTED'}})

Can someone from Vertex AI please add some TPUs to improve the inference service? Slow is fine, but throwing 429 errors makes it impossible to serve my users.
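Since slow responses are acceptable here, one client-side mitigation is to retry 429s with longer, jittered waits than the 0.5s shown in the log. The following is a minimal sketch, not the poster's code: `is_rate_limited` and `call_with_backoff` are hypothetical helpers, and the check assumes the raised exception exposes the HTTP status as a `.code` attribute or includes `RESOURCE_EXHAUSTED` in its message, as the log above suggests.

```python
import random
import time


def is_rate_limited(exc: Exception) -> bool:
    """Return True for errors that look like HTTP 429 / RESOURCE_EXHAUSTED."""
    # Assumption: the exception carries the HTTP status in `.code`;
    # otherwise fall back to the text seen in the log.
    return getattr(exc, "code", None) == 429 or "RESOURCE_EXHAUSTED" in str(exc)


def call_with_backoff(fn, *, max_attempts=6, base_delay=1.0, max_delay=60.0,
                      sleep=time.sleep, rng=random.random):
    """Call fn(); on a rate-limit error, wait and retry with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception as exc:
            # Re-raise anything that is not a 429, and the last failed attempt.
            if not is_rate_limited(exc) or attempt == max_attempts - 1:
                raise
            # The cap doubles per attempt: base_delay, 2x, 4x, ... up to max_delay.
            cap = min(max_delay, base_delay * (2 ** attempt))
            # Full jitter: sleep a random fraction of the cap so that
            # concurrent clients do not retry in lockstep.
            sleep(cap * rng())
```

Usage would be something like `call_with_backoff(lambda: generate_image(prompt))`, where `generate_image` stands in for whatever function wraps the actual API call. Backoff only smooths over short bursts of 429s; it does not help if capacity is unavailable for long periods.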

Hi @DataSiens_AI,

To help us understand your issue better, could you please share the full 429 response, along with a screenshot and your tier details?

This is the API response; there is no screenshot.

  File "/code/.venv/lib/python3.11/site-packages/google/genai/errors.py", line 131, in raise_error
    raise ClientError(status_code, response_json, response)
google.genai.errors.ClientError: 429 RESOURCE_EXHAUSTED. {'error': {'code': 429, 'message': 'Resource exhausted. Please try again later. Please refer to Error code 429  |  Generative AI on Vertex AI  |  Google Cloud Documentation for more details.', 'status': 'RESOURCE_EXHAUSTED'}}

I made a simple switch to the Gemini API and these issues no longer appear:

os.environ["GOOGLE_GENAI_USE_VERTEXAI"] = "False"

This makes it clear that the issue is with the Vertex AI API, and probably with the resources allocated to these models on that path.
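For anyone wanting to A/B the two paths the same way, the toggle can be wrapped in a small helper. This is only a sketch: `backend_env` and `apply_backend` are hypothetical names, and the variable names other than `GOOGLE_GENAI_USE_VERTEXAI` (`GOOGLE_CLOUD_PROJECT`, `GOOGLE_CLOUD_LOCATION`, `GOOGLE_API_KEY`) reflect my understanding of what the google-genai SDK reads, so check them against the SDK documentation for your version.

```python
import os


def backend_env(use_vertex: bool, *, project: str = "", location: str = "",
                api_key: str = "") -> dict:
    """Build the environment variables for one of the two backends."""
    if use_vertex:
        if not project or not location:
            raise ValueError("Vertex AI needs both a project and a location")
        return {
            "GOOGLE_GENAI_USE_VERTEXAI": "True",
            "GOOGLE_CLOUD_PROJECT": project,    # assumed variable name
            "GOOGLE_CLOUD_LOCATION": location,  # assumed variable name
        }
    if not api_key:
        raise ValueError("the Gemini API needs an API key")
    return {
        "GOOGLE_GENAI_USE_VERTEXAI": "False",
        "GOOGLE_API_KEY": api_key,              # assumed variable name
    }


def apply_backend(env_vars: dict, environ=os.environ) -> None:
    """Write the settings into the environment (default: this process)."""
    environ.update(env_vars)
```

The settings need to be applied before the client object is created, since the environment is presumably read at construction time. Running the same sparse workload against both backends, as described above, is a quick way to show whether the 429s are specific to the Vertex AI path.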

My quotas for these services are also within limits. The usage was extremely sparse and such errors should not have been encountered.

Hi @DataSiens_AI, we apologize for the inconvenience. Could you please provide the project number (not the project ID) via direct message if you have not yet done so?

Thanks Sonali. I do not see any DM feature on this forum. I visited your profile and there is nothing along those lines. Please guide me.

@DataSiens_AI, click on my profile picture or username and a small pop-up will appear. Then click the Message button and share your project number there. I’ve attached a screenshot for your reference.

Hi @DataSiens_AI, we truly appreciate you flagging this and apologize for the trouble. We have escalated the issue to our internal team for further investigation.

@DataSiens_AI, a fix has been pushed that should resolve the problem. Please let us know if you are still experiencing any issues.

This is so unreliable: with just 5-6 concurrent requests it hits 429, whereas on the Gemini API I do not encounter the same problem.
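If 5-6 requests in flight are enough to trigger 429s, capping concurrency on the client is a cheap thing to try while the capacity problem persists. Here is a minimal sketch using only the standard library; `generate_all` is a hypothetical helper, `generate_one` stands in for whatever function makes the actual API call, and the limit of 2 is an arbitrary starting point rather than a documented threshold.

```python
from concurrent.futures import ThreadPoolExecutor


def generate_all(prompts, generate_one, max_in_flight=2):
    """Run generate_one(prompt) for every prompt, at most max_in_flight at a time."""
    # The pool size is the concurrency cap: extra prompts wait in the
    # executor's queue until a worker thread is free.
    with ThreadPoolExecutor(max_workers=max_in_flight) as pool:
        # map() returns results in input order; an exception raised by any
        # call is re-raised here when its result is consumed.
        return list(pool.map(generate_one, prompts))
```

This trades throughput for fewer rejected requests. It pairs naturally with retry-on-429 inside `generate_one`, so that a single rejected call does not fail the whole batch.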

I have had the same issue for a couple of days. I can NOT generate more than 4 to 6 images per day!!! Per day! I can NOT even generate 6 images; I always get the red “exhausted” message. I can NOT work with this at all. In my Google AI Cloud I am NOT even able to load an image to edit: it gets an error with a red box all over it. I am already on a paid account and I am NOT able to build my project at all. I can NOT generate images.

Hi @Sonali_Kumari1

I am also facing the same issues. With Gemini 2.5 Flash Image it somewhat works for me, but with Nano Banana 2 and Nano Banana Pro I can hardly produce one image without hitting a 429. It’s a shame, as these models are part of critical business processes and this makes the app almost unusable. Switching to the developer API is also not an option for me, as we work with enterprises that have sensitive data.

Not sure if it’s related, but we have also spent enough to supposedly be in Tier 2 of the Vertex API, and that is not reflected either.

Hello,
I am facing the same issue. For an API that we planned to run in production, this is a big problem for us. Did you find a solution? Some articles mention that a quota should be increased, but I have no idea which one.

@Ilias_Loudrassi

There isn’t one, really. They push you to buy provisioned throughput, which is too expensive for us initially given our current demand. It’s pretty crazy to me that such a basic part of their cloud service does not work.

I’m hoping that they start allocating more compute resources to their cloud solution as their developer API works fine.