Nano Banana Pro API Response Delays and 503 Errors in Production

I built a mobile app that uses Nano Banana Pro through the Gemini API (with an API key) to test prompt generation and image responses.

Initial Setup (Testing Phase)
I integrated the Gemini API directly into the frontend, and everything worked perfectly during testing. Response time was under 1 minute per image.

Production Issues

When I moved the app to production, I started facing serious issues. After a few production requests, response time increased to 7–8 minutes per image. This slowdown happened even though the same setup worked fine during testing.

I initially deployed the backend using Firebase App Hosting. After seeing the delays, I switched to another backend provider, but the issue persisted: after a few requests, response times again increased to 7–8 minutes.

It seems like Google might be detecting the request source and throttling requests dynamically, or enforcing some hidden production limits.
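One way to check the throttling suspicion is to record how long each request takes, in order, and see at which request the slowdown begins. The following is only a minimal sketch: `LatencyLog`, `timed`, and `first_slow_request` are hypothetical names, and `fn` stands in for whatever function actually sends the image request.

```python
import time
from typing import Callable, List, Optional, TypeVar

T = TypeVar("T")


class LatencyLog:
    """Records the duration of each call, in seconds, in call order."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        # The clock is injectable so the log can be tested without waiting.
        self._clock = clock
        self.durations: List[float] = []

    def timed(self, fn: Callable[[], T]) -> T:
        """Run fn, record how long it took (even if it raises), return its result."""
        start = self._clock()
        try:
            return fn()
        finally:
            self.durations.append(self._clock() - start)

    def first_slow_request(self, threshold_s: float) -> Optional[int]:
        """Return the 0-based index of the first call slower than threshold_s."""
        for index, duration in enumerate(self.durations):
            if duration > threshold_s:
                return index
        return None
```

If the index returned by `first_slow_request` is roughly the same on every run and on every backend provider, that would point toward a per-key or per-project limit rather than the hosting setup.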

Errors Encountered (Gemini API)

After continued usage, I started receiving the following error:

{
  "error": {
    "code": 503,
    "message": "The model is overloaded. Please try again later.",
    "status": "UNAVAILABLE"
  }
}
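Since the error message itself says to try again later, a common mitigation is to retry with exponential backoff and random jitter instead of resending immediately. This is a rough sketch, not an official client feature: `ServiceUnavailable` is a hypothetical exception that the calling code would raise when it sees a 503, `request_fn` is a placeholder for the real API call, and the attempt count and delays are arbitrary assumptions.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class ServiceUnavailable(Exception):
    """Hypothetical exception raised by request_fn when the API returns 503."""


def call_with_backoff(
    request_fn: Callable[[], T],
    max_attempts: int = 5,        # assumed value, tune for your app
    base_delay_s: float = 1.0,    # assumed value
    max_delay_s: float = 60.0,    # assumed value
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call request_fn, retrying on ServiceUnavailable with jittered backoff."""
    for attempt in range(max_attempts):
        try:
            return request_fn()
        except ServiceUnavailable:
            if attempt == max_attempts - 1:
                raise  # out of attempts, surface the error to the caller
            # The cap doubles each attempt: 1s, 2s, 4s, ... up to max_delay_s.
            cap = min(max_delay_s, base_delay_s * (2 ** attempt))
            # "Full jitter": wait a random time between 0 and the cap.
            sleep(random.uniform(0, cap))
    raise ValueError("max_attempts must be at least 1")
```

Backoff only helps with short overload spikes. It would not explain or fix requests that succeed but take 7–8 minutes.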


Switch to Vertex AI

Due to the instability, I migrated the app to Vertex AI, and similar problems occurred there as well. Response times again increased to 7–8 minutes, and after a few requests I started receiving 503 errors.

Example error response:

{
  "error": {
    "code": 503,
    "message": "The service is currently unavailable.",
    "status": "UNAVAILABLE"
  }
}
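Both error bodies shown here share the same shape (`error.code` 503, `error.status` "UNAVAILABLE"), so one check can cover the Gemini API and Vertex AI responses. A small sketch, assuming the body arrives as a JSON string; `is_service_unavailable` is a hypothetical helper name:

```python
import json


def is_service_unavailable(body: str) -> bool:
    """Return True if body is an error payload with code 503 or status UNAVAILABLE."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return False  # not JSON at all, so not the error shape shown above
    if not isinstance(payload, dict):
        return False
    error = payload.get("error")
    if not isinstance(error, dict):
        return False
    return error.get("code") == 503 or error.get("status") == "UNAVAILABLE"
```

Logging these separately from slow but successful responses makes it easier to tell overload errors apart from the latency problem when reporting the issue.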

What is your account level? If you are using your API keys at the project level on the free trial, you need to upgrade to Tier 1. I had the same issue and resolved it that way. But it’s still slow compared to a week ago.

I have actually been a paid user for a long time and was already on Tier 1; I just upgraded to Tier 2. But this is a Gemini API concept, right? The tier concept does not exist in Vertex AI.