Critical: Persistent 429 Resource Exhausted Error on Vertex AI After Migrating from AI Studio

Hi everyone,

I am experiencing a persistent production issue. I recently migrated my agentic workflow from Google AI Studio to Google Vertex AI (GCP) to ensure production-grade stability. My stack consists of the Agno framework, and the application is hosted on Railway.

Despite being on a paid GCP project, I am constantly hitting 429 "Resource has been exhausted" errors, which leaves my customers without service.

Error Log:

[2026-03-16 22:40:11,497: WARNING] Malu returned unexpected content type str for session... "code": 429, "message": "Resource has been exhausted (e.g. check quota).", "status": "RESOURCE_EXHAUSTED"
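
As a stopgap while the quota question is open, 429 responses like the one above are commonly retried with exponential backoff and jitter. The following is only a minimal sketch of that pattern, not something taken from Agno or the Vertex AI SDK: `ResourceExhausted` is a hypothetical stand-in for whatever exception the client actually raises on a 429, and `call_with_backoff` and its parameters are illustrative names.

```python
import random
import time


class ResourceExhausted(Exception):
    """Hypothetical stand-in for the exception raised on a 429 response."""


def call_with_backoff(fn, max_attempts=5, base_delay=1.0, max_delay=30.0,
                      sleep=time.sleep):
    """Call fn(), retrying on ResourceExhausted with exponential backoff.

    The wait before retry number n is drawn uniformly from
    [0, min(max_delay, base_delay * 2**n)] ("full jitter"), so that
    concurrent workers do not all retry at the same moment.
    """
    for attempt in range(max_attempts):
        try:
            return fn()
        except ResourceExhausted:
            # Out of attempts: surface the error to the caller.
            if attempt == max_attempts - 1:
                raise
            cap = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, cap))
```

The model call would be passed in as a zero-argument callable, for example `call_with_backoff(lambda: run_agent(prompt))`, where `run_agent` is again a placeholder. Backoff only smooths over short bursts; it does not help if the sustained request rate is above the quota.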

Key Details:

  • Framework: Agno (my main agent, “Malu”, manages complex tasks).

  • Infrastructure: Hosted on Railway.

  • Context: This started happening more frequently after the migration to Vertex AI.

I need urgent help with:

  1. Quota Increases: Which entries under “Quotas & System Limits” in the GCP Console should I look for in order to raise the RPM/TPM (requests per minute / tokens per minute) limits for Gemini on Vertex AI?

  2. Environment Conflict: Could running on Railway (a cloud hosting platform) affect how Vertex AI throttles my requests?

I am at the point where I will have to migrate to Claude or OpenAI if I can’t stabilize this, as complaints from my users are piling up. Any help would be greatly appreciated.

Best regards,