We are reaching out to report a critical and repetitive issue regarding 429 RESOURCE_EXHAUSTED errors when utilizing the Google Search grounding feature.
Despite our project being on Tier 3, we are consistently blocked by this error exclusively for grounding-enabled requests. Meanwhile, our standard (non-grounded) API requests are being served perfectly without any issues.
Issue Details:
-
Error Code:
429 RESOURCE_EXHAUSTED -
Specific Feature: Vertex AI with Google Search Grounding Model 2.5 Pro
-
Current Quota Tier: Tier 3
-
Impact: This is severely impacting our production/development workflow as we are unable to use grounding features reliably.
{ “error”: “429 RESOURCE_EXHAUSTED. {‘error’: {‘code’: 429, ‘message’: ‘Resource exhausted. Please try again later. Please refer to Error code 429 | Generative AI on Vertex AI | Google Cloud Documentation for more details.’, ‘status’: ‘RESOURCE_EXHAUSTED’}}” }
- Under Quota Limits: We have actively monitored our dashboards and confirmed that our request volume is well below our Tier 3 quota limits.
- Feature-Specific: The issue only triggers when grounding is enabled. Standard requests to the same models work flawlessly, suggesting this might be a backend restriction or a separate rate limit specifically tied to the Google Search grounding component rather than our core model QPM/RPM.
We are also retrying with exponential backoffs