Hi Google AI Team,
I am requesting a Dynamic Shared Quota (DSQ) allocation increase for gemini-2.5-flash-image-ga on Vertex AI.
Project details:
- GCP Project ID: project-76407e5f-0467-4031-…
- Model: gemini-2.5-flash-image (gemini-2.5-flash-image-ga)
- Service: aiplatform.googleapis.com
- Regions needed: us-central1, us-east1, us-east4, us-west1, europe-west4, global
- Billing: Enabled (active paid account)
- Vertex AI API: Enabled
Current error:
429 RESOURCE_EXHAUSTED: Quota exceeded for
aiplatform.googleapis.com/online_prediction_requests_per_base_model
with base model: gemini-2.5-flash-image
I have confirmed that no editable quota row exists for this model (it uses
DSQ), and the support cases tab is blocked on my support tier.
Use case:
I am building a production AI wardrobe app (REMEMBR) that generates
flat-lay clothing icons using gemini-2.5-flash-image in image-edit mode.
Each user scan processes 5–15 items concurrently across the 6 endpoints
listed above (five regions plus global).
I need approximately 60–100 RPM per region to support 20–50 concurrent
users scanning simultaneously. I am currently hitting 429 errors on any
burst above 2–3 parallel requests, which causes 13–16 minute scan delays
and a fallback to lower-quality text-to-image generation.
Requested allocation: 100 requests per minute per region across
us-central1, us-east1, us-east4, us-west1, europe-west4, and the global endpoint.
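For reference, here is a rough sizing sketch in Python behind the numbers above. It is an estimate under stated assumptions, not measured data: it assumes one request per clothing item, that all users scan within the same minute, and that traffic spreads evenly over the six endpoints listed under "Regions needed". The function and constant names are hypothetical and chosen only for illustration.

```python
# Rough burst-sizing estimate. All names here are illustrative assumptions.
ENDPOINTS = [
    "us-central1", "us-east1", "us-east4",
    "us-west1", "europe-west4", "global",
]
REQUESTED_RPM_PER_ENDPOINT = 100  # the allocation requested above


def burst_requests(users: int, items_per_scan: int) -> int:
    """Requests in one burst, assuming one request per item."""
    return users * items_per_scan


def drain_minutes(requests: int, rpm_per_endpoint: int, endpoints: int) -> float:
    """Minutes to clear a burst if load spreads evenly across endpoints."""
    return requests / (rpm_per_endpoint * endpoints)


low = burst_requests(20, 5)     # 20 users x 5 items
high = burst_requests(50, 15)   # 50 users x 15 items
total_rpm = REQUESTED_RPM_PER_ENDPOINT * len(ENDPOINTS)

print(f"burst size: {low} to {high} requests")
print(f"aggregate requested capacity: {total_rpm} RPM")
print(f"worst-case drain time: {drain_minutes(high, REQUESTED_RPM_PER_ENDPOINT, len(ENDPOINTS)):.2f} min")
```

Under these assumptions, the largest burst (750 requests) would clear in about 1.25 minutes at 600 RPM aggregate, compared with the 13–16 minute delays I see today.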
Thank you