Following the pattern from a similar thread here (allowlist request for gemini-3.1-flash-tts-preview), I’m hitting a quota issue that looks like a provisioning gap rather than an actual limit.
Project details
- Billing: Active, confirmed Tier 2 · Postpay at ai.dev/rate-limit
- Model: `gemini-3.1-flash-tts-preview`
- API: Gemini Developer API, `generateContent` with `responseModalities: ["AUDIO"]`
Issue
My project shows Tier 2 in the AI Studio dashboard, but TTS requests keep failing with a 429 citing the free-tier bucket:
Quota exceeded for metric: generativelanguage.googleapis.com/generate_content_free_tier_requests,
limit: 3, model: gemini-3.1-flash-tts
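In case it helps with triage, here is a minimal sketch of how I check which bucket a 429 is citing. It only parses the message text quoted above. The function name `parse_quota_error` and the regex are my own, and I'm assuming the message keeps the `metric: ..., limit: ..., model: ...` shape shown here.

```python
import re

# Matches the "metric: ..., limit: ..., model: ..." portion of the 429 message.
# The pattern is an assumption based on the single message quoted above.
QUOTA_RE = re.compile(
    r"metric:\s*(?P<metric>[^,\s]+),\s*"
    r"limit:\s*(?P<limit>\d+),\s*"
    r"model:\s*(?P<model>[\w.\-]+)"
)


def parse_quota_error(message: str) -> dict:
    """Extract the quota metric, limit, and model from a 429 message."""
    match = QUOTA_RE.search(message)
    if match is None:
        raise ValueError("message does not look like a quota error")
    metric = match.group("metric")
    return {
        # Drop the service prefix, keep only the bucket name.
        "metric": metric.rsplit("/", 1)[-1],
        "limit": int(match.group("limit")),
        "model": match.group("model"),
        "is_free_tier": "free_tier" in metric,
    }


MESSAGE = (
    "Quota exceeded for metric: "
    "generativelanguage.googleapis.com/generate_content_free_tier_requests, "
    "limit: 3, model: gemini-3.1-flash-tts"
)

if __name__ == "__main__":
    print(parse_quota_error(MESSAGE))
```

For the message above, `is_free_tier` comes back `True`, which is the symptom I'm reporting.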
I pulled the raw quota config directly (`gcloud alpha services quota list --service=generativelanguage.googleapis.com --consumer=projects/kt-cloud-437816`) and found that both buckets exist for this model on my project:
| Metric | effectiveLimit |
|---|---|
| generate_content_free_tier_requests | 3 (this is the one firing) |
| generate_content_paid_tier_2_requests | 1000 (matches my dashboard, not being used) |
So the correct paid-tier bucket is provisioned, but the old free-tier one seems to still be enforced in parallel instead of being superseded after the tier upgrade.
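To make the mismatch concrete, here is a rough sketch of the comparison I'm doing by hand. The two limits are hard-coded from the quota listing above, so nothing here calls the API. The helper `diagnose` is hypothetical, and treating "the bucket with the highest limit" as the one that should apply is my own assumption about how tiers are meant to supersede each other.

```python
# effectiveLimit values copied by hand from the quota listing; not fetched live.
BUCKETS = {
    "generate_content_free_tier_requests": 3,
    "generate_content_paid_tier_2_requests": 1000,
}


def diagnose(buckets: dict, fired_metric: str) -> dict:
    """Compare the bucket cited in the 429 with the highest provisioned bucket."""
    # Assumption: the bucket with the largest limit is the one that should apply.
    expected = max(buckets, key=buckets.get)
    return {
        "fired": fired_metric,
        "fired_limit": buckets[fired_metric],
        "expected": expected,
        "expected_limit": buckets[expected],
        "mismatch": fired_metric != expected,
    }


if __name__ == "__main__":
    report = diagnose(BUCKETS, "generate_content_free_tier_requests")
    for key, value in report.items():
        print(f"{key}: {value}")
```

With the values from my project, `mismatch` is `True`: the 429 cites the 3-request bucket while a 1000-request bucket is provisioned for the same model.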
Ask
Could someone take a look at why the free-tier bucket is still active for this model/project, and get the paid_tier_2 bucket (1000 RPM) to actually apply? Happy to share request IDs / more logs if useful.