I’m seeing a quota/billing mismatch on the Gemini Developer API.
Environment
- Platform: Google AI Studio / Gemini Developer API
- Billing: Paid, Tier 2, Postpay
- Project: National Niner Gemini DEV
- Project type: normal manually created GCP project, then imported into AI Studio
- API key created in that paid project
- Region: global
- Client: Python via LangChain ChatGoogleGenerativeAI
- Reproduces across fresh keys and fresh projects
Problem
Every generateContent call fails with 429 RESOURCE_EXHAUSTED. The quota violations in the error reference FREE TIER metrics with limit 0, even though the project is on paid Tier 2.
Example error
- quotaMetric: generativelanguage.googleapis.com/generate_content_free_tier_input_token_count
- quotaMetric: generativelanguage.googleapis.com/generate_content_free_tier_requests
- limit: 0
Models tested
- gemini-3-pro-preview / gemini-3.1-pro
- gemini-2.5-pro
Both fail with the same free-tier "limit: 0" error.
What I already tried
- Verified the key is from the exact paid Tier 2 Postpay project in AI Studio.
- Created a brand-new non-autogenerated GCP project in Cloud Console.
- Linked the billing account to that project.
- Imported that project into AI Studio.
- Created a fresh API key in that imported paid project.
- Retried with different models.
- Reduced local request rate and retried.
- Reproduced on the first call, so this does not appear to be burst/rate behavior.
Observed result
Requests are still evaluated against:
- GenerateRequestsPerMinutePerProjectPerModel-FreeTier
- GenerateRequestsPerDayPerProjectPerModel-FreeTier
- GenerateContentInputTokensPerModelPerMinute-FreeTier
- GenerateContentInputTokensPerModelPerDay-FreeTier
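To make the free-tier evaluation easy to verify, the quota violations can be pulled out of the 429 response body programmatically. The sketch below assumes the error JSON follows the usual google.rpc layout (an "error.details" list containing a QuotaFailure entry with "violations"). The helper name free_tier_violations and the SAMPLE_ERROR payload are illustrative only: the sample is assembled from the metric and quota names quoted in this post, the pairing of each metric with a quota ID is an assumption, and it is not a captured log.

```python
QUOTA_FAILURE_TYPE = "type.googleapis.com/google.rpc.QuotaFailure"


def free_tier_violations(error_body):
    """Return (quotaMetric, quotaId) pairs that reference the free tier.

    Assumes the google.rpc error layout: error -> details -> violations.
    """
    details = error_body.get("error", {}).get("details", [])
    hits = []
    for detail in details:
        # Skip RetryInfo, Help, and any other non-quota detail entries.
        if detail.get("@type") != QUOTA_FAILURE_TYPE:
            continue
        for violation in detail.get("violations", []):
            metric = violation.get("quotaMetric", "")
            quota_id = violation.get("quotaId", "")
            if "free_tier" in metric or "FreeTier" in quota_id:
                hits.append((metric, quota_id))
    return hits


# Illustrative payload only: built from names quoted in this post,
# with an assumed metric/quota ID pairing. Not a real captured response.
SAMPLE_ERROR = {
    "error": {
        "code": 429,
        "status": "RESOURCE_EXHAUSTED",
        "details": [
            {
                "@type": QUOTA_FAILURE_TYPE,
                "violations": [
                    {
                        "quotaMetric": "generativelanguage.googleapis.com/generate_content_free_tier_requests",
                        "quotaId": "GenerateRequestsPerDayPerProjectPerModel-FreeTier",
                    },
                    {
                        "quotaMetric": "generativelanguage.googleapis.com/generate_content_free_tier_input_token_count",
                        "quotaId": "GenerateContentInputTokensPerModelPerMinute-FreeTier",
                    },
                ],
            },
            {"@type": "type.googleapis.com/google.rpc.RetryInfo"},
        ],
    }
}

if __name__ == "__main__":
    for metric, quota_id in free_tier_violations(SAMPLE_ERROR):
        print(f"{quota_id}: {metric}")
```

If a paid project were being evaluated correctly, this helper would be expected to return an empty list for a real 429 body, since no free-tier metric should appear in the violations.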
Expected result
A paid Tier 2 Postpay project should use paid quota buckets, not free-tier buckets with limit 0.
Question
Can someone from Google confirm whether this is a backend billing/quota sync issue for the project, and what the fix is? If needed, I can provide:
- project ID privately
- minimal direct API repro outside LangChain
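For reference, a minimal direct repro outside LangChain might look like the sketch below. It uses only the Python standard library and posts to the public generateContent REST endpoint with the key in the x-goog-api-key header. The environment variable name GEMINI_API_KEY, the helper names build_request and call_once, and the "ping" prompt are assumptions made for illustration.

```python
import json
import os
import urllib.error
import urllib.request

BASE_URL = "https://generativelanguage.googleapis.com/v1beta"


def build_request(model, api_key, prompt="ping"):
    """Build a single generateContent POST request for the given model."""
    url = f"{BASE_URL}/models/{model}:generateContent"
    body = json.dumps({"contents": [{"parts": [{"text": prompt}]}]}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json", "x-goog-api-key": api_key},
    )


def call_once(model, api_key):
    """Send one request and return (http_status, parsed_body)."""
    try:
        with urllib.request.urlopen(build_request(model, api_key), timeout=30) as resp:
            return resp.status, json.loads(resp.read())
    except urllib.error.HTTPError as err:
        # On a 429 the response body carries the quota violation details.
        raw = err.read().decode("utf-8", errors="replace")
        try:
            return err.code, json.loads(raw)
        except json.JSONDecodeError:
            return err.code, {"raw": raw}


if __name__ == "__main__":
    key = os.environ.get("GEMINI_API_KEY")  # assumed variable name
    if not key:
        print("Set GEMINI_API_KEY to run the live request.")
    else:
        status, payload = call_once("gemini-2.5-pro", key)
        print(status)
        print(json.dumps(payload, indent=2))
```

Because this sends exactly one request with no client-side retries, a 429 on the first call would rule out LangChain retry behavior and local burst rate as the cause.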






