Hi,
I’m preparing a new AI companion chat service at my company. We currently have a Gemini API Paid Tier 3 account on Google AI Studio, with an input TPM limit of 8M for Gemini 2.5 Pro.
Due to the nature of our service (high-context character chat with ~13,000 input tokens per turn), we anticipate needing 40–50M input TPM to handle peak traffic at launch (targeting 33,000 MAU).
However, the AI Studio Tier 3 quota edit form caps at 8M and won’t allow a higher value.
On the Vertex AI side, the Standard PayGo Tier 3 documentation shows only 2M TPM for Gemini Pro models, with a note saying “contact your account team for a custom tier” for higher throughput needs.
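For context, here is a rough sketch of the arithmetic behind these figures. It uses only the numbers quoted above (about 13,000 input tokens per turn and the 8M, 40M, and 50M TPM values); the function name is hypothetical and the per-turn token count is an approximation, so treat the results as estimates.

```python
# Approximate input tokens consumed by one chat turn (figure from this post).
TOKENS_PER_TURN = 13_000


def turns_per_minute(input_tpm: int, tokens_per_turn: int = TOKENS_PER_TURN) -> int:
    """Whole chat turns per minute that fit inside an input-TPM quota.

    Hypothetical helper for illustration only; it ignores output tokens,
    retries, and any per-request (RPM) limits.
    """
    return input_tpm // tokens_per_turn


if __name__ == "__main__":
    # Current AI Studio Tier 3 cap, then the low and high ends of our target.
    for label, tpm in [
        ("current cap (8M)", 8_000_000),
        ("target low (40M)", 40_000_000),
        ("target high (50M)", 50_000_000),
    ]:
        print(f"{label}: about {turns_per_minute(tpm)} turns per minute")
```

In other words, the 8M cap supports roughly 600 turns per minute at our context size, while the 40–50M target corresponds to roughly 3,000 to 3,800 turns per minute at peak.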
My questions:
- How do I reach the “account team” referenced in the Vertex AI documentation? I couldn’t find any direct contact channel other than https://cloud.google.com/contact, and the sales manager I was connected to through that form said it wasn’t their department.
- Is there a way to request input TPM beyond 8M on Google AI Studio (Generative Language API), or is Vertex AI migration required for custom tier negotiations?
- If migrating to Vertex AI is necessary, does our existing Tier 3 status on AI Studio carry over, or do we start from Tier 1 on Vertex AI?
Any guidance on the correct escalation path would be greatly appreciated. Thank you.