Question regarding Rate Limits for the "Build with Gemini" feature in AI Studio

Hi Google AI Team,

I have a specific question regarding the runtime usage of the web applications generated via the “Build with Gemini” feature.

I am not asking about the quota used to generate the code. Instead, I want to know about the Gemini API calls made from within the generated web app itself (e.g., if the generated app is a chatbot or a text summarizer that calls model.generateContent).

  1. Quota Source: When running the app in the AI Studio preview, whose quota is being consumed? Does it automatically use my account’s Free Tier quota, or is there a separate “sandbox” quota for these previews?
  2. Rate Limits: What are the specific Rate Limits (RPM/RPD) for these calls inside the generated app? Do they strictly follow the standard limits, or is there a different policy for apps running in the Build environment?

I want to ensure I don’t unexpectedly hit the 429 limit while testing the functionality of the generated app.

Thanks!

Hey,

Hope you’re keeping well.

When you run a web application generated through “Build with Gemini” inside AI Studio, all model invocations such as generateContent or streamGenerateContent are executed using your own project’s quota, not a separate sandbox pool. The preview environment in AI Studio authenticates calls through the same Gemini API key that is linked to your Google Cloud project, so those requests count toward your account’s standard Gemini rate and usage limits. The current limits are documented under Gemini API quotas and rate limits, which include per‑minute and per‑day request ceilings depending on the model family and billing tier. If you exceed these thresholds, the API will return HTTP 429 errors until the quota window resets.

To avoid unexpected throttling while testing, monitor usage on the Quotas page in the Google Cloud Console (Gemini API usage from AI Studio is listed there under the Generative Language API rather than under Vertex AI) and, if needed, request higher limits from that same page. You can also apply exponential backoff or client‑side rate control in your app code to stay under the enforced per‑minute rate.
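
As a rough illustration of the backoff and client-side rate control mentioned above, here is a minimal Python sketch. It is not taken from the Gemini SDK: `RateLimitError`, `call_with_backoff`, and `SlidingWindowLimiter` are hypothetical names, and the retry counts and delays are placeholder values rather than documented limits. You would need to map `RateLimitError` to whatever your client actually raises on HTTP 429.

```python
import random
import time
from collections import deque


class RateLimitError(Exception):
    """Hypothetical stand-in for whatever your client raises on HTTP 429."""


def call_with_backoff(fn, max_retries=5, base_delay=1.0, max_delay=32.0,
                      sleep=time.sleep):
    """Call fn(); on RateLimitError, wait (exponential backoff + jitter) and retry."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_retries:
                raise  # out of retries; let the caller decide what to do
            # Delay doubles each attempt (1s, 2s, 4s, ...) up to max_delay.
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Random jitter spreads out retries from concurrent clients.
            sleep(delay + random.uniform(0, delay / 2))


class SlidingWindowLimiter:
    """Client-side limiter: at most max_calls per `period` seconds."""

    def __init__(self, max_calls, period=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self.clock = clock
        self.sleep = sleep
        self.calls = deque()  # timestamps of recent calls, oldest first

    def acquire(self):
        """Block until another call is allowed, then record it."""
        while True:
            now = self.clock()
            # Forget calls that have left the window.
            while self.calls and now - self.calls[0] >= self.period:
                self.calls.popleft()
            if len(self.calls) < self.max_calls:
                self.calls.append(now)
                return
            # Window is full: wait until the oldest call expires.
            self.sleep(self.period - (now - self.calls[0]))
```

In use, you would call `limiter.acquire()` before each request and wrap the request itself in `call_with_backoff(...)`, with `max_calls` set a little below the per-minute limit that applies to your model and tier. The same pattern carries over to the JavaScript inside a generated app.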

Thanks and regards,
Taz

Hi @Duncan_Lean,

Welcome to the forum!
Thanks for reporting the issue.
In Build mode (AI Studio), there is no separate “Build Mode API” quota; it consumes the standard quota of the specific model being called. All usage is billed or limited based on that model’s tier, which you can track by assigning a separate project and API key to that application.
When you deploy your app to Cloud Run to get a public URL, the deployed app uses your API key for every user's Gemini API calls, so all of that traffic counts against your quota.

Thank you!

What I really want to know is, what API quota am I allowed to use in Build Apps created by other users (or ones I created myself)?

The free quota doesn’t allow the use of Gemini 3.0 Pro, so why can I use it in my app? What is its quota?