I think I'm banned, but there's no message from Google? AI Studio has been down for me for 6 hours now. I thought it was my video that killed it, but starting a new chat just generates 'Permission denied'. Is something wrong? Am I banned for the long videos?

It doesn’t make sense now because the post won’t let me show what led to this. Longer videos crashed it, but the videos were WELL within the limits, and the token counter had been failing several posts prior. Around 400,000 tokens fail to count, and then eventually, if you insert any image or media, BOOM, internal error. HOWEVER, now all day I can’t start NEW chats either, on ANY model! Am I banned?

Hi @Kyle_Hill
Welcome to the Google AI Forum!

Thank you for reaching out!
This seems to be a temporary issue. Please try the following steps:

  1. Refresh the AI Studio page or reopen it in a new browser window.
  2. Clear browser cache and cookies, then sign in again.
  3. Make sure you’re using a supported browser.

Now I’m getting rate limits even when my text is short. As for the first post here: I had to wait 24 hours and essentially start a new chat. The chat got poisoned somehow. I didn’t do anything unusual, just shared things I’ve noticed and learned in life, especially the hard way this year. Too bad I can’t post my chat logs here, as they are very interesting. :frowning: To make it brief: https://www.youtube.com/watch?v=MfZebeCEGKQ&pp=ygUjVGhlIG1vcmUgd2UgY3JhbmsgdGhlIGhhbmRsZSBiYXJuZXnSBwkJFQoBhyohjO8%3D (The More We Crank The Handle song) I had to learn it too, as this year I got the cream sprayed all over my face but didn’t get to the ice cream part of it.

I’ve learned how to ‘crank the handle’ to get to the ice cream and not throw a fit about it. I didn’t know whether to start a new thread for the rate limits and risk polluting the forum, or post it here, since it’s part of the internal ‘issues’ I’ve been having recently since the launch of Gemini 3, or more like the week before, so I knew it was coming very soon, unlike the riff-raff on r/Gemini who throw pie everywhere and see what sticks. DO NOT walk in there without a hazmat suit!

Hello,

This may have occurred because the quota limit has been reached. Please refer to the documentation to confirm your current usage.

Nothing there about quotas. I also uploaded both the text, as best I could, and screenshots of the article.

Gemini API


Gemini 3 Developer Guide

Gemini 3 is our most intelligent model family to date, built on a foundation of state-of-the-art reasoning. It is designed to bring any idea to life by mastering agentic workflows, autonomous coding, and complex multimodal tasks. This guide covers key features of the Gemini 3 model family and how to get the most out of it.

Try Gemini 3 Pro for free

Explore our collection of Gemini 3 apps to see how the model handles advanced reasoning, autonomous coding, and complex multimodal tasks.

Get started with a few lines of code:

Python

from google import genai

client = genai.Client()

response = client.models.generate_content(
    model="gemini-3-pro-preview",
    contents="Find the race condition in this multi-threaded C++ snippet: [code here]",
)

print(response.text)

Meet Gemini 3

Gemini 3 Pro is the first model in the new series. gemini-3-pro-preview is best for complex tasks that require broad world knowledge and advanced reasoning across modalities.

Model ID: gemini-3-pro-preview
  Context Window (In / Out): 1M / 64k
  Knowledge Cutoff: Jan 2025
  Pricing (Input / Output)*: $2 / $12 (<200k tokens); $4 / $18 (>200k tokens)

Model ID: gemini-3-pro-image-preview
  Context Window (In / Out): 65k / 32k
  Knowledge Cutoff: Jan 2025
  Pricing (Input / Output)*: $2 (Text Input) / $0.134 (Image Output)**

* Pricing is per 1 million tokens unless otherwise noted. ** Image pricing varies by resolution. See the pricing page for details.

For detailed rate limits, batch pricing, and additional information, see the models page.

New API features in Gemini 3

Gemini 3 introduces new parameters designed to give developers more control over latency, cost, and multimodal fidelity.

Thinking level

Gemini 3 Pro uses dynamic thinking by default to reason through prompts. You can use the thinking_level parameter, which controls the maximum depth of the model’s internal reasoning process before it produces a response. Gemini 3 treats these levels as relative allowances for thinking rather than strict token guarantees.

If thinking_level is not specified, Gemini 3 Pro will default to high. For faster, lower-latency responses when complex reasoning isn’t required, you can constrain the model’s thinking level to low.

  • low: Minimizes latency and cost. Best for simple instruction following, chat, or high-throughput applications

  • medium: Currently not supported

  • high (Default): Maximizes reasoning depth. The model may take significantly longer to reach a first token, but the output will be more carefully reasoned.

Python

from google import genai
from google.genai import types

client = genai.Client()

response = client.models.generate_content(
    model="gemini-3-pro-preview",
    contents="How does AI work?",
    config=types.GenerateContentConfig(
        thinking_config=types.ThinkingConfig(thinking_level="low")
    ),
)

print(response.text)

Important: You cannot use both thinking_level and the legacy thinking_budget parameter in the same request. Doing so will return a 400 error.

Media resolution

Gemini 3 introduces granular control over multimodal vision processing via the media_resolution parameter. Higher resolutions improve the model’s ability to read fine text or identify small details, but increase token usage and latency. The media_resolution parameter determines the maximum number of tokens allocated per input image or video frame.

You can now set the resolution to media_resolution_low, media_resolution_medium, or media_resolution_high per individual media part or globally (via generation_config). If unspecified, the model uses optimal defaults based on the media type.

Recommended settings

  • Images: media_resolution_high (1120 tokens). Recommended for most image analysis tasks to ensure maximum quality.

  • PDFs: media_resolution_medium (560 tokens). Optimal for document understanding; quality typically saturates at medium. Increasing to high rarely improves OCR results for standard documents.

  • Video (general): media_resolution_low or media_resolution_medium (70 tokens per frame). For video, low and medium are treated identically (70 tokens) to optimize context usage; this is sufficient for most action recognition and description tasks.

  • Video (text-heavy): media_resolution_high (280 tokens per frame). Required only when the use case involves reading dense text (OCR) or small details within video frames.

Note: The media_resolution parameter maps to different token counts depending on the input type. While images scale linearly (media_resolution_low: 280, media_resolution_medium: 560, media_resolution_high: 1120), Video is compressed more aggressively. For Video, both media_resolution_low and media_resolution_medium are capped at 70 tokens per frame, and media_resolution_high is capped at 280 tokens. See full details here
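The note above can be restated as a lookup table. This is purely illustrative (the constants below are not part of the SDK); the values simply echo the documented per-part token caps:

```python
# Maximum token cost per media part, keyed by (media type, media_resolution
# level). Images scale linearly with the level, while video is compressed
# more aggressively: low and medium both cap at 70 tokens per frame.
MEDIA_RESOLUTION_TOKENS = {
    ("image", "media_resolution_low"): 280,
    ("image", "media_resolution_medium"): 560,
    ("image", "media_resolution_high"): 1120,
    ("video", "media_resolution_low"): 70,
    ("video", "media_resolution_medium"): 70,
    ("video", "media_resolution_high"): 280,
}

def tokens_for(media_type: str, level: str) -> int:
    """Return the maximum token allocation for one media part."""
    return MEDIA_RESOLUTION_TOKENS[(media_type, level)]
```

This makes it easy to estimate context-window usage before uploading, e.g. a 500-frame video at the default low/medium setting costs roughly 500 × 70 = 35,000 tokens.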

Python

from google import genai
from google.genai import types
import base64

# The media_resolution parameter is currently only available in the v1alpha API version.
client = genai.Client(http_options={'api_version': 'v1alpha'})

response = client.models.generate_content(
    model="gemini-3-pro-preview",
    contents=[
        types.Content(
            parts=[
                types.Part(text="What is in this image?"),
                types.Part(
                    inline_data=types.Blob(
                        mime_type="image/jpeg",
                        data=base64.b64decode("..."),
                    ),
                    media_resolution={"level": "media_resolution_high"}
                )
            ]
        )
    ]
)

print(response.text)

Temperature

For Gemini 3, we strongly recommend keeping the temperature parameter at its default value of 1.0.

While previous models often benefited from tuning temperature to control creativity versus determinism, Gemini 3’s reasoning capabilities are optimized for the default setting. Changing the temperature (setting it below 1.0) may lead to unexpected behavior, such as looping or degraded performance, particularly in complex mathematical or reasoning tasks.

Thought signatures

Gemini 3 uses Thought signatures to maintain reasoning context across API calls. These signatures are encrypted representations of the model’s internal thought process. To ensure the model maintains its reasoning capabilities, you must return these signatures to the model in your request exactly as they were received:

  • Function Calling (Strict): The API enforces strict validation on the “Current Turn”. Missing signatures will result in a 400 error.

  • Text/Chat: Validation is not strictly enforced, but omitting signatures will degrade the model’s reasoning and answer quality.

  • Image generation/editing (Strict): The API enforces strict validation on all Model parts including a thoughtSignature. Missing signatures will result in a 400 error.

Success: If you use the official SDKs (Python, Node, Java) and standard chat history, Thought Signatures are handled automatically. You do not need to manually manage these fields.

Function calling (strict validation)

When Gemini generates a functionCall, it relies on the thoughtSignature to process the tool’s output correctly in the next turn. The “Current Turn” includes all Model (functionCall) and User (functionResponse) steps that occurred since the last standard User text message.

  • Single Function Call: The functionCall part contains a signature. You must return it.

  • Parallel Function Calls: Only the first functionCall part in the list will contain the signature. You must return the parts in the exact order received.

  • Multi-Step (Sequential): If the model calls a tool, receives a result, and calls another tool (within the same turn), both function calls have signatures. You must return all accumulated signatures in the history.

Text and streaming

For standard chat or text generation, the presence of a signature is not guaranteed.

  • Non-Streaming: The final content part of the response may contain a thoughtSignature, though it is not always present. If one is returned, you should send it back to maintain best performance.

  • Streaming: If a signature is generated, it may arrive in a final chunk that contains an empty text part. Ensure your stream parser checks for signatures even if the text field is empty.

Image generation and editing

For gemini-3-pro-image-preview, thought signatures are critical for conversational editing. When you ask the model to modify an image it relies on the thoughtSignature from the previous turn to understand the composition and logic of the original image.

  • Editing: Signatures are guaranteed on the first part after the thoughts of the response (text or inlineData) and on every subsequent inlineData part. You must return all of these signatures to avoid errors.

Code examples

The guide includes full code listings for:

  • Multi-step Function Calling (Sequential)
  • Parallel Function Calling
  • Text/In-Context Reasoning (No Validation)
  • Image Generation & Editing

Migrating from other models

If you are transferring a conversation trace from another model (e.g., Gemini 2.5) or injecting a custom function call that was not generated by Gemini 3, you will not have a valid signature.

To bypass strict validation in these specific scenarios, populate the field with this specific dummy string: "thoughtSignature": "context_engineering_is_the_way_to_go"
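For example, a hypothetical helper that patches a replayed REST-style history before resending it (the functionCall/thoughtSignature keys mirror the API's JSON shape; the helper itself is illustrative, not part of the SDK):

```python
# Documented dummy value that bypasses strict signature validation
# for traces not generated by Gemini 3.
DUMMY_SIGNATURE = "context_engineering_is_the_way_to_go"

def patch_missing_signatures(history):
    """Add the dummy thoughtSignature to any functionCall part that
    lacks one, e.g. a conversation trace migrated from Gemini 2.5."""
    for content in history:
        for part in content.get("parts", []):
            if "functionCall" in part and "thoughtSignature" not in part:
                part["thoughtSignature"] = DUMMY_SIGNATURE
    return history
```

Parts that already carry a real signature are left untouched, so the same helper can run safely over mixed histories.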

Structured Outputs with tools

Gemini 3 allows you to combine Structured Outputs with built-in tools, including Grounding with Google Search, URL Context, and Code Execution.

Python

from google import genai
from google.genai import types
from pydantic import BaseModel, Field
from typing import List

class MatchResult(BaseModel):
    winner: str = Field(description="The name of the winner.")
    final_match_score: str = Field(description="The final match score.")
    scorers: List[str] = Field(description="The names of the scorers.")

client = genai.Client()

response = client.models.generate_content(
    model="gemini-3-pro-preview",
    contents="Search for all details for the latest Euro.",
    config={
        "tools": [
            {"google_search": {}},
            {"url_context": {}}
        ],
        "response_mime_type": "application/json",
        "response_json_schema": MatchResult.model_json_schema(),
    },  
)

result = MatchResult.model_validate_json(response.text)
print(result)

Image generation

Gemini 3 Pro Image lets you generate and edit images from text prompts. It uses reasoning to “think” through a prompt and can retrieve real-time data, such as weather forecasts or stock charts, via Google Search grounding before generating high-fidelity images.

New & improved capabilities:

  • 4K & text rendering: Generate sharp, legible text and diagrams, with support for 2K and 4K resolutions.

  • Grounded generation: Use the google_search tool to verify facts and generate imagery based on real-world information.

  • Conversational editing: Multi-turn image editing by simply asking for changes (e.g., “Make the background a sunset”). This workflow relies on Thought Signatures to preserve visual context between turns.

For complete details on aspect ratios, editing workflows, and configuration options, see the Image Generation guide.

Python

from google import genai
from google.genai import types

client = genai.Client()

response = client.models.generate_content(
    model="gemini-3-pro-image-preview",
    contents="Generate an infographic of the current weather in Tokyo.",
    config=types.GenerateContentConfig(
        tools=[{"google_search": {}}],
        image_config=types.ImageConfig(
            aspect_ratio="16:9",
            image_size="4K"
        )
    )
)

image_parts = [part for part in response.parts if part.inline_data]

if image_parts:
    image = image_parts[0].as_image()
    image.save('weather_tokyo.png')
    image.show()


Migrating from Gemini 2.5

Gemini 3 is our most capable model family to date and offers a stepwise improvement over Gemini 2.5 Pro. When migrating, consider the following:

  • Thinking: If you were previously using complex prompt engineering (like Chain-of-thought) to force Gemini 2.5 to reason, try Gemini 3 with thinking_level: "high" and simplified prompts.

  • Temperature settings: If your existing code explicitly sets temperature (especially to low values for deterministic outputs), we recommend removing this parameter and using the Gemini 3 default of 1.0 to avoid potential looping issues or performance degradation on complex tasks.

  • PDF & document understanding: Default OCR resolution for PDFs has changed. If you relied on specific behavior for dense document parsing, test the new media_resolution_high setting to ensure continued accuracy.

  • Token consumption: Migrating to Gemini 3 Pro defaults may increase token usage for PDFs but decrease token usage for video. If requests now exceed the context window due to higher default resolutions, we recommend explicitly reducing the media resolution.

  • Image segmentation: Image segmentation capabilities (returning pixel-level masks for objects) are not supported in Gemini 3 Pro. For workloads requiring native image segmentation, we recommend continuing to utilize Gemini 2.5 Flash with thinking turned off or Gemini Robotics-ER 1.5.
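The temperature and thinking notes above can be sketched as a hypothetical config-migration helper. The snake_case dict keys mirror the request fields discussed in this guide; adapt the shape to however your application stores its request config:

```python
def migrate_config(legacy: dict) -> dict:
    """Hypothetical helper: adapt a Gemini 2.5-style request config
    for Gemini 3 per the migration notes above."""
    cfg = dict(legacy)  # leave the caller's original config untouched
    # Gemini 3 performs best at the default temperature of 1.0, so
    # drop explicit temperature overrides (especially low values).
    cfg.pop("temperature", None)
    # Prefer thinking_level over the legacy thinking_budget; sending
    # both in one request returns a 400 error.
    if cfg.pop("thinking_budget", None) is not None:
        cfg.setdefault("thinking_level", "high")
    return cfg
```

Unrelated settings pass through unchanged, so the helper can be applied uniformly to every outgoing request during a migration.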

OpenAI compatibility

For users utilizing the OpenAI compatibility layer, standard parameters are automatically mapped to Gemini equivalents:

  • reasoning_effort (OAI) maps to thinking_level (Gemini). Note that reasoning_effort medium maps to thinking_level high.
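That mapping can be written down as a small lookup table. Only the medium-to-high mapping is stated in the docs; the identity mappings for "low" and "high" below are assumptions:

```python
# reasoning_effort (OpenAI compatibility layer) -> thinking_level (Gemini).
# Only "medium" -> "high" is documented (medium is not supported natively
# by Gemini 3 Pro); the other two entries are assumed identity mappings.
EFFORT_TO_THINKING_LEVEL = {
    "low": "low",
    "medium": "high",
    "high": "high",
}

def to_thinking_level(reasoning_effort: str) -> str:
    """Translate an OpenAI-style reasoning_effort to a thinking_level."""
    return EFFORT_TO_THINKING_LEVEL[reasoning_effort]
```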

Prompting best practices

Gemini 3 is a reasoning model, which changes how you should prompt.

  • Precise instructions: Be concise in your input prompts. Gemini 3 responds best to direct, clear instructions. It may over-analyze verbose or overly complex prompt engineering techniques used for older models.

  • Output verbosity: By default, Gemini 3 is less verbose and prefers providing direct, efficient answers. If your use case requires a more conversational or “chatty” persona, you must explicitly steer the model in the prompt (e.g., “Explain this as a friendly, talkative assistant”).

  • Context management: When working with large datasets (e.g., entire books, codebases, or long videos), place your specific instructions or questions at the end of the prompt, after the data context. Anchor the model’s reasoning to the provided data by starting your question with a phrase like, “Based on the information above…”.
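The context-management advice above amounts to a simple ordering rule, sketched here as a hypothetical helper (build_long_context_prompt is not an SDK function):

```python
def build_long_context_prompt(document_text: str, question: str) -> list[str]:
    """Place the large data context first and the specific question last,
    anchored with the recommended 'Based on the information above' phrase."""
    return [
        document_text,
        f"Based on the information above, {question}",
    ]
```

The returned list can be passed directly as the contents argument of generate_content.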

Learn more about prompt design strategies in the prompt engineering guide.

FAQ

  1. What is the knowledge cutoff for Gemini 3 Pro? Gemini 3 has a knowledge cutoff of January 2025. For more recent information, use the Search Grounding tool.

  2. What are the context window limits? Gemini 3 Pro supports a 1 million token input context window and up to 64k tokens of output.

  3. Is there a free tier for Gemini 3 Pro? You can try the model for free in Google AI Studio, but currently, there is no free tier available for gemini-3-pro-preview in the Gemini API.

  4. Will my old thinking_budget code still work? Yes, thinking_budget is still supported for backward compatibility, but we recommend migrating to thinking_level for more predictable performance. Do not use both in the same request.

  5. Does Gemini 3 support the Batch API? Yes, Gemini 3 supports the Batch API.

  6. Is Context Caching supported? Yes, Context Caching is supported for Gemini 3. The minimum token count required to initiate caching is 2,048 tokens.

  7. Which tools are supported in Gemini 3? Gemini 3 supports Google Search, File Search, Code Execution, and URL Context. It also supports standard Function Calling for your own custom tools. Please note that Grounding with Google Maps and Computer Use are currently not supported.

Last updated 2025-11-29 UTC.

For that matter, I’m in Salem, Oregon, and this is my actual weather card. Anyway, nothing there, nor on the model pages in AI Studio, says anything about quota rates. I have NO idea what they are. It needs to be clear what the usage rates are, like how many messages per day/hour or something.