The input TPM limit on the models (except 2.0 Flash, which is 1M) is 250K on the free tier.
Does that mean that context length via the API on the free tier can never exceed 250K? Does the history in a request count towards the TPM quota?
Further testing showed that I can go over 250K context without triggering the input TPM limit.
How is the TPM limit calculated, then? I'm really confused.
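To make the history part of the question concrete: in a multi-turn chat the full history is re-sent with every request, so if history counted towards TPM, the prompt token count should grow each turn. A minimal sketch of how one could check (the API key and model name are placeholders):

from google import genai

client = genai.Client(api_key="YOUR_API_KEY")  # placeholder key

# Each send_message() re-sends the prior turns as part of the request.
chat = client.chats.create(model="gemini-2.5-flash")

for turn in ["First message.", "Second message.", "Third message."]:
    response = chat.send_message(turn)
    # prompt_token_count covers everything sent, including earlier turns,
    # so it should climb from turn to turn if history is counted.
    print(turn, "->", response.usage_metadata.prompt_token_count)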
Hello,
As you mentioned, the free tier limit is 250K tokens per minute, as specified in the rate limit documentation. You should get an error once this limit is exceeded.
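For reference, the limit surfaces as an API error on the request; a minimal sketch of catching it (assuming the SDK's error classes; the key, model, and contents are placeholders):

from google import genai
from google.genai import errors

client = genai.Client(api_key="YOUR_API_KEY")  # placeholder key

try:
    response = client.models.generate_content(
        model="gemini-2.5-flash", contents=["..."]
    )
except errors.ClientError as e:
    # e.code would be 429 if a rate limit such as TPM was hit.
    print(e.code, e.message)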
Yes, thanks, I’ve read the documentation.
However, I can process well over 250K tokens with the 2.5 models on the free key (which, according to the docs, is limited to 250K TPM) in a single request.
Clearly, that means that either the 250K TPM limit is not enforced as documented, or TPM is calculated differently than I assume.
The screenshot shows that the video (the museum example from AI Studio) well exceeds 250K tokens and was prompted in a single request.
Would you mind using the token counting methods below to compare the results? I see you’re already using method 2.
Method 1:
client.models.count_tokens(
    model="model_name", contents=[your_content]
)

Method 2:
response = client.models.generate_content(
    model="model_name", contents=[your_content]
)
print(response.usage_metadata)
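For example, both methods can be run on the same contents and compared side by side (a sketch; the key, model name, and prompt are placeholders):

from google import genai

client = genai.Client(api_key="YOUR_API_KEY")  # placeholder key
contents = ["Explain tokenization in one sentence."]

# Method 1: count tokens without generating anything.
count = client.models.count_tokens(model="gemini-2.5-flash", contents=contents)
print(count.total_tokens)

# Method 2: generate, then read the token usage reported by the API.
response = client.models.generate_content(model="gemini-2.5-flash", contents=contents)
print(response.usage_metadata.prompt_token_count)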
Sure.
Just as before, the 30-minute example video from AI Studio was used (American Museum of Natural History Tour - 30m - Google for Developers (360p, h264).mp4).
Explore a natural history museum: towering dinosaur skeletons, detailed animal dioramas, and diverse exhibits on evolution and geology.
Usage metadata:
cache_tokens_details=None
cached_content_token_count=None
candidates_token_count=23
candidates_tokens_details=None
prompt_token_count=531016
prompt_tokens_details=[
    ModalityTokenCount(modality=<MediaModality.TEXT: 'TEXT'>, token_count=16),
    ModalityTokenCount(modality=<MediaModality.VIDEO: 'VIDEO'>, token_count=473400),
    ModalityTokenCount(modality=<MediaModality.AUDIO: 'AUDIO'>, token_count=57600)
]
thoughts_token_count=304
tool_use_prompt_token_count=None
tool_use_prompt_tokens_details=None
total_token_count=531343
traffic_type=None
Token counting: total_tokens=531016 cached_content_token_count=None
Code used:
# Run in a Colab notebook.
# %pip install -U -q "google-genai>=1.16.0"
# !wget [link truncated] -O huge1.mp4 -q
from google import genai
from IPython.display import Markdown
import time
from google.colab import userdata

GOOGLE_API_KEY = userdata.get('GOOGLE_API_KEY')
MODEL_ID = "gemini-2.5-flash"
client = genai.Client(api_key=GOOGLE_API_KEY)

def upload_video(video_file_name):
    # Upload via the Files API, then poll until processing finishes.
    video_file = client.files.upload(file=video_file_name)
    while video_file.state == "PROCESSING":
        print('Waiting for video to be processed.')
        time.sleep(10)
        video_file = client.files.get(name=video_file.name)
    if video_file.state == "FAILED":
        raise ValueError(video_file.state)
    print('Video processing complete: ' + video_file.uri)
    return video_file

huge_video = upload_video('huge1.mp4')

prompt = "Summarize this video, be short, up to 20 words."
video = huge_video
response = client.models.generate_content(
    model=MODEL_ID,
    contents=[
        video,
        prompt,
    ]
)
tokens = client.models.count_tokens(model=MODEL_ID, contents=[video, prompt])
Markdown(f"{response.text}\n\nUsage metadata: {response.usage_metadata}\n\nToken counting: {tokens}")
Hello,
We reproduced your code and had the same observations as you. We will discuss this with our internal team and get back to you with more information.
Thank you for your patience.