Hi, I’m experiencing a significant discrepancy between token counting in Google AI Studio versus the Gemini API when processing PDF files.
When I process a PDF file in Google AI Studio, I observe an input token count of approximately 700-800 tokens (around 2,000 total).
When I process the identical file through the Gemini API, using the exact code generated by AI Studio, the input token count is around 1,800-2,000 tokens (roughly 3,100 total).
Is there something I should be doing differently, or some optimization I'm missing? This discrepancy significantly affects our token-usage estimates and costs.
Thanks.
def super_simplified_local_approach():
    import os

    from google import genai
    from google.genai import types

    client = genai.Client(
        api_key=os.environ.get("GEMINI_API_KEY"),
    )

    # Upload the PDF via the Files API
    files = [
        client.files.upload(file="media_test/5ee44681-df51-47e7-9f70-520118f2d568.pdf"),
    ]

    try:
        model = "gemini-2.0-flash"
        contents = [
            types.Content(
                role="user",
                parts=[
                    types.Part.from_uri(
                        file_uri=files[0].uri,
                        mime_type=files[0].mime_type,
                    ),
                    types.Part.from_text(
                        text="""pdf"""
                    ),
                ],
            ),
        ]
        generate_content_config = types.GenerateContentConfig(
            temperature=1,
            top_p=0.95,
            top_k=40,
            max_output_tokens=8192,
            response_mime_type="text/plain",
            system_instruction=[
                types.Part.from_text(
                    text="""MY_PROMPT"""
                ),
            ],
        )
        response = client.models.generate_content(
            model=model,
            contents=contents,
            config=generate_content_config,
        )
        # usage_metadata reports prompt_token_count, candidates_token_count,
        # and total_token_count for the billed request
        print(f"Actual token usage: {response.usage_metadata}")
        return
    except Exception as e:
        print(f"Error: {str(e)}")
        raise
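For reference, the gap may not be something you can optimize away: per the Gemini document-understanding documentation, each PDF page is billed at a fixed 258 tokens (for its image representation) on top of the text tokens in the prompt, so the API-side count scales with page count rather than with visible text. A minimal back-of-the-envelope sketch (the page count and text-token figures below are placeholder assumptions, not values from my file):

```python
# Rough estimator for the API-side input token count of a PDF request.
# Assumes the documented fixed cost of 258 tokens per PDF page; the page
# count and text-token inputs are hypothetical placeholders.

TOKENS_PER_PDF_PAGE = 258  # fixed per-page cost from the Gemini docs

def estimate_pdf_input_tokens(num_pages: int, text_tokens: int) -> int:
    """Estimate prompt tokens for a request containing a PDF plus text."""
    return num_pages * TOKENS_PER_PDF_PAGE + text_tokens

# e.g. a 7-page PDF with ~50 tokens of prompt/system text:
print(estimate_pdf_input_tokens(7, 50))  # 1856
```

If an estimate like this matches the API-side number, the difference is probably just how AI Studio's counter displays the PDF, not something the request configuration can change.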