Understand token count

Hi, I’m experiencing a significant discrepancy between the token counts reported by Google AI Studio and by the Gemini API when processing PDF files.
When I process a PDF file in Google AI Studio, I see an input token count of approximately 700-800 tokens (around 2,000 tokens total).
When I process the identical file through the Gemini API, using the exact code generated by Google AI Studio, the input token count is around 1,800-2,000 tokens (roughly 3,100 tokens total).
Is there any optimization I should be applying? This significantly impacts our token usage estimates and costs.
Thanks.

def super_simplified_local_approach():
    import os
    from google import genai
    from google.genai import types

    client = genai.Client(
        api_key=os.environ.get("GEMINI_API_KEY"),
    )

    # Upload the PDF through the Files API so it can be referenced by URI
    files = [
        client.files.upload(file="media_test/5ee44681-df51-47e7-9f70-520118f2d568.pdf"),
    ]

    try:
        model = "gemini-2.0-flash"
        contents = [
            types.Content(
                role="user",
                parts=[
                    types.Part.from_uri(
                        file_uri=files[0].uri,
                        mime_type=files[0].mime_type,
                    ),
                    types.Part.from_text(
                        text="""pdf"""
                    ),
                ],
            ),
        ]
        generate_content_config = types.GenerateContentConfig(
            temperature=1,
            top_p=0.95,
            top_k=40,
            max_output_tokens=8192,
            response_mime_type="text/plain",
            system_instruction=[
                types.Part.from_text(
                    text="""MY_PROMPT"""
                ),
            ],
        )

        response = client.models.generate_content(
            model=model,
            contents=contents,
            config=generate_content_config,
        )
        print(f"Actual token usage: {response.usage_metadata}")
        return

    except Exception as e:
        print(f"Error: {str(e)}")
        raise

Hi @L_T, in the code you provided, the token count is calculated on the response generated by the model, not on the input. To get the input token count, please try client.models.count_tokens:

# Upload the file, then count its tokens without generating anything
files = [client.files.upload(file="/content/test.txt")]

response = client.models.count_tokens(
    model="gemini-2.0-flash",
    contents=files,  # pass the list of uploaded files directly, not nested in another list
)
print(response.total_tokens)

This will give you the token count of the uploaded file. Thank you.
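To get a number directly comparable to prompt_token_count, you can also run count_tokens on the same contents you pass to generate_content. Here is a minimal sketch reusing client, types, and files[0] from the original post; note that, called this way, count_tokens does not include the system instruction, which also counts toward the billed prompt:

# Minimal sketch: count the tokens of the exact request contents.
# Assumes `client`, `types`, and `files[0]` from the original post.
contents = [
    types.Content(
        role="user",
        parts=[
            types.Part.from_uri(
                file_uri=files[0].uri,
                mime_type=files[0].mime_type,
            ),
            types.Part.from_text(text="pdf"),
        ],
    ),
]

count = client.models.count_tokens(
    model="gemini-2.0-flash",
    contents=contents,
)
# Should be close to usage_metadata.prompt_token_count from generate_content,
# minus whatever the system instruction contributes.
print(count.total_tokens)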

Well, with my code, if I use response.usage_metadata.prompt_token_count I get the input token count, which is 1800-2000 (AI Studio shows 700-800), while the output token count is the same for the Gemini API and AI Studio.
If you want to replicate my code, I can send you the file and prompt I’m using.
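For reference, this is how the numbers can be read off the response (a minimal sketch; prompt_token_count, candidates_token_count, and total_token_count are the standard usage_metadata fields returned by generate_content):

# Assumes `response` from the generate_content call in my first post
usage = response.usage_metadata
print(f"Input tokens (prompt_token_count):      {usage.prompt_token_count}")
print(f"Output tokens (candidates_token_count): {usage.candidates_token_count}")
print(f"Total tokens (total_token_count):       {usage.total_token_count}")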

Hi @L_T, if possible, please share those so we can replicate the issue. Thank you.

Yes, in this folder there are both the prompt and the file I’m using: “Modified by moderator”.
I know the prompt is quite lengthy, but for now I’m focusing on the difference between the API and AI Studio.

Thanks.