Upload token count and input token count are not the same

When I upload a PDF file (~33k tokens per count_tokens()), generate_content() fails with an INVALID_ARGUMENT error claiming the input token count exceeds the model limit (~1.2M tokens).
It seems the SDK is serializing or expanding the uploaded file differently between count_tokens() and generate_content().
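For reference, a minimal script along the lines of the one in the traceback might look like this (the model name and log output are from above; the file name, prompt, and exact flow are assumptions):

```python
from google import genai

client = genai.Client()  # reads the API key from the environment

# Upload the PDF via the Files API
uploaded_file = client.files.upload(file="report.pdf")

# count_tokens() reports ~33k tokens for the uploaded file...
count = client.models.count_tokens(
    model="gemini-2.5-flash-lite",
    contents=[uploaded_file],
)
print("Uploaded file tokens:", count)

# ...but generate_content() rejects the same contents with INVALID_ARGUMENT
response = client.models.generate_content(
    model="gemini-2.5-flash-lite",
    contents=[uploaded_file, "Summarize this document."],
)
print(response.text)
```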

2025-10-25 00:09:21,699 - httpx - INFO - HTTP Request: POST https://generativelanguage.googleapis.com/upload/v1beta/files "HTTP/1.1 200 OK"
2025-10-25 00:09:24,932 - httpx - INFO - HTTP Request: POST "HTTP/1.1 200 OK"
2025-10-25 00:09:27,655 - httpx - INFO - HTTP Request: POST "HTTP/1.1 200 OK"
Uploaded file tokens: total_tokens=33541 cached_content_token_count=None
2025-10-25 00:09:27,658 - google_genai.models - INFO - AFC is enabled with max remote calls: 10.
2025-10-25 00:10:03,628 - httpx - INFO - HTTP Request: POST "HTTP/1.1 400 Bad Request"
Traceback (most recent call last):
  File "/Users/anshulkumar/backfin/tet.py", line 22, in <module>
    response = client.models.generate_content(model="gemini-2.5-flash-lite",
                                              contents=[uploaded_files, prompt])
  File "/Users/anshulkumar/backfin/.venv/lib/python3.13/site-packages/google/genai/models.py", line 5202, in generate_content
    response = self._generate_content(
        model=model, contents=contents, config=config
    )
  File "/Users/anshulkumar/backfin/.venv/lib/python3.13/site-packages/google/genai/models.py", line 4178, in _generate_content
    response_dict = self._api_client.request(
        'post', path, request_dict, http_options
    )
  File "/Users/anshulkumar/backfin/.venv/lib/python3.13/site-packages/google/genai/_api_client.py", line 755, in request
    response = self._request(http_request, stream=False)
  File "/Users/anshulkumar/backfin/.venv/lib/python3.13/site-packages/google/genai/_api_client.py", line 684, in _request
    errors.APIError.raise_for_response(response)
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^
  File "/Users/anshulkumar/backfin/.venv/lib/python3.13/site-packages/google/genai/errors.py", line 101, in raise_for_response
    raise ClientError(status_code, response_json, response)
google.genai.errors.ClientError: 400 INVALID_ARGUMENT. {'error': {'code': 400, 'message': 'The input token count exceeds the maximum number of tokens allowed 1237083.', 'status': 'INVALID_ARGUMENT'}}


Hello! Welcome to the forum!

The two APIs handle PDFs differently. generate_content() processes the full text, structure, and layout of the PDF via native vision, which adds a large number of tokens on top of the raw text. count_tokens() doesn't appear to account for that structural overhead, hence the much lower estimate.
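As a back-of-the-envelope check, the docs state that each PDF page costs a fixed number of tokens when processed as a page image (258 tokens per page at standard resolution), so for long documents the vision cost alone can dwarf the text token count:

```python
# Rough estimate of vision-token cost for a PDF, assuming the
# documented rate of 258 tokens per page at standard resolution.
TOKENS_PER_PAGE = 258

def estimated_page_tokens(num_pages: int) -> int:
    """Token cost of rendering each PDF page as an image."""
    return num_pages * TOKENS_PER_PAGE

print(estimated_page_tokens(100))   # 100 pages -> 25,800 tokens
print(estimated_page_tokens(5000))  # 5,000 pages -> ~1.29M tokens, over the limit
```

This is only an illustration of why the two counts can diverge, not the exact accounting the API performs.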

For context, here are the official docs:

Understand and Count Tokens: confirms that count_tokens() gives an estimate, while generate_content() reports actual consumption including all overhead.

Document Understanding guide: details how PDFs are parsed via native vision to extract structure and layout.

Gemini API developer guide: explains how parameters like media_resolution affect token usage for capturing fine details.
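If page-image detail is driving the cost, lowering media_resolution in the request config may reduce the per-page token count. A sketch, assuming the google-genai SDK's types.GenerateContentConfig and types.MediaResolution (lower resolution trades off fine-detail extraction, so verify output quality for your documents):

```python
from google import genai
from google.genai import types

client = genai.Client()

# Request lower media resolution to cut per-page token cost.
config = types.GenerateContentConfig(
    media_resolution=types.MediaResolution.MEDIA_RESOLUTION_LOW,
)

response = client.models.generate_content(
    model="gemini-2.5-flash-lite",
    contents=[uploaded_file, prompt],  # uploaded_file/prompt as in the original script
    config=config,
)
```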

Hope this helps!