Token counts for image processing inside PDF documents

(Context) I am building a RAG system where I ingest PDF documents containing both text and images. My goal is to convert these PDFs into markdown and use Gemini to explain/describe the images embedded within the documents.

(Question) I need clarification on how the Gemini API counts input tokens for these PDFs, specifically regarding the images:

  1. Tokenization Method: When I send a PDF to the API, are the images converted into a base64 text string first and tokenized as characters (which would be huge)? Or are they processed as native image tokens (visual embeddings)?

  2. Quota Limits: I know a single high-resolution image can exceed 1,000,000 characters when base64 encoded. If the API treats this as text, I would instantly hit the token limit. However, the documentation mentions a 3,000-image limit per prompt. Is the token cost for images counted separately from the 1M-token text context window?

Hi @Panteley_Shmelev, welcome to the community!

Apologies for the delayed response.

  1. Gemini models process PDFs using native vision: each page is rendered and interpreted visually, not converted to a base64 text string, so embedded images are never tokenized character by character. Document Processing
  2. Images and PDF pages share the same 1M-token context window as text, but each page or image has a fixed token cost that is independent of its size in bytes. Tokens
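As a rough sketch of how that fixed cost plays out (assuming the documented flat rate of 258 tokens per PDF page; verify the figure for your model in the current docs):

```python
def pdf_token_estimate(num_pages: int, tokens_per_page: int = 258) -> int:
    """Estimate input tokens for a natively processed PDF.

    Assumes a flat per-page cost (258 tokens per the docs),
    independent of the file size in bytes.
    """
    return num_pages * tokens_per_page

# A 100-page PDF costs roughly 25,800 tokens -- nowhere near the 1M
# window, even though its base64 encoding could run to millions of
# characters.
print(pdf_token_estimate(100))
```

So the base64 size of the file is irrelevant to the token count; only the page/image count matters.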

So, if your PDF is large, upload it with the File API, which supports files up to 2 GB. Using File API

You can also use the media_resolution parameter in the generation config to control costs; setting it to LOW reduces the per-image token cost. Media Resolution
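To see why this matters at RAG-ingestion scale, here is a small sketch assuming the documented per-image costs of roughly 256 tokens at medium resolution and 64 at low (these figures vary by model, so check the current docs):

```python
# Assumed per-image token costs by media resolution (verify per model).
MEDIA_RESOLUTION_TOKENS = {"low": 64, "medium": 256}

def image_budget(num_images: int, resolution: str) -> int:
    """Total input tokens spent on images at a given media resolution."""
    return num_images * MEDIA_RESOLUTION_TOKENS[resolution]

# For a document set with 500 embedded images:
#   medium resolution -> 128,000 tokens
#   low resolution    ->  32,000 tokens
print(image_budget(500, "medium"), image_budget(500, "low"))
```

Low resolution is often sufficient when the model only needs to describe an image at a high level, which fits your use case of generating markdown descriptions.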

Thank you!