File API - Return Images embedded in text

robo · April 1, 2025, 12:30pm

I am uploading a file and extracting data from it to be used later.
Currently, images are not returned.
Any idea how I could also extract the images?

Example:
PDF that is a course book on AI.
I want to extract not only the text but also the figures. Is it possible for Gemini to return images too? Maybe through URLs?

How would you go about this challenge?

chunduriv · June 23, 2025, 7:13pm

Hi @robo,

Sorry for the delay in response. The Gemini File API does not support image extraction from PDFs. Instead, you can extract the images programmatically with a Python library such as pypdf, and write your prompt so that the extracted text points to each image by its reference (e.g., Fig 1.7). You can then use those references to get information about the images (e.g., descriptions, object recognition, text extraction from the image).
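
A minimal sketch of the extraction step, assuming pypdf is installed with its image extra (pip install "pypdf[image]", which pulls in Pillow). The function names, the output folder and the file naming scheme are my own choices for illustration, not part of any Gemini or pypdf API. It relies only on pypdf's PdfReader and the page.images property, where each image exposes name and data.

```python
from pathlib import Path


def save_page_images(pages, out_dir):
    """Write every image found on the given pages into out_dir.

    `pages` is any iterable of objects exposing `.images`, where each
    image has `.name` and `.data` (pypdf page objects behave this way).
    Returns a list of (page_number, saved_path) tuples.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    saved = []
    for page_number, page in enumerate(pages, start=1):
        for index, image in enumerate(page.images, start=1):
            # Keep only the final path component of the embedded name and
            # prefix it with page and index so names stay unique.
            safe_name = Path(image.name).name
            target = out / f"page{page_number:03d}_img{index:02d}_{safe_name}"
            target.write_bytes(image.data)
            saved.append((page_number, target))
    return saved


def extract_pdf_images(pdf_path, out_dir="extracted_images"):
    """Open a PDF with pypdf and save all embedded images to out_dir."""
    # Imported lazily so save_page_images can be reused without pypdf.
    from pypdf import PdfReader

    reader = PdfReader(pdf_path)
    return save_page_images(reader.pages, out_dir)
```

Calling extract_pdf_images("course_book.pdf") (a hypothetical file name) returns the page number and path of each saved image. The page numbers can help you match a saved file to a reference such as Fig 1.7 in the text Gemini returns, but that matching is left to you. Note that this only finds embedded raster images; figures drawn as vector graphics will not appear in page.images.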

Thank you!
