Would the Gemini API through OpenAI SDK support file URI such as images, audio, video, and nontextual PDFs?

zavocc306 · November 13, 2024, 12:54am

I’ve noticed I can just use the OpenAI SDK to use Gemini 1.5 models and it’s an easy to use but what I’ve seen in the documentation is the multimodal support is limited, will the OAI sdk version of Gemini API supports these modalities such as files from public URLs or base64 input maybe in the future if not implemented today?

Jie_Zhou · March 27, 2025, 9:55am

I also really need this image_url function, but Gemini OpenAI compatible API now only supports passing the base64 image data, which is really not cool!

jkirstaetter · March 27, 2025, 10:08am

Hi,

Not at the moment, as explored and answered here: Combining OpenAI-Compatible Gemini Completions with File Uploads - #4 by Will_Powell

The type is not accepted by the Gemini API yet.

Cheers.

jkirstaetter · March 27, 2025, 10:10am

Hi @Jie_Zhou

HOW do you pass in the base64 encoded file into the Chat completions?
I’m genuinely interested to know more about that. Both - base64 and URI - would require the content type file which the OpenAI comp endpoints of the Gemini API do not accept.

What am I missing?
What’s your magic sorcery?

Cheers

Topic		Replies	Views
Combining OpenAI-Compatible Gemini Completions with File Uploads Gemini API models , openai_compatibility	2	118	March 27, 2025
Document/Files understading in Gemini with OpenAI SDK Gemini API learning , documentation	1	39	April 23, 2025
File API support in openai compatible mode Gemini API openai_compatibility	2	40	April 3, 2025
OpenAI compatibility for pdf file Gemini API api , openai_compatibility	4	140	April 10, 2025
500 error when including a file Gemini API api , model	12	207	September 17, 2024

Would the Gemini API through OpenAI SDK support file URI such as images, audio, video, and nontextual PDFs?

Related topics