OpenAI compatibility + multimodal?

I want to adapt my current REST requests for multimodal Gemini models (e.g. gemini-2.0-flash-exp-image-generation) to the OpenAI compatibility endpoint, but then of course I can no longer pass the “responseModalities” parameter (it is not supported in the OpenAI library), which I need in order to make the responses multimodal (in my case, text and/or image).
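For context, my current native REST call looks roughly like this (the API key and prompt are placeholders):

```python
import requests

API_KEY = "YOUR_GEMINI_API_KEY"
URL = (
    "https://generativelanguage.googleapis.com/v1beta/models/"
    "gemini-2.0-flash-exp-image-generation:generateContent"
)

payload = {
    "contents": [{"parts": [{"text": "Draw a red bicycle and describe it."}]}],
    "generationConfig": {
        # This is the field I cannot express through the OpenAI library.
        "responseModalities": ["TEXT", "IMAGE"]
    },
}

resp = requests.post(URL, params={"key": API_KEY}, json=payload, timeout=60)
resp.raise_for_status()
print(resp.json())
```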

Is there any workaround I am missing, or are there plans to integrate native Gemini API parameters into the OpenAI compatibility layer in the future?

Thanks

Hi @Javier_De_Pedro_Lope,

Welcome to the forum!

The Gemini API provides OpenAI REST library compatibility. Please refer to OpenAI compatibility | Gemini API | Google AI for Developers.

Thank you!

Hi @Javier_De_Pedro_Lope,

You are right. The OpenAI compatibility endpoint currently does not support the responseModalities parameter.

Here are some workarounds:

- Make separate API calls: one for text responses via the OpenAI-compatible endpoint and another to the native Gemini API for image generation. This lets you handle the multimodal outputs separately (see the sketch after this list).
- Alternatively, build a custom integration that mimics the responseModalities behaviour by processing and combining the text and image outputs from those distinct API calls.
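To make the first option concrete, here is a minimal sketch assuming the openai and google-genai Python packages; the model names follow your example, while the prompts, output file name, and combination logic are just placeholders:

```python
from openai import OpenAI
from google import genai
from google.genai import types

API_KEY = "YOUR_GEMINI_API_KEY"

# 1) Text via the OpenAI-compatible endpoint (no responseModalities here).
oai = OpenAI(
    api_key=API_KEY,
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/",
)
text_resp = oai.chat.completions.create(
    model="gemini-2.0-flash",
    messages=[{"role": "user", "content": "Describe a red bicycle in one sentence."}],
)
description = text_resp.choices[0].message.content

# 2) Image via the native Gemini API, where responseModalities is supported.
gclient = genai.Client(api_key=API_KEY)
image_resp = gclient.models.generate_content(
    model="gemini-2.0-flash-exp-image-generation",
    contents=f"Generate an image of: {description}",
    config=types.GenerateContentConfig(response_modalities=["TEXT", "IMAGE"]),
)

# Combine the two results yourself: save inline image bytes, print any text parts.
for part in image_resp.candidates[0].content.parts:
    if part.inline_data is not None:
        with open("bicycle.png", "wb") as f:
            f.write(part.inline_data.data)
    elif part.text is not None:
        print(part.text)
```

The trade-off is two round trips and stitching the outputs together yourself, but it keeps responseModalities available for the image step.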

Please monitor updates from the Gemini API team, as multimodal support may be added to future versions of the OpenAI compatibility endpoint. I will also escalate this as a feature request to the relevant team.

In the meantime, please keep an eye on the release notes and the OpenAI compatibility doc mentioned by @Mrinal_Ghosh for future updates.