Gemini Flash Multimodal release

I’m planning a pilot — could you give me an idea of when multimodal will be released? Not an exact date, just roughly: weeks? Months? Half a year?
Many thanks!

Hi @IKGN

Not sure exactly what you’re asking for, but multimodal output is already available, at least in the Gemini 2.0 Flash models.

See here: Generate images  |  Gemini API  |  Google AI for Developers

Cheers.

Text-and-image generation output has been released for experimental use. Try it out via Google AI Studio: just choose Gemini 2.0 Flash Experimental.

Sorry, I meant the Multimodal Live API, i.e. voice-to-voice. Multimodal Live API  |  Gemini API  |  Google AI for Developers That’s still experimental and restrictions apply (e.g. 3 concurrent sessions per API key). I’ve been trying it out for 3 months already and would like to scale up a bit.