Can Gemini Analyze Voice and Video?
Welcome to the forum.
Yes, the 1.5 models are multimodal. To help you get started, there are examples to try out here: Prompting with media files | Gemini API | Google for Developers (other languages besides Python are also available).
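For a rough idea of what those examples look like, here is a minimal Python sketch. It assumes the google-generativeai package is installed and that you have an API key. The helper name ask_about_media and the default model name gemini-1.5-flash are my own choices for illustration; treat the linked docs as the authoritative version.

```python
import time
from pathlib import Path


def ask_about_media(path, prompt, api_key, model_name="gemini-1.5-flash"):
    """Upload an audio/video/image file and ask a 1.5 model about it."""
    media = Path(path)
    if not media.is_file():
        # Fail early, before any network call is attempted.
        raise FileNotFoundError(f"No such media file: {media}")

    # Imported here so the check above works even without the SDK installed.
    # Assumption: the google-generativeai package (pip install google-generativeai).
    import google.generativeai as genai

    genai.configure(api_key=api_key)

    # Upload through the File API.
    uploaded = genai.upload_file(path=str(media))

    # Larger files such as videos may still be processing right after upload,
    # so poll until the file leaves the PROCESSING state.
    while uploaded.state.name == "PROCESSING":
        time.sleep(2)
        uploaded = genai.get_file(uploaded.name)

    model = genai.GenerativeModel(model_name)
    response = model.generate_content([prompt, uploaded])
    return response.text
```

You would call it with something like ask_about_media("interview.mp3", "Summarize this recording.", api_key="YOUR_KEY"), where the file name and prompt are placeholders.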
Yes! Thank you.
I have another question: if the file is a PDF, PPTX, DOCX, or another file type, how can I upload it?
The upload_files function does not seem to support DOCX, PPTX, PDF, or other such types; it appears to work only for videos, images, and audio.
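To illustrate the limitation I mean, here is a small Python sketch of the pre-upload check it implies, using only the standard library. The helper name is_media_file is hypothetical, and the video/image/audio rule is just my reading of the behavior, not a documented list of supported types.

```python
import mimetypes

# Assumption from this post: only these MIME families are accepted for upload.
MEDIA_PREFIXES = ("video/", "image/", "audio/")


def is_media_file(path: str) -> bool:
    """Return True if the file extension maps to a video, image, or audio type."""
    mime_type, _ = mimetypes.guess_type(path)
    # guess_type returns None for unknown extensions; treat those as non-media.
    return mime_type is not None and mime_type.startswith(MEDIA_PREFIXES)


if __name__ == "__main__":
    for name in ("clip.mp4", "photo.png", "voice.mp3", "report.pdf", "notes.docx"):
        print(name, is_media_file(name))
```

Files for which this returns False (PDF, DOCX, PPTX) are the ones I am asking about.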