Analyzing an MP4 fails in Deep Research

So I made an app that generates lyrics to feed to Suno. Suno then turns them into a song, delivered as MP3 and MP4. This topic is purely about Gemini’s ability to analyze an MP4 file!

So, I used the “Deep Research” option in Gemini to analyze the lyrics I’ve generated for this song. And POOF, I get a nice report detailing the strong and weak points in it. Nice!

But I also want “Deep Research” to analyze the sound file (MP4), and it tells me it cannot do this. Seriously? Anyway, this is just a minor annoyance…

However, if I use Gemini 2.5 Flash without “Deep Research” and give it just the MP4 file, I receive a nice analysis, including the lyrics. So Gemini can analyze MP4 files! It even gives an opinion about the song, such as ‘The song “Wagenburg Elegie” is a highly effective and compelling piece of dramatic storytelling and music.’ It also noticed that the song was created by Suno, an AI music engine. That is interesting, because I did not give Gemini the lyrics themselves, so it must have extracted them from the MP4 in some way.


But if Gemini can analyze an MP4 file, then why can’t “Deep Research” do the same? It’s the same AI engine, isn’t it?

Hello,

Thank you for your question. To clarify, Deep Research functions as a research assistant that creates and executes workflows, rather than solely as a large language model. It performs agentic planning by generating a multi-point research plan from your prompt, which you can then review and edit. In addition to using Google Search, it can securely connect to and draw context from your Google Drive (Docs, Sheets, Slides, PDFs), Gmail, and Google Chat.

Regarding the behavior you are observing, while Gemini Flash can analyze video files (.mp4), it is possible that the workflow generated by Deep Research does not allow the underlying Gemini API to access the video, depending on the specific prompt and content.

To help us analyze the issue further, it would be very helpful if you could provide the exact prompts, content, and configurations used.
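
As a rough sketch of what such a report could look like, the requested details (prompt, content, configuration) could be gathered into one block of text before posting. The helper name `build_repro_report` and its field names are hypothetical, chosen only for illustration; they are not part of any Gemini tooling.

```python
import json
import mimetypes
from pathlib import Path


def build_repro_report(prompt: str, media_path: str, model: str, mode: str) -> str:
    """Collect prompt, file details, and configuration into one JSON string.

    All field names are illustrative assumptions, not an official format.
    """
    path = Path(media_path)
    # Guess the MIME type from the file extension only; the file is not opened.
    mime_type, _ = mimetypes.guess_type(path.name)
    report = {
        "prompt": prompt,
        "mode": mode,    # e.g. "Deep Research" or a plain chat
        "model": model,  # e.g. "Gemini 2.5 Flash"
        "file_name": path.name,
        "mime_type": mime_type or "unknown",
        # Size is reported only if the file exists locally.
        "file_size_bytes": path.stat().st_size if path.exists() else None,
    }
    return json.dumps(report, indent=2, ensure_ascii=False)
```

Calling `print(build_repro_report("Analyze this song", "song.mp4", "Gemini 2.5 Flash", "Deep Research"))` would produce text that can be pasted into a reply as-is.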
