How to Get Gemini to Reference File Name

Hi there,

I am testing Gemini 1.5 Flash 002. I asked it to summarize several videos, and I am struggling to get it to reference the files by their file names.

I have included the following in my prompt:

Note: Consistently use the original video file names provided to you (e.g., tutorial_1.mp4, tutorial_2.mp4) in all references.

However, Gemini is not using the original file names; instead, it is generating new names.

I would really appreciate some help.

Thanks!

Update: Also, it seems that it is only analyzing a single video instead of 3!

Welcome to the forum. By the time the information you provide in the prompt arrives at the model (any LLM, not just Gemini), it has been converted to a linear sequence of tokens. The model doesn’t see a file boundary.

You can inform the model which parts are which file by saying the file name (as a string) in the prompt just before the file itself. Simply put, your prompt has verbiage like this:
The following image is labeled file myfirstfile.jpg:
(Insert Part that contains the bytes of file myfirstfile.jpg)
The following image is labeled file mysecondfile.jpg:
(Insert Part that contains the bytes of file mysecondfile.jpg)
Continue your prompt, probably asking questions pertaining to the images in the files

To support this structure, the API has the Part type; see "Caching | Gemini API | Google AI for Developers".
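A minimal Python sketch of the interleaving described above, assuming you are calling the API rather than using AI Studio. `build_labeled_contents` is a hypothetical helper (not part of any SDK), and the media values here are placeholder strings so the sketch runs without network access. The comments note how this would likely plug into the `google-generativeai` package, which is an assumption about your setup.

```python
from typing import Any, Iterable, List, Tuple


def build_labeled_contents(
    labeled_media: Iterable[Tuple[str, Any]], question: str
) -> List[Any]:
    """Interleave a text label before each media part, then append the question.

    Hypothetical helper for illustration; not an SDK function.
    """
    contents: List[Any] = []
    for file_name, media_part in labeled_media:
        # The label is plain text, so the model sees the name right before the media.
        contents.append(f"The following video is labeled file {file_name}:")
        contents.append(media_part)
    contents.append(question)
    return contents


# Placeholder strings stand in for real media parts so the sketch runs offline.
# With the google-generativeai package, each placeholder would instead be the
# object returned by genai.upload_file(path=...), and the finished list would
# be passed to model.generate_content(contents).
contents = build_labeled_contents(
    [("tutorial_1.mp4", "<media 1>"), ("tutorial_2.mp4", "<media 2>")],
    "Summarize each video, referring to it by its file name.",
)
for item in contents:
    print(item)
```

Because every video gets its own label immediately before it, this layout also makes it easier to notice when one of the files was never attached to the request.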

Hope that helps.

Hi, @OrangiaNebula, thank you very much for taking the time to answer my question. I don’t think your suggestion would work in Google AI Studio, because the media and prompt are separated.

I also don’t understand what you mean by “Insert Part that contains the bytes of file myfirstfile.jpg”.

I am going to do some more research and see if your suggestion makes more sense when making requests via the API instead of Google AI Studio.

Thanks again!

The suggestion is applicable when using the API. AI Studio doesn’t let you compose a “sandwich” of text prompt part, media part, text prompt part, media part, … in a single prompt; the UI makes all the text go to the same spot.

That really just means the Part objects are listed one after the other in a list (for purists, an Iterable[protos.Part]).
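To make "a Part that contains the bytes" concrete, here is a rough sketch for small local files. `inline_part` is a hypothetical helper. The dict shape with `mime_type` and `data` keys is the inline form I understand the `google-generativeai` Python package to accept in `generate_content`; treat that as an assumption and check it against the SDK version you use. Larger videos would normally go through the file upload route instead of inline bytes.

```python
import mimetypes
import tempfile
from pathlib import Path
from typing import Dict, Union


def inline_part(path: Path) -> Dict[str, Union[str, bytes]]:
    """Return one inline media part: the file's MIME type plus its raw bytes.

    Hypothetical helper for illustration; not an SDK function.
    """
    mime_type, _ = mimetypes.guess_type(path.name)
    if mime_type is None:
        raise ValueError(f"Cannot guess MIME type for {path.name}")
    return {"mime_type": mime_type, "data": path.read_bytes()}


# Demo with a throwaway file so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = Path(tmp_dir) / "myfirstfile.jpg"
    demo_path.write_bytes(b"\xff\xd8 not a real image")

    # The flat list: text part, media part, text part.
    parts = [
        f"The following image is labeled file {demo_path.name}:",
        inline_part(demo_path),
        "Describe the image and refer to it by its file name.",
    ]
    print(parts[0])
    print(parts[1]["mime_type"], len(parts[1]["data"]), "bytes")
```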

To get Gemini (or similar AI models) to reference a specific file name:

  1. Provide Context: Explicitly mention the file name in your input to the AI.
  2. Use Prompts: Guide the AI to reference the file by framing prompts like, “Refer to the file named ‘example.txt’.”
  3. Structured Input: Use structured input formats that clearly specify filenames and their content.
  4. API Parameters: If using an API, check if it allows specifying file references directly.
  5. Post-processing: After AI output, use a script to append or validate file names for consistency.
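The post-processing step could look something like the sketch below: scan the model's output for video-like file names and compare them with the names you actually supplied. `check_file_references` and the extension list in the pattern are assumptions chosen for illustration; adjust them to your own files.

```python
import re
from typing import Dict, Iterable, List

# Matches tokens that look like video file names, e.g. tutorial_1.mp4.
# The extension list is an assumption; extend it as needed.
FILENAME_PATTERN = re.compile(
    r"\b[\w.-]+\.(?:mp4|mov|avi|mkv|webm)\b", re.IGNORECASE
)


def check_file_references(
    output_text: str, expected_names: Iterable[str]
) -> Dict[str, List[str]]:
    """Report expected names the output never mentions and names it invented."""
    expected = set(expected_names)
    mentioned = set(FILENAME_PATTERN.findall(output_text))
    return {
        "missing": sorted(expected - mentioned),
        "unexpected": sorted(mentioned - expected),
    }


sample_output = (
    "tutorial_1.mp4 covers installation. "
    "intro_video.mp4 walks through the settings page."
)
report = check_file_references(
    sample_output, ["tutorial_1.mp4", "tutorial_2.mp4"]
)
print(report)
```

A non-empty "missing" list is also a quick signal that the model only analyzed some of the videos, and an "unexpected" entry shows where it made up a name.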