How to Get Gemini to Reference File Name

thealchemist · October 18, 2024, 4:50am

Hi there,

I am testing Gemini 1.5 Flash 002. I have asked it to summarize several videos, and I am struggling to get it to reference the files by their file names.

I have included the following in my prompt:

Note: Consistently use the original video file names provided to you (e.g., tutorial_1.mp4, tutorial_2.mp4) in all references.

However, Gemini is not using the original file names. Instead, it is generating new names.

I would really appreciate some help.

Thanks!

Update: Also, it seems that it is only analyzing a single video instead of 3!

OrangiaNebula · October 18, 2024, 8:33am

Welcome to the forum. By the time the information you provide in the prompt arrives at the model (any LLM, not just Gemini), it has been converted to a linear sequence of tokens. The model doesn’t see a file boundary.

You can tell the model which parts belong to which file by stating the file name (as a string) in the prompt just before the file itself. Simply put, your prompt is structured like this:
The following image is labeled file myfirstfile.jpg:
(Insert Part that contains the bytes of file myfirstfile.jpg)
The following image is labeled file mysecondfile.jpg:
(Insert Part that contains the bytes of file mysecondfile.jpg)
Continue your prompt, probably asking questions pertaining to the images in the files

To support this structure, the API has Part (see “Caching | Gemini API | Google AI for Developers”).
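As a rough illustration of the label-then-file structure described above, here is a minimal Python sketch. The helper name build_labeled_contents is hypothetical, and plain placeholder strings stand in for the real media parts so the snippet runs without any SDK. The commented usage at the end assumes the google-generativeai Python package and the model id "gemini-1.5-flash-002"; treat both as assumptions to check against the current documentation.

```python
from typing import Any, Iterable, List, Tuple


def build_labeled_contents(
    files: Iterable[Tuple[str, Any]], question: str
) -> List[Any]:
    """Interleave a text label before each media part, then append the question.

    `files` holds (file_name, media_part) pairs. The media part is whatever
    object the SDK accepts for a file (for example an uploaded-file handle);
    this hypothetical helper never inspects it.
    """
    contents: List[Any] = []
    for name, media_part in files:
        # The label is ordinary text, so the model sees the name right before the file.
        contents.append(f"The following video is labeled file {name}:")
        contents.append(media_part)
    contents.append(question)
    return contents


# Placeholder strings stand in for real media parts so the sketch runs offline.
demo = build_labeled_contents(
    [("tutorial_1.mp4", "<media part 1>"), ("tutorial_2.mp4", "<media part 2>")],
    "Summarize each video, referring to it by its file name.",
)
for item in demo:
    print(item)

# Assumed usage with the google-generativeai SDK (not executed here):
#   import google.generativeai as genai
#   model = genai.GenerativeModel("gemini-1.5-flash-002")
#   uploaded = [(p, genai.upload_file(p)) for p in ["tutorial_1.mp4", "tutorial_2.mp4"]]
#   response = model.generate_content(build_labeled_contents(uploaded, "..."))
```

Uploaded videos may need time to finish processing before they can be used in a request, so a real script would likely wait for that before calling the model.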

Hope that helps.

thealchemist · October 18, 2024, 3:06pm

Hi @OrangiaNebula, thank you very much for taking the time to answer my question. I don’t think your suggestion would work in Google AI Studio, because the media and the prompt are separated.

I also don’t understand what you mean by “Insert Part that contains the bytes of file myfirstfile.jpg”.

I am going to do some more research and see if your suggestion makes more sense when making requests via the API instead of Google AI Studio.

Thanks again!

OrangiaNebula · October 18, 2024, 5:07pm

The suggestion is applicable when using the API. AI Studio doesn’t let you compose a “sandwich” of text prompt part, media part, text prompt part, media part, … in a single prompt; the UI makes all the text go to the same spot.

“Insert Part” really just means the Parts are listed one after the other in a list (for purists, an Iterable[protos.Part]).

TG_Link_Hub · October 21, 2024, 3:15am

To get Gemini (Google’s model, or similar AI models) to reference a specific file name:

Provide Context: Explicitly mention the file name in your input to the AI.
Use Prompts: Guide the AI to reference the file by framing prompts like, “Refer to the file named ‘example.txt’.”
Structured Input: Use structured input formats that clearly specify filenames and their content.
API Parameters: If using an API, check if it allows specifying file references directly.
Post-processing: After AI output, use a script to append or validate file names for consistency.
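The post-processing idea in the list above could look something like the following sketch. The function name check_file_names and the sample text are hypothetical, and the regular expression only looks for names ending in .mp4, which is an assumption based on the file names mentioned earlier in this thread.

```python
import re
from typing import Dict, List


def check_file_names(output: str, expected: List[str]) -> Dict[str, List[str]]:
    """Compare .mp4 names mentioned in the model output with the expected names."""
    expected_set = set(expected)
    # Collect anything in the text that looks like "some_name.mp4".
    found = set(re.findall(r"[\w.-]+\.mp4\b", output))
    return {
        "missing": sorted(expected_set - found),     # expected but never mentioned
        "unexpected": sorted(found - expected_set),  # mentioned but not provided
    }


# Hypothetical model output, used only to exercise the check.
sample_output = "tutorial_1.mp4 covers setup. intro_video.mp4 covers the basics."
report = check_file_names(sample_output, ["tutorial_1.mp4", "tutorial_2.mp4"])
print(report)
```

A non-empty "missing" list would also flag the case from the original question where only one of several videos appears to have been analyzed.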
