Can I know which documents were referenced in the "Prompt with multiple documents"?

Hi everyone.

Is it possible to know which files were referenced when getting a response from the code at the above URL?

You can try crafting your prompt to ask for a reference to each document, but there is no guarantee. Some possible things to keep in mind:

  • By the time everything gets to the LLM, the documents are already tokens mixed in with the rest of the prompt. So saying “tell me which image the answers are from” won’t be very useful.
  • Instead, you may need to add text prompting before each image or document.
    • Say something like “The document that follows this has reference ID Document A:” and then include that document.
    • With this, your prompt may need something like: “Identify each document by its reference ID. Any information you provide should include the reference ID of where you got that answer.”
  • The code shown there passes text followed by a series of image parts. You will need to mix text parts in between the image parts when you do this.
  • Prompting is still an art, not a science. Experiment.
  • Finally, you may wish to look into RAG solutions, many of which are better at providing more concrete referenced results.
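To make the interleaving idea above concrete, here is a minimal sketch that builds such a labeled request. The helper name and the document placeholders are hypothetical; in a real script each placeholder would be an uploaded file handle or PIL image, and the finished list would be passed to the model (e.g. something like `model.generate_content(request)` in the Python SDK).

```python
def build_labeled_request(question, documents):
    """Interleave a reference-ID text part before each document part.

    `documents` is a list of (reference_id, part) pairs; each part would
    be an uploaded file handle or PIL image in a real request.
    """
    parts = [
        "Identify each document by its reference ID. Any information "
        "you provide should include the reference ID of where you got "
        "that answer."
    ]
    for ref_id, doc in documents:
        # Text label first, then the document it describes.
        parts.append(f"The document that follows has reference ID {ref_id}:")
        parts.append(doc)
    parts.append(question)
    return parts


# Placeholder strings stand in for real uploaded files or images.
request = build_labeled_request(
    "What is the training budget?",
    [("Document A", "<file part A>"), ("Document B", "<file part B>")],
)
# In a real call: response = model.generate_content(request)
```

The point of the structure is that every document arrives at the model immediately after a text part naming it, so the model has language it can quote back when you ask where an answer came from.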

The link given just shows the method of passing multiple documents.

Immediately before, we see the Pillow imaging library being used to load objects from PDF files:

sample_file_2 = PIL.Image.open('example-1.pdf')
sample_file_3 = PIL.Image.open('example-2.pdf')

However, the example code says “and three documents previously uploaded”.

The example for sample_file_1 is the file name of an arXiv paper by Google that has been uploaded to storage so the file link can be provided. But it is merely meant as an example. You can upload multiple files, so that sample_file_2 and beyond refer to your own files that you wish to query (each consuming an indeterminate number of context-window tokens).

There is no specific file being shown. You can upload your own documents about raising pigeons, then include them in the request as a list along with the prompt string, as shown. Whatever parts of them can be understood will be loaded into the AI model’s context window so the prompt can ask a question about them.
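For instance, such a request is just a Python list mixing document parts with the prompt text. The placeholders below stand in for the handles a real upload would return (the upload call itself, and the final `generate_content` call, are assumed and shown only in comments):

```python
# Placeholders for objects returned by uploading your own PDFs
# (e.g. via the SDK's file-upload API) about raising pigeons.
pigeon_doc_1 = "<uploaded pigeon-care.pdf>"
pigeon_doc_2 = "<uploaded pigeon-breeding.pdf>"

prompt = "According to these documents, how often should squabs be fed?"

# The request is simply a list of document parts plus the prompt string.
request = [pigeon_doc_1, pigeon_doc_2, prompt]
# In a real call: response = model.generate_content(request)
```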


As far as your question about “referenced” being “where did the answer come from”…

If the AI is not passed the file name, it cannot answer questions about the file name. One technique (that is more useful for images) is to alternate each image with a text part that says “next image file name xxx”, followed by the binary.

More deeply: the AI still cannot accurately answer “which document” after it has produced the text of an answer, even within the same response, because a transformer language model has no internal memory or hidden thoughts. It only generates language one token at a time, so answering “also tell me which document provided the answer” comes from the AI re-examining the answer it just produced against the documents again.