Invalid Responses with gemini-1.5-flash-002 in Document Classification

I am using the gemini-1.5-flash-002 model for document classification tasks. The workflow involves providing a nested list and a document, with the prompt explicitly asking the model to pick an item only from the list.

However, I am occasionally receiving responses that are outside the provided list. These invalid responses seem to be items from previous requests, which indicates the possibility of context from prior interactions influencing the current output.

Problem:

  • There doesn’t seem to be an option to completely reset the context to ensure responses are strictly based on the current input.
  • I cannot use context caching as my requests do not meet the minimum token requirement.

Feature Request or Query:

  • Is there a reset flag or similar feature to ensure the model doesn’t carry over context from previous interactions?

Steps to Reproduce:

  1. Use gemini-1.5-flash-002 for document classification with a nested list and a document as input.
  2. Ask the model to select an item only from the nested list.
  3. Observe occasional invalid responses referencing items from prior requests.

Expected Behavior:
The model should only return items from the provided list in the current request.

Actual Behavior:
Responses sometimes include items that were present in the nested lists of previous requests but not the current one.

Environment:

  • Model: gemini-1.5-flash-002
  • Use Case: Document Classification

Let me know if further clarification or example inputs/outputs are needed.


Hi @Nimish_Gupta ,

To constrain model output to a certain category or class, you can use something like Enum. Here is a sample colab gist link for document classification using Enum.

You can also refer to the Enum quickstart tutorial here.
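As a rough sketch of the enum approach (the category names here are hypothetical placeholders for your own label list, and the commented-out API call assumes the `google-generativeai` Python SDK with a configured API key):

```python
# Constrain classification output to a fixed set of labels with an enum.
import enum

class DocCategory(enum.Enum):
    # Hypothetical labels; replace with your own nested-list items.
    INVOICE = "invoice"
    CONTRACT = "contract"
    RECEIPT = "receipt"

# With the SDK installed, the call would look roughly like this
# (untested sketch):
#
# import google.generativeai as genai
# model = genai.GenerativeModel("gemini-1.5-flash-002")
# response = model.generate_content(
#     ["Classify this document.", document_text],
#     generation_config=genai.GenerationConfig(
#         response_mime_type="text/x.enum",
#         response_schema=DocCategory,
#     ),
# )
# The response text is then one of the enum values, e.g. "invoice".
```

Because decoding is constrained to the schema, the model cannot emit a label outside the enum.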

The API is stateless. There is no memory from request to request. Therefore, it would make absolutely no sense to have a reset() operation to clear out state; there is no state to clear.

The most plausible explanation for the behavior you are describing is that your nested lists share some common items that the model tends to output regardless of the current input, i.e., you are observing common hallucinations. In any case, using an enum to constrain the choices should work.

Hey, I tried using an enum to restrict the values, but started seeing a 400 error. It looks like there is a limit on the number of enum values allowed for a field. There is already an issue for this.

In the prompt, I have explicitly instructed the model not to give any output outside of the predefined list, but I am still seeing some hallucinations.
I have also turned on code execution to check whether any returned values fall outside the list, but that code is also returning incorrect results.
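Rather than relying on model-side code execution, a client-side check of each answer against the flattened list could look like this (a minimal sketch; the function names and sample categories are illustrative):

```python
# Validate a model answer locally against the nested category list.

def flatten(nested):
    """Yield every leaf item from an arbitrarily nested list."""
    for item in nested:
        if isinstance(item, list):
            yield from flatten(item)
        else:
            yield item

def validate(answer, nested_list):
    """Return True if the answer matches an item in the nested list
    (case-insensitive, whitespace-trimmed)."""
    allowed = {item.strip().lower() for item in flatten(nested_list)}
    return answer.strip().lower() in allowed

# Illustrative usage:
categories = [["invoice", "receipt"], ["contract"]]
validate("Invoice", categories)  # matches, so True
validate("memo", categories)     # not in the list, so False
```

A failed check can then trigger a retry or a fallback label instead of propagating a hallucinated value.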

Telling a large language model what not to do is called negative prompting. There is extensive research showing that it doesn’t work well. Counterintuitively, telling a model what not to do can result in the model doing more of it.

Positive prompting works much better in general. For example, include several sample inputs together with their expected outputs in the prompt. That guides the model toward handling the final input (the one you actually want the result for) the same way as the ones before it. Hopefully that will work better in your use case.
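A few-shot prompt along those lines could be assembled like this (a sketch; the snippets, labels, and helper name are invented for illustration):

```python
# Build a positive (few-shot) classification prompt: several worked
# examples first, then the real document, ending where the model should
# continue with its answer.

def build_fewshot_prompt(examples, categories, query_text):
    lines = [f"Pick exactly one category from: {', '.join(categories)}.", ""]
    for snippet, label in examples:
        lines.append(f"Document: {snippet}")
        lines.append(f"Category: {label}")
        lines.append("")
    # The final, unanswered entry is the one we want classified.
    lines.append(f"Document: {query_text}")
    lines.append("Category:")
    return "\n".join(lines)

prompt = build_fewshot_prompt(
    examples=[
        ("Total due: $120. Pay by 01/31.", "invoice"),
        ("This agreement is entered into by...", "contract"),
    ],
    categories=["invoice", "contract", "receipt"],
    query_text="Thank you for your purchase. Amount tendered: $15.00.",
)
```

The prompt ends mid-pattern at "Category:", so the model's most natural continuation is a label in the same format as the examples.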

In any case, it’s impossible for a stateless protocol to remember items from a previous request, so that part remains a puzzle.

Thanks for this, I didn’t know it’s called negative prompting. I will try to change the prompt.

I am doing this classification on a bunch of documents (PDFs), which is why I can’t give examples upfront.

Yeah, I’m still figuring out why it is giving me results from the old request’s list instead of the current one.