I am evaluating the Gemini API for the following task: given a number of text excerpts, I would like the API to analyze each excerpt and return a response containing exactly one result per excerpt. For example, if I supply 100 texts (“Text1”, “Text2”, …, “Text100”), I want back a list of 100 results, one for each text.
I supply my instructions in the system_instruction, and the model seems in general to understand them, in that it produces correct results for some of the text excerpts. However, it also produces the wrong number of results: if I supply 100 texts, I may get back fewer or sometimes more(!) than 100 results. The model seems to work better when I limit my input to 10 texts.
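To make the count problem concrete, here is a minimal sketch of the batching and count check described above. It assumes the model returns a JSON array with one element per excerpt. The name analyze_batch is a hypothetical placeholder for whatever function wraps the actual Gemini call; it is not part of the SDK.

```python
import json
from typing import Any, Callable, List


def chunk(texts: List[str], size: int = 10) -> List[List[str]]:
    """Split the excerpts into batches of at most `size` items."""
    return [texts[i:i + size] for i in range(0, len(texts), size)]


def analyze_all(
    texts: List[str],
    analyze_batch: Callable[[List[str]], str],  # hypothetical wrapper around the model call
    size: int = 10,
) -> List[Any]:
    """Analyze the excerpts batch by batch and verify one result per excerpt."""
    results: List[Any] = []
    for batch in chunk(texts, size):
        # Assumption: the wrapper returns the raw JSON text of the response,
        # and that JSON is an array of results.
        parsed = json.loads(analyze_batch(batch))
        if len(parsed) != len(batch):
            raise ValueError(
                f"expected {len(batch)} results, got {len(parsed)}"
            )
        results.extend(parsed)
    return results
```

With a batch size of 10, a mismatch is detected per batch, so only the failing batch would need to be retried rather than all 100 texts.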
My text excerpts are generally small (sentences) and the results are expected in the application/json format and fit within the max_output_tokens limit (8192). The model that I am currently evaluating is gemini-2.0-flash. I am using it via the Python SDK (google-genai==1.11.0).
I do not know if the model simply loses count or if I am misusing it in some way. I have tried to supply the text excerpts as individual parts and also as a single text with an ad-hoc separator (e.g. the text ---BOUNDARY---) between excerpts.
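For reference, the two input layouts I tried can be sketched roughly as follows. The helper names (as_parts, join_excerpts, split_excerpts) are illustrative only, and how each value is handed to the SDK is omitted here; the sketch only shows how the input is shaped.

```python
from typing import List

BOUNDARY = "---BOUNDARY---"  # ad-hoc separator placed between excerpts


def as_parts(texts: List[str]) -> List[str]:
    """Variant 1: keep every excerpt as its own item (one part per excerpt)."""
    return list(texts)


def join_excerpts(texts: List[str]) -> str:
    """Variant 2: one single text, with the separator on its own line between excerpts."""
    return f"\n{BOUNDARY}\n".join(texts)


def split_excerpts(payload: str) -> List[str]:
    """Inverse of join_excerpts, useful for checking how many excerpts were sent."""
    return [piece.strip() for piece in payload.split(BOUNDARY)]
```

Splitting the joined text again gives the number of excerpts that were actually sent, which can then be compared against the number of results that come back.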
Any guidance greatly appreciated.