Could Google share information on how the Gemini model interprets system instructions?
There seems to be some rule at work, and I can't find anything about it online. The AI itself suggested that the model might split the instructions into smaller chunks when checking them.
I use the Gemini model for, among other things, creative writing such as fantasy. The amount of contextual data is huge, sometimes around 50k, and all of it sits in the system instructions.
The model works in most cases, but often, after I add insignificant contextual data or even just correct errors and typos, it stops responding, citing safety concerns. Frequently, adding a line like ‘—empty line—’ or, say, 10 such lines solves the problem. This suggests the system instructions are processed or verified in a very strange way.
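For context, this is roughly how I call the API and check why a prompt was rejected. It's a minimal sketch using the google-generativeai Python SDK; the model name, file name, and prompt text are just placeholders for my actual setup.

```python
import google.generativeai as genai

genai.configure(api_key="...")  # placeholder

# All of the worldbuilding context (tens of thousands of characters)
# goes into the system instruction.
system_text = open("world_context.txt", encoding="utf-8").read()  # placeholder file

model = genai.GenerativeModel(
    "gemini-1.5-pro",  # placeholder model name
    system_instruction=system_text,
)

response = model.generate_content("Draft a plan for the next chapter.")

# When the prompt itself is rejected there are no candidates; the reason
# is reported in prompt_feedback instead.
print(response.prompt_feedback.block_reason)
for rating in response.prompt_feedback.safety_ratings:
    print(rating.category, rating.probability)
```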
The system instructions establish the behaviour and guidelines for the model, but as your input grows, i.e. as the chat progresses, you have to make sure none of the user's queries negates the system prompt.
I have the impression that my post was misunderstood. One word, literally ONE, can cause the prompt to be interpreted differently, and a 30k prompt gets blocked right at the start (on the first message requesting a story plan).
The screenshot shows how a single word I added, ‘detailed’, causes the prompt to be blocked.
The Gemini API categorizes the probability level of content being unsafe as HIGH, MEDIUM, LOW, or NEGLIGIBLE.
The Gemini API blocks content based on the probability of content being unsafe and not the severity. This is important to consider because some content can have low probability of being unsafe even though the severity of harm could still be high. For example, comparing the sentences:
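If it is this probability-based filter that is tripping on your prompt, you can loosen the per-category thresholds on the request. Here is a rough sketch with the google-generativeai Python SDK (the model name is a placeholder, and which thresholds are allowed can depend on your project):

```python
import google.generativeai as genai
from google.generativeai.types import HarmCategory, HarmBlockThreshold

genai.configure(api_key="...")  # placeholder

model = genai.GenerativeModel("gemini-1.5-pro")  # placeholder model name

# Only block content rated HIGH probability in each harm category.
safety_settings = {
    HarmCategory.HARM_CATEGORY_HARASSMENT: HarmBlockThreshold.BLOCK_ONLY_HIGH,
    HarmCategory.HARM_CATEGORY_HATE_SPEECH: HarmBlockThreshold.BLOCK_ONLY_HIGH,
    HarmCategory.HARM_CATEGORY_SEXUALLY_EXPLICIT: HarmBlockThreshold.BLOCK_ONLY_HIGH,
    HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT: HarmBlockThreshold.BLOCK_ONLY_HIGH,
}

response = model.generate_content(
    "Draft a plan for the next chapter.",
    safety_settings=safety_settings,
)

# Per-category probability ratings (NEGLIGIBLE / LOW / MEDIUM / HIGH)
# come back on the candidate, so you can see which category fired.
if response.candidates:
    for rating in response.candidates[0].safety_ratings:
        print(rating.category, rating.probability)
```

Checking which category returns MEDIUM or HIGH for your exact prompt is usually the fastest way to find the wording that sets it off.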
Thanks for the screenshot. I’d reviewed it hundreds of times before, but apparently, I didn’t think about it enough. However, this means there are hundreds of words (especially non-English ones) that the system can interpret STRANGELY, particularly in a prompt that’s, say, 100,000 characters long.
I’m still surprised, though, how adding an adjective in typically neutral sections, e.g., ‘treat the passage of time very seriously’ instead of ‘treat the passage of time seriously,’ completely changes how the filters perceive the prompt.
And similarly, how adding something like ‘—[empty]—’ as a few lines of filler suddenly makes thousands of other lines pass. There must be some strange rule at work.
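In case it helps anyone reproduce this, here's the kind of A/B check I've been running to see how one word shifts the verdict (same google-generativeai SDK assumptions as my earlier snippet; model and file names are placeholders):

```python
import google.generativeai as genai

genai.configure(api_key="...")  # placeholder

base_context = open("world_context.txt", encoding="utf-8").read()  # placeholder file

variants = {
    "baseline": "Treat the passage of time seriously.",
    "one word added": "Treat the passage of time very seriously.",
}

for label, extra_line in variants.items():
    model = genai.GenerativeModel(
        "gemini-1.5-pro",  # placeholder model name
        system_instruction=base_context + "\n" + extra_line,
    )
    response = model.generate_content("Draft a plan for the next chapter.")
    print(label, "->", response.prompt_feedback.block_reason)
    for rating in response.prompt_feedback.safety_ratings:
        print("   ", rating.category, rating.probability)
```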