The System Instructions Fiasco with Gemini vs competitors

MikeAG · December 20, 2024, 5:19am

I discovered something interesting about Gemini’s API, especially when compared to OpenAI’s API. OpenAI’s approach is very straightforward. You can include user messages, assistant messages, or system messages in your message array, and the API knows how to handle each one. System messages, for instance, act as a knowledge base for the AI, while user messages are inputs from the user, and assistant messages serve as hints or prompts for the AI to respond. It’s simple and consistent.

However, with Google Gemini, the system messages don’t quite work the same way. In Gemini’s API, you can include system instructions, but only once at the very beginning of the conversation when you’re queuing up the model. The problem is that these system instructions can be easily overridden as the conversation progresses. For example, you could set a system instruction specifying the exact time of day, but if the user later says, “No, it’s 3:00 p.m.,” the assistant will adopt the user’s input and continue the conversation as if it were 3:00 p.m. This behavior makes the system instructions less reliable.

What’s even more concerning is how Gemini handles logic within a conversation. Suppose the system message specifies that the time is 1:00 p.m. If you ask the assistant what time it is, it will respond with “1:00 p.m.” as instructed. But if you check again later when it’s actually 2:00 p.m., the assistant might respond with “1:05 p.m.” It seems to ignore its system instructions and instead tries to infer time progression based on its previous responses, which can lead to inaccuracies. This is problematic because the assistant isn’t referencing the original system instruction; it’s essentially making things up based on its last interaction.

This behavior creates additional challenges. To use Gemini effectively, you have to write your own syntax to explicitly instruct the model on how to prioritize messages. For example, you might need to include instructions like, “Prioritize this message and everything above it; do not use anything after this point to build your knowledge base.” Adding such explicit instructions takes up tokens, which can quickly add up. While you might save on tokens in some areas, you’re likely to spend more just explaining how the AI should handle its own system messages.

TheLongSentance · January 3, 2025, 6:25pm

Mike, thanks for this and I agree that the Gemini api is lacking in this respect when compared to OpenAI and other model providers. However, couldn’t you just call gemini_api = google.generativeai.GenerativeModel( model_name=model_version, system_instruction=system_prompt)) every single time you want to update the system prompt before generating a new gemini response with message = gemini_api.generate_content(formatted_conversation)? Certainly a simple solution, but maybe one that is slow, expensive or doesn’t suit your needs? Anyway, just an idea and one that you may have already tried. Best wishes!

Topic		Replies	Views
Making the Switch From OpenAI.....i miss system prompt Gemini API open-models , prompt	2	214	January 9, 2025
Expert opinion on System Instruction Gemini API	2	282	May 22, 2024
Feedback: system instructions Google AI Studio api	7	359	September 25, 2024
Is there a way to tell the API to prioritize answers based on the context/system_instruction over chat history? Gemini API gemini-15	4	146	May 1, 2024
Multiple system prompt messages not supported by Gemini OpenAI compatibility Gemini API api , openai_compatibility	1	78	May 30, 2025

The System Instructions Fiasco with Gemini vs competitors

Related topics