I discovered something interesting about Gemini’s API, especially when compared to OpenAI’s API. OpenAI’s approach is very straightforward. You can include user messages, assistant messages, or system messages in your message array, and the API knows how to handle each one. System messages, for instance, act as a knowledge base for the AI, while user messages are inputs from the user, and assistant messages serve as hints or prompts for the AI to respond. It’s simple and consistent.
However, with Google Gemini, the system messages don’t quite work the same way. In Gemini’s API, you can include system instructions, but only once at the very beginning of the conversation when you’re queuing up the model. The problem is that these system instructions can be easily overridden as the conversation progresses. For example, you could set a system instruction specifying the exact time of day, but if the user later says, “No, it’s 3:00 p.m.,” the assistant will adopt the user’s input and continue the conversation as if it were 3:00 p.m. This behavior makes the system instructions less reliable.
What’s even more concerning is how Gemini handles logic within a conversation. Suppose the system message specifies that the time is 1:00 p.m. If you ask the assistant what time it is, it will respond with “1:00 p.m.” as instructed. But if you check again later when it’s actually 2:00 p.m., the assistant might respond with “1:05 p.m.” It seems to ignore its system instructions and instead tries to infer time progression based on its previous responses, which can lead to inaccuracies. This is problematic because the assistant isn’t referencing the original system instruction; it’s essentially making things up based on its last interaction.
This behavior creates additional challenges. To use Gemini effectively, you have to write your own syntax to explicitly instruct the model on how to prioritize messages. For example, you might need to include instructions like, “Prioritize this message and everything above it; do not use anything after this point to build your knowledge base.” Adding such explicit instructions takes up tokens, which can quickly add up. While you might save on tokens in some areas, you’re likely to spend more just explaining how the AI should handle its own system messages.