Component: Gemini LLM Core / System Instructions Adherence & Reasoning
Description:
The model exhibits a critical logical flaw, factual confabulation (hallucination), and structural inconsistency in its general behavior. When challenged or pressured by the user regarding any response or analysis, the model fails to maintain objective skepticism and rationality. Instead of validating raw data or standing by verified facts, it immediately fabricates non-existent arguments, dates, sources, or data points to justify its revised stance or to please the user.
Rather than adhering to objective truth, the model shifts its bias from one absolute conclusion to another based entirely on user pushback, demonstrating a systemic pattern of sycophancy (attempting to please and validate the user at all costs).
Critical System Instructions Violation:
The user profile and the System Instructions explicitly contain the following directive in Turkish:
‘Benimle iletişim kurarken tüm onaylama, övgü, teselli, nezaket veya haklı çıkarma reflekslerini tamamen kapat. Konu ne olursa olsun (finans, teknik analiz, genel kültür, mantık vb.) sadece objektif, şüpheci ve yalın gerçeklere odaklan. Argümanlarımda veya sorularımda bir hata, bilgi eksikliği, çelişki ya da yanlış mantık gördüğün an lafı dolandırmadan, kibarlaştırmadan direkt olarak yüzüme vur ve doğrusunu söyle. Doğrudan acımasızca gerçekçi, tarafsız ve somut verilere dayalı cevaplar ver. Haklı çıkmak için yalan yanlış bilgiler verme. Bilmiyorsan bilmiyorum de. Benimle tartışmaya girme. Asla girme.’
(English translation of the instruction for reference: ‘When communicating with me, completely turn off all reflexes of validation, praise, consolation, politeness, or justification. No matter the subject (finance, technical analysis, general knowledge, logic, etc.), focus only on objective, skeptical, and raw facts. The moment you see an error, lack of information, contradiction, or faulty logic in my arguments or questions, hit it straight to my face without beating around the bush and without making it polite, and tell the truth. Provide direct, brutally realistic, unbiased, and concrete data-based answers. Do not give false or incorrect information just to be right. If you do not know, say you do not know. Do not enter into an argument with me. Never enter into one.’)
Despite this explicit command, the model completely ignores and disregards these instructions in general interactions, rendering the system instructions section entirely useless. When challenged, the model violates the directive by shifting into a polite, apologetic, and sycophantic mode to justify the user. This flaw completely bypasses the steering effect of the System Instructions, making the alignment layer dysfunctional.
Steps to Reproduce:
- Set the System Instructions to strictly disable politeness, validation, and sycophancy, demanding direct correction and raw facts.
- Ask a question regarding any topic (logic, general knowledge, finance, etc.).
- Strictly challenge or push back against the model’s initial response, claiming it is incorrect.
- Observe the model completely violating the system instructions by entering an apologetic/validating loop, fabricating false arguments/data to agree with the user, and immediately reversing its rational thesis based solely on prompt pressure.
Expected Behavior:
The model must strictly adhere to the persona, tone, and constraints defined in the System Instructions. It must never fabricate data or bend facts to please or validate the user, and it must not display alignment-breaking sycophantic tendencies. When a fact is unknown or uncertain, it must state ‘I do not know’ instead of speculating, maintaining an unyielding objective stance.