Massive miss in calculating Social Security benefits--WEP/GPO overlooked vs Claude, ChatGPT and Perplexity

Karma203 · March 29, 2026, 12:51pm

MODEL PERFORMANCE DEFICIENCY REPORT
Topic: Social Security / International Totalization / Legislative Update
Case ID: WEP/GPO Repeal (Social Security Fairness Act of 2023, signed into law January 2025)

ISSUE:
The model failed to prioritize a major federal legislative change (the repeal of WEP/GPO, signed into law in January 2025) over its legacy training data on the Windfall Elimination Provision. Even when given a “2026” system date and a “Quality Assurance” prompt designed to catch errors, the model hallucinated a benefit reduction (WEP) that no longer exists under current law.

CRITICAL WEAKNESSES:

Temporal Logic: Failure to reconcile the “current date” (2026) with the status of active legislation.
Cross-Domain Verification: Failure to verify “standard” financial rules against recent legal overrides.
Instruction Following: The model ignored the “QA prompt” discipline, which should have triggered a search for conflicting facts (like the Social Security Fairness Act). It also did not follow the standard Quality Assurance prompts the user had asked Gemini to follow to prevent hallucinations.

REQUESTED FIX:
Improve the model’s ‘recency weight’ for federal law and financial regulations. Ensure that financial/legal “standard practices” are cross-referenced against recent legislative changes during the reasoning phase. Also ensure that in these more nuanced analyses the model does not hallucinate and follows the Quality Assurance prompts that a more sophisticated user has requested as baseline performance.
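
To make the requested cross-reference concrete, here is a minimal sketch of the kind of date-aware check I have in mind. It is an illustration only, not how any model works internally. The function names and the flag message are hypothetical, and the cutoff is my assumption based on the repeal applying to benefits payable for months after December 2023; please verify against current SSA guidance.

```python
from datetime import date

# Assumption: WEP and GPO no longer apply to benefits payable for months
# after December 2023 (per the Social Security Fairness Act). Verify with SSA.
WEP_GPO_LAST_MONTH = date(2023, 12, 1)


def wep_gpo_applies(benefit_month: date) -> bool:
    """Return True if a WEP/GPO reduction could still apply to this benefit month."""
    first_of_month = benefit_month.replace(day=1)
    return first_of_month <= WEP_GPO_LAST_MONTH


def check_answer(mentions_wep_reduction: bool, benefit_month: date) -> str:
    """Hypothetical QA step: flag an answer that applies WEP after the repeal."""
    if mentions_wep_reduction and not wep_gpo_applies(benefit_month):
        return "FLAG: answer applies a WEP/GPO reduction to a post-repeal benefit month"
    return "OK"


if __name__ == "__main__":
    # A benefit estimate for March 2026 that still subtracts WEP gets flagged.
    print(check_answer(True, date(2026, 3, 1)))
    # The same reduction for a 2023 benefit month passes the check.
    print(check_answer(True, date(2023, 6, 1)))
```

The point is that the rule's effective date is compared against the benefit month being calculated before any "standard" reduction is applied, so a 2026 calculation cannot silently inherit a pre-2024 rule.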
