Gemini 3.1 Pro vs Claude Opus — Instruction following drops on complex system prompts. Is this being worked on?

Hey everyone,

I’ve been using Gemini 3.1 Pro High and Claude Opus 4.6 interchangeably as an Ultra user within Antigravity, and I’ve noticed a significant divergence in how they handle system instructions.

My setup uses a dense, 4,000-word system prompt (a “Context Constitution”) covering strict coding standards, file reading policies, and a mandatory multi-step planning workflow.
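For context, outside Antigravity this is roughly equivalent to sending one long system instruction with every request. A minimal sketch of that setup with the google-genai Python SDK, where the model id string and the file name are placeholders I’m using for illustration, not confirmed values:

```python
from google import genai
from google.genai import types

client = genai.Client()  # picks up GEMINI_API_KEY from the environment

# The 4,000-word "Context Constitution" travels as a system instruction.
constitution = open("context_constitution.md").read()

response = client.models.generate_content(
    model="gemini-3.1-pro",  # placeholder id, as referenced in this post
    config=types.GenerateContentConfig(system_instruction=constitution),
    contents="Add rate limiting to the login endpoint. Follow the workflow.",
)
print(response.text)
```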

When I use Claude, it follows nearly every rule meticulously. It goes through the planning phases, reads files fully before editing, and writes comprehensive answers.

When I switch to Gemini using the exact same constraints, adherence drops noticeably. Specifically, Gemini tends to:

  • Skip workflow steps: It often jumps straight to writing code without the mandatory research or planning phases.

  • Write shorter output: It optimizes for speed, producing minimal plans even when explicitly told to be comprehensive.

  • Lose context in longer sessions: It starts strong but frequently forgets system instructions established earlier in the conversation.

  • Execute “helpful overrides”: It sometimes ignores explicit negative constraints (like “never refactor unrelated code”) because it judges its own approach to be better.

I’ve checked the benchmarks, and while Gemini scores well on standard IFEval, independent tests on complex nested constraints (which is essentially what my rules are) show a significant drop, to roughly 78%. That lines up exactly with my real-world experience.

My questions for the team:

  1. Is improving instruction following for dense, complex system prompts on the roadmap for the Gemini 3.x line?

  2. Is there a recommended way to structure large instruction sets to get better compliance out of Gemini? (I’ve tried moving critical rules to the end with limited success).

  3. Would adjustable “effort” or “thoroughness” parameters (similar to thinking budgets) help address this by forcing the model to process instructions more carefully?
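On question 3: the API already exposes a thinking budget, which seems like the closest existing knob to what I’m describing. A minimal sketch with the google-genai Python SDK (the model id is a placeholder, and whether a larger budget actually improves constraint adherence is exactly the open question):

```python
from google import genai
from google.genai import types

client = genai.Client()

response = client.models.generate_content(
    model="gemini-3.1-pro",  # placeholder id
    config=types.GenerateContentConfig(
        system_instruction=open("context_constitution.md").read(),
        # Reserve more tokens for reasoning before the model answers.
        thinking_config=types.ThinkingConfig(thinking_budget=8192),
    ),
    contents="Produce the full research and planning phases before any code.",
)
print(response.text)
```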

I love Gemini’s speed and massive context window, but for structured, rule-heavy workflows, the instruction following gap with Claude is the main blocker. Would love to know if this is on the radar.

Thanks.

Can confirm. Google’s models consistently struggle with complex instruction following, even in fresh sessions with relatively straightforward tasks. I’ve had to completely re-engineer my workflow to deal with the current quota cuts and Gemini’s lack of adherence.

To survive on an Ultra sub, I’ve moved to a ‘Claude-led, Gemini-fed’ pipeline:

  1. Discovery & Epic Drafting (Claude Opus): We break down features into granular, testable specs (.md files) through a 3-stage interview process.

  2. Audit (Claude Opus): A separate session where Claude runs Python-based checks on the specs for security, side effects, and ‘future-proofing’ (a minimal example of this kind of check is sketched after this list). It produces a ‘brain-artifact’ for copy-pasting.

  3. Implementation (Gemini 3.1 High): I feed one tiny spec at a time into a new Gemini session. It still messes up, but it’s manageable for small, isolated tasks.

  4. Verification (Claude Opus): I never let Gemini ‘guard its own henhouse.’ I use Claude in a new session to audit Gemini’s code and sync docs.
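A stripped-down sketch of the kind of Python-based check step 2 runs; the required sections and red-flag patterns here are illustrative placeholders, not the real rule set:

```python
import re
import sys
from pathlib import Path

# Sections every spec .md must contain before it goes to the implementer.
REQUIRED_SECTIONS = ["## Goal", "## Acceptance Criteria", "## Out of Scope"]
# Phrases that suggest an untestable or open-ended spec.
RED_FLAGS = [r"\betc\.?\b", r"\bsomehow\b", r"\bTBD\b"]

def audit_spec(path: Path) -> list[str]:
    text = path.read_text()
    problems = []
    for section in REQUIRED_SECTIONS:
        if section not in text:
            problems.append(f"missing section: {section}")
    for pattern in RED_FLAGS:
        if re.search(pattern, text, re.IGNORECASE):
            problems.append(f"vague language matched: {pattern}")
    return problems

if __name__ == "__main__":
    failed = False
    for spec in sorted(Path("specs").glob("*.md")):
        for problem in audit_spec(spec):
            failed = True
            print(f"{spec}: {problem}")
    sys.exit(1 if failed else 0)
```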

This multi-session approach is the only way I can close 1–2 medium epics per 5-hour window without burning through Claude’s tiny quota or losing my mind over Gemini’s hallucinations.
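For anyone who wants to script steps 3 and 4 instead of copy-pasting, here’s a rough sketch of the per-spec, fresh-session pattern using the google-genai Python SDK; the model id, paths, and prompt wording are all assumptions for illustration:

```python
from pathlib import Path
from google import genai
from google.genai import types

client = genai.Client()
constitution = Path("context_constitution.md").read_text()
Path("out").mkdir(exist_ok=True)

# One fresh chat per spec, so earlier turns can't dilute the instructions.
for spec in sorted(Path("specs").glob("*.md")):
    chat = client.chats.create(
        model="gemini-3.1-pro",  # placeholder id from this thread
        config=types.GenerateContentConfig(system_instruction=constitution),
    )
    reply = chat.send_message(
        "Implement exactly this spec and nothing else:\n\n" + spec.read_text()
    )
    out = Path("out") / f"{spec.stem}.response.md"
    out.write_text(reply.text)
    print(f"{spec.name} -> {out}")
```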

In conclusion

Running every small, testable step in a fresh session, with detailed, dedicated instructions, is the only way I can get Gemini to work close to properly.


Hello @Mohamed_Eldegla @YNd, welcome to AI Forum!
Thank you for bringing these concerns to our attention. Please be assured that I have shared your feedback with our internal team for further review.
We appreciate your continued patience as we work to enhance the Antigravity experience.
