Hi everyone,
I’ve been building a RAG application using the Gemini API’s file_search tool and ran into a frustrating, undocumented edge case.
When using a long, production-ready system prompt, the API successfully uses the file search content to generate the answer, but silently drops all grounding_metadata (chunks and supports). This completely breaks inline citations and source attribution.
Environment Details
- Model: gemini-3-flash-preview
- SDK: google-genai 1.69.0
The Issue
My working hypothesis is attention dilution. When the system_instruction contains a massive, highly specific set of formatting rules, those explicit instructions seem to outweigh the model’s internal instructions to attach grounding metadata.
Systematic Testing & Root Cause
I ran a series of systematic tests adjusting the system prompt length and placement. Here are the results:
| Test | System prompt | Grounding Metadata Returned? |
|---|---|---|
| User prompt only + hint | None (0 chars) | YES (5 chunks) |
| User + short system (961 chars) | Short | YES |
| User + real production system (6,906 chars) | Long | NO |
| Concatenated with full system | Full | NO |
| system_instruction with full system | Full | NO |
Conclusion: With the 6,906-character production system prompt, Gemini returns no grounding metadata at all. This happens regardless of AFC (Automatic Function Calling) settings, and regardless of whether the prompt is placed in system_instruction or concatenated directly into the user prompt.
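For anyone who wants to reproduce the matrix, here is a minimal sketch of the kind of harness I mean. It is not tied to the SDK: `ask` is a hypothetical callable you would write yourself to wrap the real file_search request and return the number of grounding chunks in the response, and `run_prompt_matrix` and the variant labels are illustrative names, not part of google-genai.

```python
from typing import Callable, Dict, Optional


def run_prompt_matrix(
    ask: Callable[[Optional[str], str], int],
    question: str,
    system_prompts: Dict[str, Optional[str]],
) -> Dict[str, dict]:
    """Send the same question once per system prompt variant.

    `ask(system_prompt, question)` is an assumed wrapper around the real
    API call; it must return how many grounding chunks came back.
    """
    results = {}
    for label, system_prompt in system_prompts.items():
        chunk_count = ask(system_prompt, question)
        results[label] = {
            # Prompt length in characters; None counts as 0.
            "system_chars": len(system_prompt or ""),
            "chunks": chunk_count,
            "grounded": chunk_count > 0,
        }
    return results
```

Keeping the question fixed and varying only the system prompt makes it easier to tell whether prompt length, rather than the query itself, is what flips grounding off.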
Expected vs. Actual Behavior
- Expected: The API should return grounding_metadata.grounding_chunks and grounding_supports so developers can build inline citations, regardless of the system prompt length.
- Actual: The model generates the correct answer utilizing the retrieved file search content, but the grounding_metadata object is empty or missing.
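Because the failure is silent, it helps to check for it explicitly before rendering citations. Below is a rough sketch of such a guard. It assumes the response exposes `candidates[0].grounding_metadata` with `grounding_chunks` and `grounding_supports` attributes, as described above; `grounding_counts` is a hypothetical helper name, and the `SimpleNamespace` objects are stand-ins for real responses, not SDK types.

```python
from types import SimpleNamespace


def grounding_counts(response):
    """Return (num_chunks, num_supports) for the first candidate.

    Missing or empty metadata yields (0, 0) instead of raising, so the
    caller can treat "answered but not grounded" as its own case.
    """
    candidates = getattr(response, "candidates", None) or []
    if not candidates:
        return (0, 0)
    metadata = getattr(candidates[0], "grounding_metadata", None)
    chunks = getattr(metadata, "grounding_chunks", None) or []
    supports = getattr(metadata, "grounding_supports", None) or []
    return (len(chunks), len(supports))


# Stand-in objects shaped like the two cases described above.
grounded = SimpleNamespace(candidates=[SimpleNamespace(
    grounding_metadata=SimpleNamespace(
        grounding_chunks=["c1", "c2"], grounding_supports=["s1"]))])
ungrounded = SimpleNamespace(candidates=[SimpleNamespace(grounding_metadata=None)])

print(grounding_counts(grounded))    # (2, 1)
print(grounding_counts(ungrounded))  # (0, 0)
```

A (0, 0) result on a response that clearly used the file content is the symptom in this post, so logging it at least makes the dropped citations visible.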
Documentation Check
I reviewed both the File Search Cookbook and the Google Search Grounding documentation, and there is no mention of a character limit for system prompts before grounding attribution fails.
Has anyone else encountered this instruction overshadowing issue with the file_search or Google Search tools? Are there any known workarounds other than severely truncating the system instructions?
Thanks in advance for any insights!