## Summary
We're building a multi-document research assistant using the Gemini API with File Search for RAG, and we've discovered undocumented retrieval limits that significantly impact citation verifiability.
## Observed Behavior

| Aspect | Observed |
| --- | --- |
| Chunks retrieved per query | ~5 (consistent across tests) |
| Unique documents retrieved | 2-3 max, even when more are relevant |
| metadata_filter effect | Constrains candidate pool, doesn’t increase retrieval count |
| grounding_supports | Only indexes 1-2 chunks, not all retrieved |
## Test Case
Setup: 15+ papers uploaded to File Search corpus
Query: “How are the works of Simon, Mintzberg, Lipshitz, and Basadur related?”
Expected: Chunks from all 4 authors (all papers exist in corpus)
Actual:
- File Search retrieved 2 documents (Simon, Basadur)
- Gemini cited all 4 authors in response
- 2 citations had grounding data, 2 came from model training knowledge
With metadata_filter selecting all 4 papers:
- Still retrieved only 2 documents
- Different papers selected, but count unchanged
## Questions
- Is there an API parameter to increase the number of chunks retrieved per query?
- Is the ~5 chunk / 2-3 document limit documented somewhere I missed?
- Is there a way to disable training data fallback so responses only cite from File Search results?
- Does grounding_supports intentionally index fewer chunks than grounding_chunks returns?
## Why This Matters
Our product promise is verifiable citations — users hover over [1], [2] markers to see the source passage. When Gemini cites papers from training data instead of File Search, we can’t provide that verification. Currently achieving ~50% citation verifiability due to these limits.
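For reference, the ~50% figure comes from a simple check like the following. It's our own convention, not anything from the API: we extract the `[N]` markers with a regex and compare them against the set of marker numbers that have grounding data behind them.

```python
import re

def citation_verifiability(response_text, grounded_markers):
    """Fraction of [N] citation markers whose N is backed by grounding data."""
    markers = {int(m) for m in re.findall(r"\[(\d+)\]", response_text)}
    if not markers:
        return 1.0  # nothing to verify
    return len(markers & grounded_markers) / len(markers)

# Mirrors the failure mode above: four markers, only [1] and [2] grounded.
text = "Simon [1] and Mintzberg [2] relate to Lipshitz [3] and Basadur [4]."
print(citation_verifiability(text, {1, 2}))  # 0.5
```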
## Documentation Reviewed
Neither documents retrieval limits nor grounding_supports behavior.
Happy to provide logs or a minimal reproduction if helpful.
Hi @Roy_Naquin, welcome to the community!
Thank you so much for flagging the issue.
To analyze the issue further, could you please provide logs and minimal reproduction steps?
Thanks again!
Hi @Srikanta_K_N,
Thank you for the quick response! Here are the reproduction steps and logs.
## Minimal Reproduction Steps

- Create a File Search store and upload 10+ PDF documents (academic papers work well)
- Send a query that requires information from multiple documents:
  “How are the works of Simon, Mintzberg, Lipshitz, and Basadur related? Pick one paper from each, provide an overview, and discuss the connections.”
- Inspect grounding_metadata in the response
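For the last step, this is a sketch of the helper we run over grounding_metadata to produce the counts below. The attribute names (grounding_chunks[].retrieved_context.title, grounding_supports[].grounding_chunk_indices) reflect what we observe on the Python SDK response objects; adjust if your SDK shapes them differently. The mock stands in for a real response here:

```python
from types import SimpleNamespace

def summarize_grounding(metadata):
    """Tally chunk count, unique source documents, and which chunk indices
    the grounding_supports entries actually reference."""
    titles = [c.retrieved_context.title for c in metadata.grounding_chunks]
    supported = sorted({i for s in metadata.grounding_supports
                        for i in s.grounding_chunk_indices})
    return {"chunks": len(titles),
            "unique_documents": len(set(titles)),
            "supported_indices": supported}

# Mock mirroring our Test Case 1 logs (titles shortened): 5 chunks from
# 3 documents, but supports referencing only chunks 0 and 1.
chunk = lambda title: SimpleNamespace(retrieved_context=SimpleNamespace(title=title))
meta = SimpleNamespace(
    grounding_chunks=[chunk("Behavioral Model"), chunk("Unstructured"),
                      chunk("Rationality"), chunk("Behavioral Model"),
                      chunk("Rationality")],
    grounding_supports=[SimpleNamespace(grounding_chunk_indices=[0]),
                        SimpleNamespace(grounding_chunk_indices=[1])],
)
print(summarize_grounding(meta))
# {'chunks': 5, 'unique_documents': 3, 'supported_indices': [0, 1]}
```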
## Observed Behavior (Consistent Across 3 Test Cases)

| Metric | Value | Pattern |
| --- | --- | --- |
| grounding_chunks returned | 5 | Always 5 |
| Unique documents in chunks | 3 | Always ~3 |
| Unique indices in grounding_supports | 2 | Always [0, 1] only |
| [N] markers in response text | 4-7 | Variable |
## Logs

Test Case 1 — Multi-author query:

```
FILE SEARCH DEBUG: Retrieved 5 chunks
FILE SEARCH DEBUG: From 3 unique documents:
  - Rationality as Process and as Product of Thought
  - A Behavioral Model of Rational Choice
  - The Structure of "Unstructured" Decision Processes

grounding_chunks count: 5
  Chunk 0: title='A Behavioral Model of Rational Choice'
  Chunk 1: title='The Structure of "Unstructured" Decision Processes'
  Chunk 2: title='Rationality as Process and as Product of Thought'
  Chunk 3: title='A Behavioral Model of Rational Choice'
  Chunk 4: title='Rationality as Process and as Product of Thought'

grounding_supports count: 2
  Support 0: chunk_indices=[0]
  Support 1: chunk_indices=[1]
```
The response contained 4 citation markers [1], [2], [3], [4], but only [1] and [2] map to grounding_chunks. The model cited the Lipshitz and Basadur papers (which exist in our store), but File Search didn’t retrieve them.
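This is the check that flagged [3] and [4] as ungrounded. It's a sketch that assumes the model numbers its markers in support order ([1] maps to support 0, and so on); that mapping is our own heuristic, not documented behavior, so treat the function as illustrative:

```python
import re

def ungrounded_markers(response_text, num_supports):
    """Return [N] marker numbers with no matching grounding_supports entry,
    assuming marker N corresponds to support N-1 (our heuristic)."""
    markers = {int(m) for m in re.findall(r"\[(\d+)\]", response_text)}
    return sorted(m for m in markers if m > num_supports)

text = "Simon [1] ... Mintzberg [2] ... Lipshitz [3] ... Basadur [4]"
print(ungrounded_markers(text, 2))  # [3, 4]
```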
## Specific Questions

- Is there an API parameter to increase chunk retrieval count? We consistently get 5 chunks from ~3 documents regardless of query complexity or store size.
- Is the ~5 chunk / ~3 document limit documented? We couldn’t find this in the File Search documentation.
- Why does grounding_supports only reference chunks [0, 1]? Even when 3 unique documents are retrieved (chunks 0, 1, 2), grounding_supports.grounding_chunk_indices never includes chunk 2.
- Is there a way to disable training data fallback? When File Search doesn’t retrieve enough documents, the model cites papers from training data using [N] markers, but these aren’t in grounding_chunks.
## Environment
Happy to provide additional logs or a minimal code sample if helpful.