[Question] Gemini File Search tool_use_prompt_token_count is unexpectedly high — what is actually happening internally?

Environment

  • Model: gemini-3-flash-preview
  • Chunk size: 512, overlap: 50, top_k: 3

Observed behavior

I ran three tests and tracked tool_use_prompt_token_count:

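| Case | Files in store | Query                   | k | Tool tokens |
|------|----------------|-------------------------|---|-------------|
| 1    | hwpx only      | Related to hwpx         | 5 | 3,250       |
| 2    | xlsx only      | Unrelated to xlsx       | 3 | 355,631     |
| 3    | hwpx + xlsx    | Same query as Case 1    | 3 | 19,960      |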

Two things stand out:

Case 1 vs Case 3: Same document, same query. The only change is that an unrelated Excel file was added to the store. Tool tokens increased ~6x (3,250 → 19,960).

Case 2: When the query is completely unrelated to the stored document, tool tokens jumped to 355,631.
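For anyone who wants to reproduce the measurement, here is a minimal sketch of how the counters can be pulled out of a response's usage metadata. It assumes the response exposes a `usage_metadata` object with attributes such as `prompt_token_count`, `tool_use_prompt_token_count`, `candidates_token_count`, and `total_token_count`. The helper name `token_breakdown` and the `fake_usage` stand-in are hypothetical and only for illustration; no real API call is made here.

```python
from types import SimpleNamespace

# Counter names assumed to be present on the usage metadata object.
FIELDS = (
    "prompt_token_count",
    "tool_use_prompt_token_count",
    "candidates_token_count",
    "total_token_count",
)


def token_breakdown(usage_metadata):
    """Return the token counters as a dict, treating missing or None values as 0."""
    return {name: getattr(usage_metadata, name, None) or 0 for name in FIELDS}


# Hypothetical stand-in for response.usage_metadata; only the tool count
# is taken from Case 1, the other counters are deliberately left unset.
fake_usage = SimpleNamespace(tool_use_prompt_token_count=3250)
print(token_breakdown(fake_usage))
```

Logging the full breakdown per request, rather than only the total, makes it easier to see whether a cost spike comes from the tool step or from the prompt itself.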


Why this is confusing

File Search should work via vector similarity search:

  1. Embed the query

  2. Retrieve the top-k chunks: 512 tokens per chunk × k = 3 gives at most ~1,536 tokens

  3. Pass those chunks as context

Under this model, adding an unrelated file to the store should have zero effect on tool token count — vector similarity search doesn’t require reading all documents. Yet Case 1 vs Case 3 clearly shows otherwise.
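To make the gap concrete, here is a rough sketch that compares the observed tool tokens with the upper bound this mental model predicts. It assumes the bound is simply k × chunk size, ignores the 50-token overlap and any per-chunk metadata, and uses the k values listed in the table. The function names are my own and purely illustrative.

```python
CHUNK_SIZE = 512  # tokens per chunk, from the Environment section

# (case, k, observed tool tokens), copied from the table above
CASES = [(1, 5, 3_250), (2, 3, 355_631), (3, 3, 19_960)]


def expected_max_tokens(k, chunk_size=CHUNK_SIZE):
    """Upper bound on tool tokens if only the top-k chunks were passed as context."""
    return k * chunk_size


def overshoot(observed, k, chunk_size=CHUNK_SIZE):
    """How many times larger the observed count is than the expected bound."""
    return observed / expected_max_tokens(k, chunk_size)


for case, k, observed in CASES:
    bound = expected_max_tokens(k)
    print(f"Case {case}: expected <= {bound:,}, observed {observed:,}, "
          f"{overshoot(observed, k):.1f}x the bound")
```

Under these assumptions, Case 1 lands slightly above the bound, Case 3 at roughly 13 times the bound, and Case 2 at more than 200 times. Even with generous allowances for overlap and metadata, Cases 2 and 3 do not fit a plain top-k retrieval.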


What I can’t figure out

The internal retrieval process is completely opaque. Is there re-ranking happening? Full document scanning? Something else? Without understanding what’s driving these token counts, it’s impossible to predict costs or confidently adopt File Search in production.

Has anyone managed to get clarity on this? Any insight would be appreciated.
