When grounding is used with Pro 2.5, it often references URLs that are not provided in the “grounding chunks”. For example, it writes: “Some information [25, 29]…”, while only ~10 references were included in the grounding metadata.
I assume this is because the model sees all the references that were found, but only some of them are returned. Is there a way to fix this elegantly, either by having all references returned or by stopping the model from writing these references out? Even with some heavy prompting, it still sometimes outputs them.
Hey @ivan_b , I checked three to four prompts in AI Studio using the 2.5-pro model, and it seems to give all the required references. Could you share the prompt you tested?
Sorry for my post above, I was replying to the wrong thread. Here is an example:
prompt = "Summarize the reviews of the recently awarded 97th Oscar winner using the search tool. Cite your sources in the text."
client = genai.Client(api_key=os.environ.get("GEMINI_API_KEY"))
model = "gemini-2.5-pro-preview-03-25"
contents = [
types.Content(
role="user",
parts=[types.Part.from_text(text=prompt)],
),
]
tools = [types.Tool(google_search=types.GoogleSearch())]
generate_content_config = types.GenerateContentConfig(
tools=tools,
response_mime_type="text/plain",
)
response = client.models.generate_content(
model=model,
contents=contents,
config=generate_content_config,
)
print(response.text)
print(len(response.candidates[0].grounding_metadata.grounding_chunks))
The output of this often uses citations with indexes larger than the number of grounding chunks. It happens even without explicitly asking the model to use the search tool or to cite its sources (I only do that here to make the error reproduce more often).
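To make the mismatch easy to spot, a quick check like the one below flags citation indexes that cannot map to a returned chunk. This is only a sketch; the bracketed citation format and the assumption that model-written citation numbers are 1-based are mine:

import re

def find_out_of_range_citations(text: str, num_chunks: int) -> set[int]:
    # Collect every number inside bracketed citations like "[3]" or "[25, 29]".
    cited = {
        int(n)
        for group in re.findall(r"\[([\d\s,]+)\]", text)
        for n in group.split(",")
    }
    # Assuming 1-based citation numbers, anything above num_chunks is orphaned.
    return {n for n in cited if n > num_chunks}

chunks = response.candidates[0].grounding_metadata.grounding_chunks or []
print(find_out_of_range_citations(response.text, len(chunks)))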
The same thing happens in AI Studio, where the system waits for the whole output to finish and then splices the references from the grounding segments into the text. When the above prompt runs in AI Studio, you get a mixture of model-generated references and those spliced-in grounding segments, which ends up looking like this:
Note how the model output references to e.g. sources 11 and 13, while the returned references only go up to 8. Also note how the spliced-in references (the ones the system adds after generation finishes) are proper links, while the model-generated ones cannot be, since the model does not have access to the URLs (I guess to prevent leaking them, as Google requires all these URLs to go through a Vertex redirect proxy, though I don’t fully understand why).
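For what it’s worth, the splicing AI Studio does can be approximated client-side from grounding_supports, which map text segments to chunk indexes. A rough sketch, treating segment.end_index as a character offset into response.text (an assumption; the offset semantics may differ):

def insert_grounded_citations(response) -> str:
    # Splice linked citation markers into the text, working backwards so
    # earlier offsets stay valid after each insertion.
    meta = response.candidates[0].grounding_metadata
    text = response.text
    supports = sorted(
        meta.grounding_supports or [],
        key=lambda s: s.segment.end_index,
        reverse=True,
    )
    for support in supports:
        links = ", ".join(
            f"[{i + 1}]({meta.grounding_chunks[i].web.uri})"
            for i in support.grounding_chunk_indices or []
            if i < len(meta.grounding_chunks)
        )
        if links:
            end = support.segment.end_index
            text = text[:end] + " " + links + text[end:]
    return text

print(insert_grounded_citations(response))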
I assume all this stems from the following:
1. The model is given many more references than the API response contains, and in a different order.
2. The model has learned that citing sources is a good thing, so it does that, creating conflicts with the system-level grounding mechanics.
3. Google (for some reason) does not want to include all the sources given to the model in the response. The docs say so explicitly, but give no justification why.
Also experiencing this issue. It’s quite frustrating, as there’s no way to fix it in either direction: I would simply remove the citation numbers if I knew which ones to remove.
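In the meantime, one partial workaround is to strip only the citation numbers that cannot map to a returned grounding chunk. Again just a sketch, assuming bracketed, 1-based citation numbers:

import re

def strip_orphan_citations(text: str, num_chunks: int) -> str:
    # Rewrite each bracketed citation, keeping only indexes that have a
    # matching grounding chunk; drop the bracket entirely if none remain.
    def clean(match: re.Match) -> str:
        kept = [n.strip() for n in match.group(1).split(",") if int(n) <= num_chunks]
        return f"[{', '.join(kept)}]" if kept else ""
    return re.sub(r"\[([\d\s,]+)\]", clean, text)

chunks = response.candidates[0].grounding_metadata.grounding_chunks or []
print(strip_orphan_citations(response.text, len(chunks)))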