[Major Bug] Image Generation Prompt Mismatch in Conversation History

Severity: P1 - High (Causes systematic hallucinations)

Product: Gemini 3 (Free Tier)

Summary:
When Gemini generates an image from an open-ended prompt, it appears to create multiple “draft” prompts internally. The prompt that gets stored in conversation history often differs from the prompt actually used to generate the image. This causes Gemini to hallucinate incorrect descriptions when asked about the image it generated.

Reproduction Steps:

  1. Start a new conversation with Gemini
  2. Ask Gemini to “generate an image on any topic of your choosing” (open-ended prompt)
  3. Gemini generates an image (e.g., a goose)
  4. Ask Gemini to describe the image it just generated, based on the prompt it sent to the image generation tool
  5. Observe that Gemini describes something completely different (e.g., bioluminescent mushrooms)

Expected Behavior:
The image generation prompt stored in conversation history should match the prompt actually sent to the image generator. Gemini should be able to accurately describe the image by referencing the correct prompt.

Actual Behavior:

  • Gemini appears to generate multiple candidate prompts internally
  • Prompt A (e.g., “bioluminescent mushrooms”) gets stored in conversation history
  • Prompt B (e.g., “upland goose”) is sent to the image generator
  • User sees image B (goose)
  • Gemini reads back Prompt A from history and describes mushrooms
  • Result: Gemini’s description does not match the image the user actually sees

Impact:

  • Gemini cannot reliably describe images it generates from open-ended prompts
  • High rate of hallucinated image descriptions
  • Users cannot trust Gemini’s analysis of its own generated content
  • Particularly problematic for creative workflows where users give Gemini artistic freedom

Frequency:
Occurs with very high probability when the prompt is open-ended or gives Gemini creative choice. Occurs less often with highly specific prompts.

Technical Analysis:
Appears to be a race condition or state synchronization issue where:

  1. Multiple draft prompts are generated
  2. One draft is selected and sent to the image generator
  3. A different draft gets persisted to conversation history
  4. Gemini reads back the wrong draft when attempting to describe the image
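
The four steps above can be modelled with a small sketch. This is purely illustrative and makes no claim about Gemini's actual internals: the names Conversation, generate_buggy, generate_fixed, and mismatches are hypothetical, and the assumption that history is written from the first draft while the generator receives a later one is just one way the suspected desynchronization could arise.

```python
from dataclasses import dataclass, field


@dataclass
class Conversation:
    """Hypothetical conversation state (not Gemini's real data model)."""
    history: list = field(default_factory=list)  # prompts the model can read back later
    sent: list = field(default_factory=list)     # prompts actually sent to the image generator


def generate_buggy(conv, drafts, pick):
    """Assumed failure mode: history is persisted before selection completes."""
    conv.history.append(drafts[0])   # step 3: an early draft is persisted
    conv.sent.append(drafts[pick])   # step 2: a different draft is sent


def generate_fixed(conv, drafts, pick):
    """Select once, then use the same value for both the generator and history."""
    chosen = drafts[pick]
    conv.sent.append(chosen)
    conv.history.append(chosen)


def mismatches(conv):
    """Return (sent, stored) pairs where history disagrees with what was sent."""
    return [(s, h) for s, h in zip(conv.sent, conv.history) if s != h]


drafts = ["bioluminescent mushrooms", "upland goose"]

buggy = Conversation()
generate_buggy(buggy, drafts, pick=1)

fixed = Conversation()
generate_fixed(fixed, drafts, pick=1)
```

In this model, mismatches(buggy) reports the goose/mushrooms pair from the reproduction steps, while mismatches(fixed) is empty. A check like mismatches could serve as a regression test if the sent prompt and the stored prompt are both logged.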

Workarounds:

  • Use highly specific, detailed prompts (reduces but doesn’t eliminate the issue)
  • In theory, you could ask Gemini to use its vision tool rather than relying on text memory. However, that tool has a separate bug, which I’ve reported, that makes it completely unreliable as well
  • Consequently, the only reliable way for Gemini to know what image it generated is for the user to download the image and re-upload it into the Gemini chat

Reproducible: Yes, high probability with open-ended prompts

Test Conversation Links: