When performing an image-to-image (i2i) or image editing task with gemini-2.5-flash-image, the operation succeeds when the source image is provided as Base64-encoded `inlineData`. However, the operation fails silently (returns a text response without an image) when the same source image is provided via the Files API using a `fileData` part. Image _analysis_ tasks work correctly with the Files API.
This suggests that the internal image generation/editing module cannot resolve `fileUri` references from the Files API and requires the raw pixel data to be sent directly in the request.
Hi @h2.d2,

Welcome to the Google AI Forum!

Thank you for bringing this to our attention.

Could you please share the full payload details along with a sample of the code you are using? We would like to reproduce the issue.
Steps to Reproduce

Prerequisites:

- A valid Google AI API key.
- An image file (e.g., `my-image.png`) ready for upload.
- A script or tool to make requests to the `/v1beta/models/gemini-2.5-flash-image:generateContent` endpoint.
Scenario 1: Image Editing with inlineData (Works as Expected)

- Prepare the Request Body: Create a JSON payload where the source image is Base64-encoded and placed in an `inlineData` part. The image part should come before the text prompt in the `parts` array.

  ```json
  {
    "contents": [
      {
        "role": "user",
        "parts": [
          {
            "inlineData": {
              "mimeType": "image/png",
              "data": "[BASE64_ENCODED_IMAGE_STRING]"
            }
          },
          { "text": "Edit this image. Make the hair red." }
        ]
      }
    ]
  }
  ```

- Send the Request: Send a POST request with this body to the `generateContent` endpoint.
- Observe the Result: The API returns a response containing both a text part (e.g., "Here is the image with red hair") and an `inlineData` part with the newly generated image.

Expected Behavior: An edited image is successfully generated and returned.
Actual Behavior: This works correctly.
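The working Scenario 1 request can be sketched in Python as a small payload builder. This is a minimal sketch: the function name, the placeholder PNG bytes, and the prompt are illustrative, and the actual POST to the endpoint is left as a comment.

```python
import base64
import json

def build_inline_data_request(image_bytes: bytes, prompt: str) -> dict:
    """Build a generateContent body with the image embedded as Base64 inlineData."""
    return {
        "contents": [{
            "role": "user",
            "parts": [
                # Image part first, followed by the text prompt, as in Scenario 1.
                {"inlineData": {
                    "mimeType": "image/png",
                    "data": base64.b64encode(image_bytes).decode("ascii"),
                }},
                {"text": prompt},
            ],
        }]
    }

# Placeholder bytes standing in for the contents of my-image.png.
payload = build_inline_data_request(b"\x89PNG\r\n", "Edit this image. Make the hair red.")
body = json.dumps(payload)
# POST `body` to /v1beta/models/gemini-2.5-flash-image:generateContent
```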
Scenario 2: Image Editing with fileData (Fails)

- Upload the Image: First, upload the source image (`my-image.png`) to the Files API to obtain a `fileUri` (e.g., `files/xyz123`).
- Prepare the Request Body: Create a JSON payload that references the uploaded image via a `fileData` part. The image part should come before the text prompt.

  ```json
  {
    "contents": [
      {
        "role": "user",
        "parts": [
          {
            "fileData": {
              "mimeType": "image/png",
              "fileUri": "files/xyz123"
            }
          },
          { "text": "Edit this image. Make the hair red." }
        ]
      }
    ]
  }
  ```

- Send the Request: Send a POST request with this body to the `generateContent` endpoint.
- Observe the Result: The API returns a response containing only a text part (e.g., "Of course, here is the image with red hair."), but no `inlineData` part with the actual image is included. The image generation fails silently.

Expected Behavior: An edited image should be generated and returned, just like in Scenario 1.
Actual Behavior: The model acknowledges the request in text but fails to produce an image.
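The failing Scenario 2 request differs only in the image part, which can be sketched the same way. This is an illustrative builder, not a definitive client; the `fileUri` value is the example one from the steps above.

```python
import json

def build_file_data_request(file_uri: str, prompt: str) -> dict:
    """Build a generateContent body referencing a Files API upload by URI."""
    return {
        "contents": [{
            "role": "user",
            "parts": [
                # Same structure as Scenario 1, but fileData instead of inlineData.
                {"fileData": {"mimeType": "image/png", "fileUri": file_uri}},
                {"text": prompt},
            ],
        }]
    }

payload = build_file_data_request("files/xyz123", "Edit this image. Make the hair red.")
body = json.dumps(payload)
# POSTing `body` to the same generateContent endpoint reproduces the silent failure.
```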
Hypothesis
The `generateContent` endpoint successfully routes analysis tasks to a module that can access the Files API. However, for generation/editing tasks, it appears to route the request to a different internal module that does not have access to the Files API and requires the image data to be provided directly via `inlineData`.

This behavior is not explicitly documented, leading to the reasonable assumption that `fileData` should be a valid input for all multimodal tasks. A clarification in the official documentation, or a fix allowing the generation module to resolve `fileUri` references, would be highly beneficial.
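In the meantime, callers can detect the silent failure by checking whether the response actually contains an image part. A minimal sketch, assuming the standard `candidates`/`content`/`parts` shape of a `generateContent` JSON response; the sample responses below mirror the two scenarios above:

```python
def response_has_image(response_json: dict) -> bool:
    """Return True if any candidate part carries inline image data."""
    for candidate in response_json.get("candidates", []):
        for part in candidate.get("content", {}).get("parts", []):
            if "inlineData" in part:
                return True
    return False

# Text-only response, as observed in Scenario 2 (the silent failure):
text_only = {"candidates": [{"content": {"parts": [
    {"text": "Of course, here is the image with red hair."},
]}}]}

# Response with both text and an image part, as in Scenario 1:
with_image = {"candidates": [{"content": {"parts": [
    {"text": "Here is the image with red hair"},
    {"inlineData": {"mimeType": "image/png", "data": "[BASE64_IMAGE]"}},
]}}]}
```

A caller could raise or retry with `inlineData` when `response_has_image(...)` returns `False`.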