Title: Critical Inconsistency: Gemini 3 Pro Image (Nano Banana Pro) Editing Performance Disparity (Web UI vs. API)

Hello everyone,

I am experiencing a severe inconsistency when performing image-to-image edits using the gemini-3-pro-image-preview model (referred to as “Nano Banana Pro” in some contexts) via the API, compared to the perfect results I achieve in the web interface.

The issue centers on the model’s inability to “lock” the foreground subject when asked to replace only the background, which is crucial for product visualization and consistency.

1. The Core Problem: Foreground Drift

  • Goal: Replace ONLY the background with a detailed scene (e.g., Japanese garden).

  • Result (API): The foreground object (a complex hot tub, in my case) is slightly but consistently altered. It changes in subtle details, internal reflections, shadow intensity, or minor geometric shapes, making the output unusable for high-fidelity product rendering.

  • Result (Web UI): The web interface performs this task perfectly. The foreground subject is absolutely locked, and only the background is regenerated.

2. My Setup and Configuration

  • Model: gemini-3-pro-image-preview

  • Endpoint: https://generativelanguage.googleapis.com/v1beta/models/gemini-3-pro-image-preview:generateContent

I have maximized stability and troubleshooting by:

  • Configuration: Removed unsupported parameters (thinkingConfig, mediaResolution) that caused 400 Bad Request errors.

  • Stability Settings: Set the temperature to the lowest stable value (0.1 or 0.01) and imageSize to "4K".

The generationConfig used:

JSON

{
  "generationConfig": {
    "temperature": 0.01,
    "responseModalities": ["IMAGE"],
    "imageConfig": {
      "imageSize": "4K"
    }
  }
}
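Assembled into the full request body, the payload I send looks like this (a minimal stdlib-only Python sketch; `<PROMPT>` and `<BASE64_STRING>` are placeholders for my actual prompt and base64-encoded reference image):

```python
import json

def build_request(prompt: str, image_b64: str) -> dict:
    # Full JSON body for gemini-3-pro-image-preview:generateContent,
    # combining the inline image part with the generationConfig above.
    return {
        "contents": [{
            "parts": [
                {"text": prompt},
                {"inline_data": {"mime_type": "image/jpeg", "data": image_b64}},
            ]
        }],
        "generationConfig": {
            "temperature": 0.01,
            "responseModalities": ["IMAGE"],
            "imageConfig": {"imageSize": "4K"},
        },
    }

body = build_request("<PROMPT>", "<BASE64_STRING>")
print(json.dumps(body)[:60])
```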

3. Exhaustive Prompt Engineering

I used an extremely detailed, high-fidelity prompt to enforce foreground preservation, using multiple “DO NOT” clauses and explicit segmentation instructions, but the model still fails to adhere to the rigid lock instruction:

// --- SNIPPET OF MY PROMPT ---
REFERENCE (LOCKED — ABSOLUTE):
Use the uploaded image as the ONLY reference (Image A).
LOCK Image A completely.
DO NOT change ANYTHING from the smallest to the biggest detail.
DO NOT modify materials, textures, colors, lighting on the product, reflections, or shadows.
...
ONLY PERMITTED CHANGE (BACKGROUND ONLY):
Replace ONLY the plain white background with a realistic Japanese spring outdoor environment.
...
FAIL CONDITIONS:
If the angle, position, proportions, or any physical detail of the swim spa changes in any way, the result is INVALID.
// -----------------------------

4. Architectural Hypothesis and Request

My hypothesis: the API is failing to achieve the consistency seen on the website because it relies on text inference alone to define the foreground region, which is imprecise.

The web UI is likely performing automated semantic segmentation to generate a mask before calling the underlying image model.

My main question to the community and Google engineers is:

Is there an undocumented or separate dedicated image editing endpoint (e.g., inpaint or editImage) for the Gemini API that allows us to explicitly pass a foreground mask or use a specific parameter to activate the same high-fidelity foreground-locking logic utilized by the web interface?

We need a documented method to match the web UI’s performance for consistent, professional product editing via the API.
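Until such an endpoint is documented, one client-side workaround (my own sketch, not a Google-provided feature) is to compute a foreground mask ourselves and paste the original foreground pixels back over the API output, discarding any drift inside the masked region. Toy example with 2x2 "images" as nested pixel lists; a real pipeline would load images with PIL/OpenCV and derive the mask from a segmentation model:

```python
def composite_foreground(reference, generated, mask):
    # Wherever mask is 1 (foreground), keep the reference pixel;
    # elsewhere keep the generated pixel (the new background).
    return [
        [ref if m else gen for ref, gen, m in zip(r_row, g_row, m_row)]
        for r_row, g_row, m_row in zip(reference, generated, mask)
    ]

reference = [[10, 20], [30, 40]]   # original product shot
generated = [[11, 99], [31, 88]]   # API output: new background, drifted foreground
mask      = [[1, 0], [1, 0]]       # 1 = foreground (product), 0 = background

print(composite_foreground(reference, generated, mask))  # [[10, 99], [30, 88]]
```

This only works when the camera angle and framing are preserved, which is exactly what my prompt tries to enforce.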

Thank you for any insight or guidance on this critical consistency issue.


Hi! This is exactly what I'm seeing with gemini-2.5-flash-image: very consistent, better-quality results in Google AI Studio versus what I get through the API (with the same input data in the form of an image and a prompt). Setting generationConfig to match what we can set in Google AI Studio doesn't make much difference. This definitely looks like something is happening behind the scenes. And I get it: we don't need to know every Google secret. What I don't get is that AI Studio is advertised as a playground for devs to test models before implementing them through the API. Without this consistency, that doesn't make sense.

Hi @OpenYourEyes @Michal_Roki , Thanks for reaching out to us.

Could you please share the original reference image you are using along with the specific API-generated output that shows the drift?

@Sonali_Kumari1 this is the original reference I use: https://imgur.com/a/obBmACG (a picture of a swim spa product),

and this is the prompt: USE THE UPLOADED SWIM SPA IMAGE AS THE ONLY REFERENCE
Lock the appearance, camera angle, perspective, scale, proportions, and aspect ratio exactly as in the uploaded reference image.
Do not alter, crop, rotate, stretch, zoom, reframe, or reposition the swim spa in any way.
The swim spa’s shape, size, materials, colors, jets, seating, and exterior panels must remain 100% identical to the reference image.
Maintain the original lighting, reflections, and shadows on the swim spa itself.

ONLY PERMITTED CHANGE:
Replace the background with a sunny outdoor Japan setting (clear blue sky, bright daylight).
The new background must be naturally composited and must not affect the swim spa’s geometry, orientation, or scale.

Any change beyond the background replacement is strictly prohibited.

and this is the output: https://imgur.com/a/obBmACG (the image with the new background).

I already used a strict prompt, but the result is still inconsistent. I only experience this when I use the API, not when I use Nano Banana Pro on the website with multi-turn editing. The problem is that even though only one element is being changed, the result is inconsistent.

Hi @OpenYourEyes , I have used the image from imgur.com/a/obBmACG along with the prompt shared by you to generate an image using Nano Banana Pro. Please find the generated image below.

Reference image:

Generated Image:

Did you generate it using the Nano Banana Pro API via a raw HTTP request? That is where the inconsistency happens.

We’re seeing an inconsistency that appears to occur at the raw REST API level, not within any SDK or wrapper.

The issue happens when making a direct HTTP POST request to the Nano Banana Pro (Gemini image) API endpoint. Under the same request structure and parameters, the API intermittently returns different results, which suggests the behavior is originating from the underlying REST service rather than client-side handling.

This is the affected endpoint: https://generativelanguage.googleapis.com/v1beta/models/gemini-3-pro-image-preview:generateContent

I’m trying to automate things in Make (formerly Integromat), where I use the API endpoint to generate and edit an image with the prompt and reference image shown above, but the output isn’t consistent.

https://imgur.com/a/EZxfJxX - screenshot of the HTTP request Make module
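For anyone who wants to reproduce the same call outside Make, this is roughly what the module does, sketched with Python's stdlib urllib (the API key and body here are placeholders; the real body contains the prompt and base64 image parts):

```python
import json
import urllib.request

ENDPOINT = ("https://generativelanguage.googleapis.com/v1beta/models/"
            "gemini-3-pro-image-preview:generateContent")

def make_request(api_key: str, body: dict) -> urllib.request.Request:
    # Raw HTTP POST, equivalent to what the Make module sends.
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(body).encode("utf-8"),
        headers={"x-goog-api-key": api_key,
                 "Content-Type": "application/json"},
        method="POST",
    )

req = make_request("YOUR_API_KEY", {"contents": []})
# urllib.request.urlopen(req) would actually send it; omitted here.
print(req.get_method(), req.full_url)
```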

Hello,

Based on the description provided, we were unable to reproduce the issue. The model appears to successfully modify the provided image when using the REST API. We utilized the following cURL command (generate_image.sh) and input JSON (request.json) for our test:

response=$(curl -s -X POST "https://generativelanguage.googleapis.com/v1beta/models/gemini-3-pro-image-preview:generateContent" \
    -H "x-goog-api-key: $GOOGLE_API_KEY" \
    -H 'Content-Type: application/json' \
    -d @request.json)

echo "$response" | jq -r '.candidates[0].content.parts[] | select(has("inlineData")) | .inlineData.data' | base64 --decode > inputs/image4.jpeg

echo "Image successfully saved to inputs/image4.jpeg"
request.json:

{
  "contents": [{
    "parts": [
      {"text": "<PROMPT>"},
      {
        "inline_data": {
          "mime_type": "image/jpeg",
          "data": "<BASE64_STRING>"
        }
      }
    ]
  }],
  "generationConfig": {
    "temperature": 0.01,
    "responseModalities": ["Image"],
    "imageConfig": {
      "imageSize": "4K"
    }
  }
}

Input & Output Images:


If you continue to experience this issue, could you please provide further details, such as the specific method used to call the REST API and a comparison of the expected versus observed output?
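To make that expected-versus-observed comparison concrete, the drift can be quantified rather than eyeballed: measure the fraction of foreground pixels that differ between the reference and the API output. A toy sketch with grayscale pixels as nested lists (a real comparison would load both images with PIL/NumPy and use a proper foreground mask):

```python
def foreground_drift(reference, output, mask, tol=0):
    # Fraction of foreground pixels (mask == 1) whose value differs
    # from the reference by more than tol.
    changed = total = 0
    for r_row, o_row, m_row in zip(reference, output, mask):
        for r, o, m in zip(r_row, o_row, m_row):
            if m:
                total += 1
                if abs(r - o) > tol:
                    changed += 1
    return changed / total if total else 0.0

reference = [[10, 20], [30, 40]]
output    = [[10, 99], [35, 88]]   # one of two foreground pixels drifted
mask      = [[1, 0], [1, 0]]       # 1 = foreground, 0 = background
print(foreground_drift(reference, output, mask))  # 0.5
```

Reporting a number like this for the web UI output versus the API output would make the disparity easy to verify independently.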