Comparing OpenAI's Image Generation with Gemini

Hello,

I’m curious whether OpenAI’s image generation model is significantly more advanced than Gemini’s, or if I might not be using Gemini correctly. Could you clarify the differences or suggest best practices for using Gemini effectively?

OpenAI
======

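    import base64

    from openai import OpenAI

    client = OpenAI(api_key=OPEN_AI_KEY)  # OPEN_AI_KEY: your API key, defined elsewhere

    prompt = "Turn this image into Ghibli-style animation art"
    model = "gpt-image-1"

    # Send the source image to the image edit endpoint
    with open("input.jpg", "rb") as image_file:
        result = client.images.edit(
            model=model,
            image=image_file,
            prompt=prompt,
        )

    # The response carries the image as base64-encoded data
    image_base64 = result.data[0].b64_json
    image_bytes = base64.b64decode(image_base64)

    # Save the image to a file
    with open("output.jpg", "wb") as f:
        f.write(image_bytes)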



Gemini
======
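
    from io import BytesIO

    from google import genai
    from google.genai import types
    from PIL import Image

    client = genai.Client(api_key=API_KEY)  # API_KEY: your API key, defined elsewhere

    image = Image.open("input.jpg")

    prompt = "Turn this image into Ghibli-style animation art"

    # Ask for both text and image parts in the response
    response = client.models.generate_content(
        model='gemini-2.0-flash-exp-image-generation',
        contents=[prompt, image],
        config=types.GenerateContentConfig(
            response_modalities=['Text', 'Image']
        )
    )

    # The response can mix text parts and image parts
    for part in response.candidates[0].content.parts:
        if part.text:
            print(part.text)
        elif part.inline_data:
            result_image = Image.open(BytesIO(part.inline_data.data))
            # JPEG has no alpha channel, so convert before saving as .jpg
            result_image.convert('RGB').save('output.jpg')
            result_image.show()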
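
One thing worth ruling out on the Gemini side: the loop above writes every image part to the same `output.jpg`, and the model may also return text only. Below is a minimal sketch of a helper that saves each image part to its own file, using the part's `inline_data.mime_type` to pick the extension. The helper name `save_image_parts`, the `EXTENSIONS` mapping, and the file naming scheme are my own assumptions, not part of the SDK.

```python
from pathlib import Path

# Assumed mapping from MIME type to file extension; extend as needed.
EXTENSIONS = {"image/png": ".png", "image/jpeg": ".jpg", "image/webp": ".webp"}


def save_image_parts(parts, stem="output", out_dir="."):
    """Write every inline image part to its own file and return the paths.

    `parts` is expected to look like response.candidates[0].content.parts:
    each item may have an `inline_data` attribute with `mime_type` and `data`.
    """
    saved = []
    for part in parts:
        blob = getattr(part, "inline_data", None)
        if blob is None or not blob.data:
            continue  # text-only part, nothing to save
        ext = EXTENSIONS.get(blob.mime_type, ".bin")
        path = Path(out_dir) / f"{stem}_{len(saved)}{ext}"
        path.write_bytes(blob.data)
        saved.append(path)
    return saved
```

Calling `save_image_parts(response.candidates[0].content.parts)` returns the list of written files. An empty list means the model answered with text only, which is useful to know before judging the image quality.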

OpenAI output (good): image on Imgur

Gemini output (bad): image on Imgur

@Yan_Cheng_Cheok,

Welcome to the community, and thank you for reaching out.

It's not that one model is better or worse than the other. Models perform well on the kind of data they were trained or fine-tuned on.

In this case, I believe the OpenAI model was fine-tuned on Ghibli-style animation and art, whereas Gemini is not fine-tuned for making art. It is intended for general, realistic image generation.

If you want a specific style, you can try open models that are fine-tuned for the style you need.

Note: Check out Hugging Face or Civitai for these fine-tuned models. They will usually outperform any API for that specific style, but will underperform on everything else.
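
If you go the open-model route, here is a rough sketch of image-to-image styling with the `diffusers` library. It assumes a Stable Diffusion-style checkpoint; the `model_id` you pass in is whatever fine-tuned model you pick, and the helper names `fit_size` and `stylize`, along with the default `strength` and `max_side`, are illustrative choices of mine rather than recommended settings.

```python
def fit_size(width, height, max_side=768, multiple=8):
    """Scale (width, height) so the longer side is at most max_side,
    then round each side down to a multiple of `multiple`.

    Many Stable Diffusion-style pipelines expect dimensions divisible by 8.
    """
    scale = min(1.0, max_side / max(width, height))
    new_w = max(multiple, int(width * scale) // multiple * multiple)
    new_h = max(multiple, int(height * scale) // multiple * multiple)
    return new_w, new_h


def stylize(input_path, output_path, model_id, prompt, strength=0.6):
    """Restyle an image with a fine-tuned image-to-image model."""
    # Heavy dependencies are imported here so the size helper above
    # can be used without torch or diffusers installed.
    import torch
    from diffusers import AutoPipelineForImage2Image
    from PIL import Image

    device = "cuda" if torch.cuda.is_available() else "cpu"
    dtype = torch.float16 if device == "cuda" else torch.float32
    pipe = AutoPipelineForImage2Image.from_pretrained(
        model_id, torch_dtype=dtype
    ).to(device)

    init = Image.open(input_path).convert("RGB")
    init = init.resize(fit_size(*init.size))

    # strength controls how far the result may drift from the input image
    result = pipe(prompt=prompt, image=init, strength=strength).images[0]
    result.save(output_path)


# Example call; "your-org/your-style-model" is a hypothetical placeholder:
# stylize("input.jpg", "output.png", "your-org/your-style-model",
#         "Ghibli-style animation art")
```

A lower `strength` keeps more of the original composition, while a higher one lets the style take over, so it is worth trying a few values.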
