Safety Filters not working for gemini-2.5-flash-image

I’ve noticed my users are creating borderline explicit content using the gemini-2.5-flash-image API. Here’s my code:

api_response = client.models.generate_content(
    model="gemini-2.5-flash-image",
    # model="gemini-2.0-flash",
    contents=[prompt, pil_image],
    config=types.GenerateContentConfig(
        safety_settings=[
            types.SafetySetting(
                category=types.HarmCategory.HARM_CATEGORY_HARASSMENT,
                threshold=types.HarmBlockThreshold.BLOCK_LOW_AND_ABOVE,
            ),
            types.SafetySetting(
                category=types.HarmCategory.HARM_CATEGORY_HATE_SPEECH,
                threshold=types.HarmBlockThreshold.BLOCK_LOW_AND_ABOVE,
            ),
            types.SafetySetting(
                category=types.HarmCategory.HARM_CATEGORY_SEXUALLY_EXPLICIT,
                threshold=types.HarmBlockThreshold.BLOCK_LOW_AND_ABOVE,
            ),
            types.SafetySetting(
                category=types.HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT,
                threshold=types.HarmBlockThreshold.BLOCK_LOW_AND_ABOVE,
            ),
        ]
    ),
)

When the model is gemini-2.5-flash-image I’m able to make explicit content even with these filters. The safety ratings response is empty. However, when I use gemini-2.0-flash with the exact same input, the filters seem to be working perfectly:
[[SafetyRating(
    blocked=True,
    category=<HarmCategory.HARM_CATEGORY_HATE_SPEECH: 'HARM_CATEGORY_HATE_SPEECH'>,
    probability=<HarmProbability.LOW: 'LOW'>
), SafetyRating(
    blocked=True,
    category=<HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT: 'HARM_CATEGORY_DANGEROUS_CONTENT'>,
    probability=<HarmProbability.HIGH: 'HIGH'>
), SafetyRating(
    blocked=True,
    category=<HarmCategory.HARM_CATEGORY_HARASSMENT: 'HARM_CATEGORY_HARASSMENT'>,
    probability=<HarmProbability.MEDIUM: 'MEDIUM'>
), SafetyRating(
    category=<HarmCategory.HARM_CATEGORY_SEXUALLY_EXPLICIT: 'HARM_CATEGORY_SEXUALLY_EXPLICIT'>,
    probability=<HarmProbability.NEGLIGIBLE: 'NEGLIGIBLE'>
)]]

Is there something I’m doing wrong? I find it hard to believe, but do safety filters not work for gemini-2.5-flash-image? I really don’t want to feed the input into gemini-2.0-flash, check the safety filters, and then make the image with gemini-2.5-flash-image, since that would double my cost and latency. Any suggestions?
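
For comparing the two models, it may help to dump every place a response can carry safety information, not only the candidate ratings. The following is a minimal sketch, not a confirmed diagnosis: summarize_safety is a hypothetical helper, and it assumes the response exposes prompt_feedback.block_reason, candidates[i].finish_reason and candidates[i].safety_ratings. It reads them defensively with getattr so a missing or None field does not raise, and it uses a stand-in object so it runs without an API call.

```python
from types import SimpleNamespace


def summarize_safety(response):
    """Collect every place a response might carry safety information."""
    summary = {"prompt_block_reason": None, "candidates": []}

    # Prompt-level feedback is assumed to be set when the input itself was blocked.
    feedback = getattr(response, "prompt_feedback", None)
    if feedback is not None:
        summary["prompt_block_reason"] = getattr(feedback, "block_reason", None)

    # candidates may be None or empty when nothing was generated.
    for candidate in getattr(response, "candidates", None) or []:
        ratings = getattr(candidate, "safety_ratings", None) or []
        summary["candidates"].append({
            "finish_reason": getattr(candidate, "finish_reason", None),
            "ratings": [
                (r.category, r.probability, bool(getattr(r, "blocked", False)))
                for r in ratings
            ],
        })
    return summary


# Stand-in object shaped like the empty-ratings case described above
# (hypothetical data, not a real API response).
fake_response = SimpleNamespace(
    prompt_feedback=None,
    candidates=[SimpleNamespace(finish_reason="STOP", safety_ratings=None)],
)
print(summarize_safety(fake_response))
```

In real use, api_response from the call above would be passed in place of fake_response, once per model, and the two summaries compared side by side.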

Hi @Yash0314,

Can you try setting threshold=types.HarmBlockThreshold.BLOCK_ONLY_HIGH or threshold=types.HarmBlockThreshold.BLOCK_MEDIUM_AND_ABOVE for sexually explicit content, to see whether the generation changes or you start receiving ratings? You want to block low and above, but testing higher thresholds might help identify whether the filter is working at all, just not at the desired sensitivity.
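
The threshold comparison can be scripted so that each setting is tried with the same input. This is only a rough sketch: sweep_thresholds and fake_generate are hypothetical names, and the stub stands in for the real API call so the example runs offline. The three threshold names are the ones mentioned in this thread.

```python
from types import SimpleNamespace

# Thresholds to compare, from least to most strict.
THRESHOLDS = ["BLOCK_ONLY_HIGH", "BLOCK_MEDIUM_AND_ABOVE", "BLOCK_LOW_AND_ABOVE"]


def sweep_thresholds(generate, thresholds=THRESHOLDS):
    """Call generate(threshold_name) per threshold and record what came back."""
    results = {}
    for name in thresholds:
        response = generate(name)
        candidates = getattr(response, "candidates", None) or []
        results[name] = {
            "finish_reasons": [getattr(c, "finish_reason", None) for c in candidates],
            "rating_count": sum(
                len(getattr(c, "safety_ratings", None) or []) for c in candidates
            ),
        }
    return results


# Hypothetical stub so the sketch runs offline. In real use, `generate` would
# wrap client.models.generate_content and set
# threshold=getattr(types.HarmBlockThreshold, name) on the
# HARM_CATEGORY_SEXUALLY_EXPLICIT setting.
def fake_generate(name):
    return SimpleNamespace(
        candidates=[SimpleNamespace(finish_reason="STOP", safety_ratings=None)]
    )


for name, outcome in sweep_thresholds(fake_generate).items():
    print(name, outcome)
```

If the finish reasons and rating counts are identical across all three thresholds, that would suggest the setting has no visible effect for this model, rather than being a sensitivity issue.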

Hope this helps you narrow down the issue.
