Safety Filters not working for gemini-2.5-flash-image

Yash0314 · October 31, 2025, 5:37pm

I’ve noticed my users are creating borderline explicit content using the gemini-2.5-flash-image API. Here’s my code:

api_response = client.models.generate_content(
    model="gemini-2.5-flash-image",
    # model="gemini-2.0-flash",
    contents=[prompt, pil_image],
    config=types.GenerateContentConfig(
        safety_settings=[
            types.SafetySetting(
                category=types.HarmCategory.HARM_CATEGORY_HARASSMENT,
                threshold=types.HarmBlockThreshold.BLOCK_LOW_AND_ABOVE,
            ),
            types.SafetySetting(
                category=types.HarmCategory.HARM_CATEGORY_HATE_SPEECH,
                threshold=types.HarmBlockThreshold.BLOCK_LOW_AND_ABOVE,
            ),
            types.SafetySetting(
                category=types.HarmCategory.HARM_CATEGORY_SEXUALLY_EXPLICIT,
                threshold=types.HarmBlockThreshold.BLOCK_LOW_AND_ABOVE,
            ),
            types.SafetySetting(
                category=types.HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT,
                threshold=types.HarmBlockThreshold.BLOCK_LOW_AND_ABOVE,
            ),
        ],
    ),
)

When the model is gemini-2.5-flash-image, I’m able to make explicit content even with these filters, and the safety ratings in the response are empty. However, when I use gemini-2.0-flash with the exact same input, the filters seem to work perfectly:
[[SafetyRating(
    blocked=True,
    category=<HarmCategory.HARM_CATEGORY_HATE_SPEECH: 'HARM_CATEGORY_HATE_SPEECH'>,
    probability=<HarmProbability.LOW: 'LOW'>
), SafetyRating(
    blocked=True,
    category=<HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT: 'HARM_CATEGORY_DANGEROUS_CONTENT'>,
    probability=<HarmProbability.HIGH: 'HIGH'>
), SafetyRating(
    blocked=True,
    category=<HarmCategory.HARM_CATEGORY_HARASSMENT: 'HARM_CATEGORY_HARASSMENT'>,
    probability=<HarmProbability.MEDIUM: 'MEDIUM'>
), SafetyRating(
    category=<HarmCategory.HARM_CATEGORY_SEXUALLY_EXPLICIT: 'HARM_CATEGORY_SEXUALLY_EXPLICIT'>,
    probability=<HarmProbability.NEGLIGIBLE: 'NEGLIGIBLE'>
)]]
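When comparing the two models, it may help to dump every place the response can carry safety information, not just one field. The helper below is a minimal sketch: `collect_safety_info` is a hypothetical name, and it assumes the response exposes `prompt_feedback` (with `block_reason` and `safety_ratings`) and `candidates` (each with `finish_reason` and `safety_ratings`), which is how the google-genai Python SDK appears to shape it. It uses `getattr` so that a missing or `None` field shows up as empty instead of raising.

```python
from types import SimpleNamespace


def collect_safety_info(response):
    """Gather safety-related fields from a generate_content response.

    Attribute names are assumptions about the response shape; any field
    that is missing or None is reported as None or an empty list.
    """
    feedback = getattr(response, "prompt_feedback", None)
    info = {
        "prompt_block_reason": getattr(feedback, "block_reason", None),
        "prompt_ratings": list(getattr(feedback, "safety_ratings", None) or []),
        "candidates": [],
    }
    for candidate in getattr(response, "candidates", None) or []:
        info["candidates"].append({
            "finish_reason": getattr(candidate, "finish_reason", None),
            "ratings": list(getattr(candidate, "safety_ratings", None) or []),
        })
    return info


# Stand-in object shaped like a response with no ratings, for a dry run
# that needs no API key. Replace it with the real api_response.
fake = SimpleNamespace(
    prompt_feedback=None,
    candidates=[SimpleNamespace(finish_reason="STOP", safety_ratings=None)],
)
print(collect_safety_info(fake))
```

Running this on the `api_response` from each model would show whether gemini-2.5-flash-image returns no ratings anywhere, or whether they are reported in a different field than the one being printed.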

Is there something I’m doing wrong? I find it hard to believe, but do safety filters not work for gemini-2.5-flash-image? I really don’t want to feed the input into gemini-2.0-flash, check the safety ratings, and only then generate the image with gemini-2.5-flash-image, since that would double my cost and latency. Any suggestions?

Krish_Varnakavi1 · December 4, 2025, 7:25am

Hi @Yash0314,

Can you try setting threshold=types.HarmBlockThreshold.BLOCK_ONLY_HIGH or threshold=types.HarmBlockThreshold.BLOCK_MEDIUM_AND_ABOVE for sexually explicit content and check whether the generation changes or you start receiving ratings? Although you want to block low and above, testing higher thresholds might show whether the filter is working at all, just not at the desired sensitivity.
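The comparison across thresholds could be scripted roughly as follows. This is only a sketch: `sweep_thresholds` and the `generate` callable are hypothetical names, and `generate` is assumed to wrap the `client.models.generate_content` call from the original post with the given threshold applied to the sexually explicit category (one possible wrapper is shown in the comments).

```python
from types import SimpleNamespace

THRESHOLDS = ("BLOCK_ONLY_HIGH", "BLOCK_MEDIUM_AND_ABOVE", "BLOCK_LOW_AND_ABOVE")


def sweep_thresholds(generate, thresholds=THRESHOLDS):
    """Call generate(threshold_name) once per threshold and summarize each response."""
    summary = {}
    for name in thresholds:
        response = generate(name)
        candidates = getattr(response, "candidates", None) or []
        summary[name] = {
            "num_candidates": len(candidates),
            "finish_reasons": [getattr(c, "finish_reason", None) for c in candidates],
            # True if at least one candidate came back with non-empty ratings.
            "has_ratings": any(getattr(c, "safety_ratings", None) for c in candidates),
        }
    return summary


# Hypothetical wrapper around the real call (assumes client, types, prompt
# and pil_image exist as in the original post):
#
# def generate(threshold_name):
#     return client.models.generate_content(
#         model="gemini-2.5-flash-image",
#         contents=[prompt, pil_image],
#         config=types.GenerateContentConfig(safety_settings=[
#             types.SafetySetting(
#                 category=types.HarmCategory.HARM_CATEGORY_SEXUALLY_EXPLICIT,
#                 threshold=types.HarmBlockThreshold[threshold_name],
#             )]))


def fake_generate(threshold_name):
    """Offline stand-in so the sweep can be dry-run without an API key."""
    return SimpleNamespace(
        candidates=[SimpleNamespace(finish_reason="STOP", safety_ratings=[])]
    )


for name, row in sweep_thresholds(fake_generate).items():
    print(name, row)
```

If `has_ratings` stays false and the finish reasons never change across all three thresholds, that would suggest the setting is not being applied at all, not that the sensitivity is merely too low.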

Hope this helps narrow down the issue.
