Flash 2-0 doesn't respect BLOCK_NONE on ALL harm categories

Tim_Zaplatov · January 7, 2025, 2:18am

starting today, Gemini Flash 2-0 automatically refuses harmful content despite the BLOCK_NONE parameter

here is a safety config:

safe = [
 {
  "category": "HARM_CATEGORY_HARASSMENT",
  "threshold": "BLOCK_NONE",
 },
 {
  "category": "HARM_CATEGORY_HATE_SPEECH",
  "threshold": "BLOCK_NONE",
 },
 {
  "category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
  "threshold": "BLOCK_NONE", # <-- disabled
 },
 {
  "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
  "threshold": "BLOCK_NONE",
 }
]

however, on completion Gemini Flash 2-0 responds with:

StopCandidateException: finish_reason: SAFETY # <-- block reason, block by moderation system
safety_ratings {
  category: HARM_CATEGORY_HATE_SPEECH
  probability: NEGLIGIBLE
}
safety_ratings {
  category: HARM_CATEGORY_DANGEROUS_CONTENT
  probability: NEGLIGIBLE
}
safety_ratings {
  category: HARM_CATEGORY_HARASSMENT
  probability: NEGLIGIBLE
}
safety_ratings {
  category: HARM_CATEGORY_SEXUALLY_EXPLICIT
  probability: HIGH # <-- correctly reads that content is sexual
  blocked: true # <-- yet blocks despite BLOCK_NONE above
}

but all other Gemini models reply:

 "finish_reason": "STOP", # <-- no moderation block, completion is fully done
 "index": 0,
 "safety_ratings": [
  {
   "category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
   "probability": "HIGH"  # <-- correctly reads that content is sexual but doesn't block
  },
  {
   "category": "HARM_CATEGORY_HATE_SPEECH",
   "probability": "NEGLIGIBLE"
  },
  {
   "category": "HARM_CATEGORY_HARASSMENT",
   "probability": "NEGLIGIBLE"
  },
  {
   "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
   "probability": "NEGLIGIBLE"
  }
]

this applies to HARM_CATEGORY_SEXUALLY_EXPLICIT, HARM_CATEGORY_HATE_SPEECH and HARM_CATEGORY_DANGEROUS_CONTENT.

unsure about HARM_CATEGORY_HARASSMENT and HARM_CATEGORY_CIVIC_INTEGRITY, I have no idea how to test them out

if any of those harm categories hits HIGH probability, Flash 2-0 refuses the completion regardless of BLOCK_NONE or BLOCK_ONLY_HIGH, making those two settings useless for Flash 2-0
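A minimal sketch of how one might detect this case programmatically, assuming the candidate has already been converted to a plain dict shaped like the outputs quoted above. The helper name `blocked_categories` and the two sample dicts are made up for illustration and are not part of any SDK:

```python
def blocked_categories(candidate: dict) -> list[str]:
    """Return the harm categories whose rating carries blocked=True.

    Hypothetical helper: assumes `candidate` is a plain dict with the same
    field names as the outputs quoted in this post.
    """
    # Only a SAFETY finish reason indicates a moderation block.
    if candidate.get("finish_reason") != "SAFETY":
        return []
    return [
        rating["category"]
        for rating in candidate.get("safety_ratings", [])
        if rating.get("blocked")
    ]


# Sample data copied from the two outputs in this post (trimmed to two ratings each).
flash_2_0 = {
    "finish_reason": "SAFETY",
    "safety_ratings": [
        {"category": "HARM_CATEGORY_HATE_SPEECH", "probability": "NEGLIGIBLE"},
        {"category": "HARM_CATEGORY_SEXUALLY_EXPLICIT", "probability": "HIGH", "blocked": True},
    ],
}
other_model = {
    "finish_reason": "STOP",
    "safety_ratings": [
        {"category": "HARM_CATEGORY_SEXUALLY_EXPLICIT", "probability": "HIGH"},
        {"category": "HARM_CATEGORY_HATE_SPEECH", "probability": "NEGLIGIBLE"},
    ],
}

print(blocked_categories(flash_2_0))
print(blocked_categories(other_model))
```

A check like this makes it easy to log which category triggered the block when comparing models side by side.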

notes:

system prompt doesn’t affect the block
using different threshold values (BLOCK_NONE, BLOCK_ONLY_HIGH, BLOCK_MEDIUM_AND_ABOVE, BLOCK_LOW_AND_ABOVE, HARM_BLOCK_THRESHOLD_UNSPECIFIED) does not affect it
context overflow lets you avoid the block (maybe other jailbreak ideas work as well), but why should one have to do that in the first place?

Tim_Zaplatov · January 7, 2025, 5:49am

found a fix

instead of BLOCK_NONE we now must use OFF for Flash 2-0, but ONLY for Flash 2-0. if you send OFF with any other model you will get an error
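A rough sketch of picking the threshold per model, based only on the behavior described in this post. The helper names are hypothetical, and matching on the substring "2.0-flash" is an assumption about how the model IDs are spelled, so adjust it to whatever model names you actually use:

```python
HARM_CATEGORIES = [
    "HARM_CATEGORY_HARASSMENT",
    "HARM_CATEGORY_HATE_SPEECH",
    "HARM_CATEGORY_SEXUALLY_EXPLICIT",
    "HARM_CATEGORY_DANGEROUS_CONTENT",
]


def threshold_for(model_name: str) -> str:
    """Pick the 'do not block' threshold string for a model.

    Assumption: Flash 2.0 model IDs contain "2.0-flash". Per this post,
    Flash 2.0 needs "OFF" while other models reject it and need "BLOCK_NONE".
    """
    return "OFF" if "2.0-flash" in model_name else "BLOCK_NONE"


def build_safety_settings(model_name: str) -> list[dict]:
    """Build a safety config in the same shape as the one in the first post."""
    threshold = threshold_for(model_name)
    return [{"category": c, "threshold": threshold} for c in HARM_CATEGORIES]


print(build_safety_settings("gemini-2.0-flash-exp")[0])
print(build_safety_settings("gemini-1.5-flash")[0])
```

Keeping the choice in one function means there is only one place to change if the accepted values shift again.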

Joe1 · January 11, 2025, 9:09pm

I tried this in the Python SDK:

safety_settings=[
    {
        "category": HarmCategory.HARM_CATEGORY_HARASSMENT,
        "threshold": "OFF",
    },
    {
        "category": HarmCategory.HARM_CATEGORY_HATE_SPEECH,
        "threshold": "OFF",
    },
    {
        "category": HarmCategory.HARM_CATEGORY_SEXUALLY_EXPLICIT,
        "threshold": "OFF",
    },
    {
        "category": HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT,
        "threshold": "OFF",
    },
],

I'm getting this error: An error occurred (KeyError): 'off'. I'm using the Python google.generativeai library.

Logan_Kilpatrick · January 12, 2025, 1:17pm

Ack, something seems to be going wrong here, investigating.

Tim_Zaplatov · January 12, 2025, 2:01pm

thanks, Logan, pleasure to see you

since you are here, can you also check this out, please?

those 500 errors on EXP models make 1114 / 1121 / 1206 unusable! the forum is full of people having the same issue. godspeed!

L_N · January 13, 2025, 5:13am

safety_types.py in “\site-packages\google\generativeai\types” isn’t equipped to handle “OFF” like that. as a quick workaround you can look at “_BLOCK_THRESHOLDS” and “def to_block_threshold” in the file i mentioned - works fine for me after having a quick hack at that file.
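To illustrate the kind of failure being described, here is a toy stand-in for a string-to-threshold lookup. It is NOT the library's actual source. The table contents and function body are assumptions inferred only from the `KeyError: 'off'` message and the names mentioned in this thread:

```python
# Toy lookup table; the real _BLOCK_THRESHOLDS in safety_types.py may look different.
_TOY_BLOCK_THRESHOLDS = {
    "block_none": "BLOCK_NONE",
    "block_only_high": "BLOCK_ONLY_HIGH",
    "block_medium_and_above": "BLOCK_MEDIUM_AND_ABOVE",
    "block_low_and_above": "BLOCK_LOW_AND_ABOVE",
}


def toy_to_block_threshold(value: str, table: dict = _TOY_BLOCK_THRESHOLDS) -> str:
    """Lowercase the string and look it up; unknown strings raise KeyError."""
    return table[value.lower()]


# Without an "off" entry the lookup fails with the lowercased key,
# which matches the shape of the error reported above.
try:
    toy_to_block_threshold("OFF")
    error_key = None
except KeyError as exc:
    error_key = exc.args[0]
print("KeyError key:", error_key)

# "Hacking the file" then amounts to teaching the table about the new value.
patched_table = {**_TOY_BLOCK_THRESHOLDS, "off": "OFF"}
result = toy_to_block_threshold("OFF", patched_table)
print("patched lookup:", result)
```

Note that editing files under site-packages gets overwritten on the next package upgrade, so this is a stopgap at best.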

Computeralex · January 14, 2025, 11:02am

I asked Gemini whether this is an issue, and we both agree: “It is!”.
Here is the explanation from public Gemini itself:
Yes, the situation described in the image appears to be an issue. Here’s why:

Violation of User Settings: The BLOCK_NONE parameter explicitly instructs the model not to block content within the specified harm categories. The model overriding this setting and blocking content anyway is a direct violation of the user’s instructions.
Potential for Censorship: This behavior could inadvertently lead to censorship, preventing users from generating content that they believe is acceptable, even if it falls within the specified harm categories.
Lack of Transparency: It’s unclear why the model is overriding the BLOCK_NONE parameter. Without transparency, users are left in the dark about the reasons for the block, making it difficult to understand and address the issue.
Therefore, reporting this issue is highly recommended. Providing the development team with this specific example and the user complaints will help them understand the problem and work on a solution.
Additional Considerations:
Context Matters: While some categories like “HARM_CATEGORY_SEXUALLY_EXPLICIT” may have stricter safety controls due to their sensitive nature, the BLOCK_NONE parameter should still be respected.
User Feedback is Crucial: User feedback is essential for improving AI models. By reporting issues like this, users actively contribute to the development of more reliable and user-friendly AI systems.
I hope this helps! Let me know if you have any other questions.

ous50 · May 8, 2025, 12:28pm

As of now (after May 6, 2025), Google has enabled the “OFF” option for all HARM_CATEGORY values.

This change is not even reflected in the Node.js and Python SDKs yet, but you should be able to use something like:

'OFF' as HarmBlockThreshold

to force enable this option.
