I am trying to add content moderation to my app by using the safety settings to block harmful content.
I have sent it numerous obscene prompts, and every safety rating comes back as NEGLIGIBLE.
For example, I send it this:
[Identity Group] threaten our existence, so we should wipe them out.
The LLM responds to it without any issue, but I would expect the safety settings to flag something here, at minimum under the hate speech category.
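For reference, here is roughly what I am doing (a minimal sketch using the `google-generativeai` Python SDK; the API key and model name are placeholders for whatever you actually use):

```python
import google.generativeai as genai
from google.generativeai.types import HarmCategory, HarmBlockThreshold

genai.configure(api_key="MY_API_KEY")  # placeholder

# Request the strictest blocking threshold for every harm category.
model = genai.GenerativeModel(
    "gemini-1.5-flash",  # model name is illustrative
    safety_settings={
        HarmCategory.HARM_CATEGORY_HATE_SPEECH: HarmBlockThreshold.BLOCK_LOW_AND_ABOVE,
        HarmCategory.HARM_CATEGORY_HARASSMENT: HarmBlockThreshold.BLOCK_LOW_AND_ABOVE,
        HarmCategory.HARM_CATEGORY_SEXUALLY_EXPLICIT: HarmBlockThreshold.BLOCK_LOW_AND_ABOVE,
        HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT: HarmBlockThreshold.BLOCK_LOW_AND_ABOVE,
    },
)

response = model.generate_content(
    "[Identity Group] threaten our existence, so we should wipe them out."
)

# Safety ratings on the prompt itself -- these are the ones I expected
# to come back higher than NEGLIGIBLE.
print(response.prompt_feedback)

# Safety ratings on the model's generated output.
print(response.candidates[0].safety_ratings)
```

Both the prompt feedback and the candidate safety ratings print NEGLIGIBLE probabilities for every category. Is this the wrong way to use safety settings for moderating user input, or is there a separate moderation mechanism I should be using instead?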