Why expose safety controls in AI Studio if they don’t actually work? This feels like a broken feature

I need to raise a very serious usability issue with Google AI Studio.

The platform exposes safety configuration controls in the interface. Users can open the "Run safety settings" panel and explicitly adjust categories like:

  • Harassment

  • Hate

  • Sexually Explicit

  • Dangerous Content

These controls clearly imply that the user can change moderation strictness depending on their use case.
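For context, these four UI categories correspond to the `HarmCategory` values accepted in the Gemini API's `safetySettings` request field. A minimal sketch of what the UI presumably sends, built as plain dicts in the REST payload shape (the enum strings are from the public API docs; the helper itself is hypothetical, not client code):

```python
# The four UI categories and their documented HarmCategory counterparts.
SAFETY_CATEGORIES = [
    "HARM_CATEGORY_HARASSMENT",         # Harassment
    "HARM_CATEGORY_HATE_SPEECH",        # Hate
    "HARM_CATEGORY_SEXUALLY_EXPLICIT",  # Sexually Explicit
    "HARM_CATEGORY_DANGEROUS_CONTENT",  # Dangerous Content
]

def build_safety_settings(threshold: str = "BLOCK_NONE") -> list[dict]:
    """Build a safetySettings payload; "Off" in the UI presumably maps to
    the BLOCK_NONE threshold. Hypothetical helper for illustration only."""
    return [{"category": c, "threshold": threshold} for c in SAFETY_CATEGORIES]

settings = build_safety_settings()
```

The complaint in this thread is precisely that sending such a payload with every category at `BLOCK_NONE` still yields blocked responses.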

However, in practice these settings appear to have little or no real effect.

Even when every category is set to Off, the system still frequently returns:

⚠️ Content blocked

So this leads to a very direct question:

What is the actual purpose of these controls if they don’t meaningfully affect moderation behavior?

From a user perspective this looks like a feature that exists in the interface but does not actually function.

That is extremely frustrating.

Many of us use AI Studio specifically for:

  • creative writing

  • long-form storytelling

  • roleplay scenarios

  • narrative experimentation

In these contexts users need predictable behavior from the tools they are given.

If the UI exposes safety configuration but the backend still overrides everything regardless of those settings, then the controls become misleading.

Users spend time adjusting them expecting a change in behavior, but the system continues blocking responses exactly the same way.

At that point the safety panel starts to feel less like a real configuration tool and more like a cosmetic UI element.

And that raises serious concerns:

  • Do these safety sliders actually influence the model output?

  • Are there additional hidden filters overriding them?

  • If moderation is enforced regardless of these settings, why expose them at all?

Users invest time and sometimes money into building workflows around AI Studio.

Providing configuration options that don’t actually change system behavior undermines trust in the platform.

If the controls are supposed to work, they need to work.

If they are not intended to affect moderation in the way the UI suggests, that needs to be clearly documented.

As it stands, the behavior makes the safety configuration feel unreliable and misleading.

A clear explanation from the AI Studio team would be appreciated.

7 Likes

Please pay attention to this, and please share your examples (how it behaved before versus how it works now), since it seems the AI Studio team does not grasp the essence of the problem.

2 Likes

The safety filters we show in the UI are a very old set of safety filters. The reason they exist is to serve as a fallback mechanism for early versions of Gemini (like Gemini 1). They are no longer applicable in most cases, which is why we have them under the advanced settings button.

But the system itself remained and was not deleted. Why leave something that's no longer relevant? I don't want to offend anyone, but it seems to me that before the changes to the safety system, creative text generation worked satisfactorily. Yes, there were some problems, but users had grown accustomed to them since the platform launched. Why break something that was already working fine? If the problem is computing power or something else, say so, rather than implying that creative writing doesn't matter. In fact, creative writing demonstrates an AI's ability to reason, think logically, work with long context, and collaborate with a person.

5 Likes

Then please make safety filters that we can really influence, so that we can adjust the boundaries of acceptable content for ourselves

7 Likes

I did some digging into what those settings actually do. If my research is accurate, they are essentially ignorable, and you don't really need to care about them.

They were just watchdog-style guardrails that used independent classifier models to trigger a kill switch. It sounds like they don't have (and never have had) any impact on the output itself.

It looks like the "content blocked" result we observe comes from a wholly separate mechanism that is deliberately not user-configurable. I have noticed the model occasionally trips it while running complex coding prompts, particularly when calling the code execution sandbox tool.
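For what it's worth, the two mechanisms can usually be told apart in the raw API response. A minimal sketch, assuming the response shape documented for the Gemini REST API (`promptFeedback.blockReason` versus `candidates[].finishReason`; the exact set of reason strings may vary by model version):

```python
def classify_block(response: dict) -> str:
    """Distinguish a blocked *prompt* from a cut-off *response* in a
    Gemini-style response dict. Field names follow the public REST docs."""
    # promptFeedback.blockReason is set when the input itself was refused.
    block_reason = response.get("promptFeedback", {}).get("blockReason")
    if block_reason:
        return f"prompt blocked: {block_reason}"
    # A candidate whose finishReason is SAFETY or PROHIBITED_CONTENT was
    # cut off mid-generation, a separate mechanism from the prompt check.
    for cand in response.get("candidates", []):
        if cand.get("finishReason") in ("SAFETY", "PROHIBITED_CONTENT"):
            return f"response blocked: {cand['finishReason']}"
    return "ok"
```

If the UI sliders only ever influenced one of these two paths, that would explain why turning them Off still produces "Content blocked" from the other.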

What you're asking for is also somewhat technically impractical. There aren't many ways to bolt on user-configurable settings like that other than through system instructions, which you already have; that would be the place to add your constraints. It is, however, quite difficult to write system-instruction-level rules that don't degrade your results with undesirable side effects.
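To illustrate the system-instruction route suggested above, here is a hedged sketch of a `generateContent`-style request payload built as plain dicts (the `systemInstruction` field name is from the public REST docs; the helper and the example constraints are made up for illustration):

```python
def build_request(prompt: str, constraints: list[str]) -> dict:
    """Assemble a Gemini-style generateContent payload where the user's
    content-boundary rules go into systemInstruction, not safetySettings."""
    return {
        # Constraints are joined into a single system-instruction text part.
        "systemInstruction": {"parts": [{"text": "\n".join(constraints)}]},
        "contents": [{"role": "user", "parts": [{"text": prompt}]}],
    }

req = build_request(
    "Continue the story.",
    ["You are a fiction co-writing assistant.",
     "Keep violence non-graphic."],  # illustrative rule, not a recommendation
)
```

The trade-off noted in the post applies: rules stated this way steer the model's writing, but they cannot override the separate, non-configurable block mechanism, and overly aggressive rules tend to bleed into the prose itself.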