Google seems to have put a separate censorship mechanism in front of the Gemini model. We tested this extensively with one unreasonable case: Gemini gets blocked when translating into Latin (see the discussion "Latin language generation seems to be censored?"). In streaming mode the model starts translating, and judging from the few hundred bytes that get through, it is actually doing a very decent translation job. Then this other mechanism, which I suspect is a network appliance and is definitely not as clever as the LLM, clamps its jaws shut, cuts the response off, and sets the finishReason to OTHER.
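For anyone who wants to check for the same pattern, here is a minimal sketch of how the cutoff can be detected on the client side. It assumes each streamed chunk has already been parsed into a dict shaped like the REST response (candidates, then content.parts[].text, plus finishReason on the last chunk). The helper names collect_stream and blocked_mid_stream and the sample chunks are made up for illustration; they are not part of any Google SDK and not the exact code we ran.

```python
from __future__ import annotations

from typing import Any, Iterable


def collect_stream(chunks: Iterable[dict[str, Any]]) -> tuple[str, str | None]:
    """Join the streamed text of the first candidate and return it
    together with the last finishReason seen (None if none arrived)."""
    text_parts: list[str] = []
    finish_reason: str | None = None
    for chunk in chunks:
        # Only the first candidate is inspected in this sketch.
        for candidate in chunk.get("candidates", [])[:1]:
            content = candidate.get("content") or {}
            for part in content.get("parts", []):
                text_parts.append(part.get("text", ""))
            finish_reason = candidate.get("finishReason", finish_reason)
    return "".join(text_parts), finish_reason


def blocked_mid_stream(text: str, finish_reason: str | None) -> bool:
    """True when some output got through and the stream then ended with OTHER."""
    return bool(text) and finish_reason == "OTHER"


# Hypothetical chunks imitating the behaviour described above:
# some translated text arrives, then the stream ends with OTHER.
sample_chunks = [
    {"candidates": [{"content": {"parts": [{"text": "Gallia est "}]}}]},
    {"candidates": [{"content": {"parts": [{"text": "omnis divisa"}]}}]},
    {"candidates": [{"finishReason": "OTHER"}]},
]

partial_text, reason = collect_stream(sample_chunks)
if blocked_mid_stream(partial_text, reason):
    print(f"cut off after {len(partial_text.encode('utf-8'))} bytes: {partial_text!r}")
```

Keeping the partial text rather than discarding it is deliberate: it is the only evidence of how far the model got before the response was cut.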
Leaving these decisions to the data center operations staff is likely causing more brand damage to Google than their senior management realizes.