Translating with the thinking 0121 model often produces a mix of original and translated text

My novel translation tool demo is live and handles millions of words of translation tasks every month. I have also been testing the latest Gemini models and have run into some troublesome problems. This post describes one of them. Sorry for not providing feedback sooner.

This test used the latest model in the Gemini 2.5 series (0520preview), but the issue also occurs across Gemini 2.0 through 2.5.
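
For anyone who wants to check their own outputs for this problem, here is a minimal sketch of one way to flag it. It is not part of my tool: the function name `untranslated_lines` and the sample strings are made up for illustration, and it assumes the text is translated line by line, so that an untranslated passage shows up as an output line identical to a source line.

```python
def untranslated_lines(source: str, translated: str) -> list[str]:
    """Return non-empty output lines that also appear verbatim in the source.

    Hypothetical helper for illustration: it only catches whole lines that
    were copied through unchanged, not partially translated sentences.
    """
    # Collect the distinct non-empty source lines, ignoring surrounding whitespace.
    source_lines = {line.strip() for line in source.splitlines() if line.strip()}
    # Keep every output line that matches a source line exactly.
    return [
        line.strip()
        for line in translated.splitlines()
        if line.strip() and line.strip() in source_lines
    ]


# Made-up sample: the second line was left in the original language.
source = "第一行。\n第二行。\n第三行。"
translated = "The first line.\n第二行。\nThe third line."
leftover = untranslated_lines(source, translated)
print(leftover)
```

Lines that are legitimately the same in both languages (numbers, names, URLs) will also be flagged, so the result is a list of candidates to review rather than a definite error count.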