Characters with accents are generated in lots of wrong ways

Xavier_Van_Elsacker · May 5, 2025, 5:08pm

I’m using Gemini 2.0 flash with structured output in various languages. If the output contains characters with accents, as is e.g. common in French, these characters are sometimes generated correctly, but often generated in lots of different ways.

I’ve seen accented e’s replaced by \[e], so no information about the accent but at least the unaccented letter. I’ve seen accented letters or descriptions wrapped in \[\] (sometimes with backslashes, sometimes without), e.g. \[è\] or \[egrave]. I’ve seen them replaced by a number of \n’s or \t’s… so no letter and no accent. I’ve seen unexpected but identifiable encodings, e.g. \u00e9, \u00f4. But I’ve also seen these encodings with a \n instead of a \u… And I’ve seen the hex numbers without anything, so simply 00e9.

I think there were more that I didn’t document…

I try my best to reconstruct what I can… but obviously this doesn’t work when letters or information about accents are missing.

Has anybody else encountered this / found a way to reduce the chance of this going wrong?
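
For reference, the recoverable cases listed above could be repaired with something like the following. This is only a minimal sketch: the function name `repair_accents` and the patterns are assumptions based on the examples in this post, not anything provided by the Gemini API. It handles literal `\u00e9`-style escapes, bracketed entity names such as `\[egrave]`, and bracketed accented letters such as `\[è\]`. Bare hex like `00e9` and letters replaced by whitespace are deliberately left alone, because they cannot be told apart from legitimate text or the information is simply gone.

```python
import html
import re

# Patterns for the mangled forms described in the post. They are assumptions
# drawn from those examples, not an exhaustive or official list.
_UNICODE_ESCAPE = re.compile(r"\\u([0-9a-fA-F]{4})")        # literal \u00e9
_BRACKETED_NAME = re.compile(r"\\?\[([A-Za-z]+)\\?\]")      # \[egrave] or [egrave]
_BRACKETED_CHAR = re.compile(r"\\?\[([^\x00-\x7f])\\?\]")   # \[è\] or [è]


def _decode_entity_name(match: re.Match) -> str:
    """Map a bracketed HTML entity name to its letter, or keep the original."""
    candidate = f"&{match.group(1)};"
    decoded = html.unescape(candidate)
    # Accept only a single non-ASCII letter, so text such as \[e] or [lt]
    # is left untouched.
    if len(decoded) == 1 and decoded.isalpha() and not decoded.isascii():
        return decoded
    return match.group(0)


def repair_accents(text: str) -> str:
    """Best-effort repair of a few recoverable accent encodings."""
    # Literal backslash-u escapes: \u00e9 -> é
    text = _UNICODE_ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)
    # Bracketed entity names: \[egrave] -> è
    text = _BRACKETED_NAME.sub(_decode_entity_name, text)
    # Bracketed accented letters: \[è\] -> è
    text = _BRACKETED_CHAR.sub(lambda m: m.group(1), text)
    return text
```

Cases where only the unaccented letter survives, such as `\[e]`, pass through unchanged, since the accent cannot be recovered from the output alone.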

Govind_Keshari · May 6, 2025, 4:55am

Hi @Xavier_Van_Elsacker, welcome to the forum!

Thanks for reporting. Could you please check whether the same thing happens with the 2.5 Pro model?
Otherwise, please share your schema and prompt if possible so I can reproduce the issue on my end.

Thanks

Xavier_Van_Elsacker · May 6, 2025, 1:53pm

Hi,

Thanks for getting back to me. I’ve tried to reproduce this with 2.5 pro, using prompts that went wrong in the past (and that went wrong again when retried with 2.0 flash today) as well as a random set of prompts.

So far, all 2.5 pro output has been OK, which seems promising. I’ll continue to test the 2.0 problems.

Is the expectation that this will be fixed in 2.5 flash as well when it gets released? 2.5 pro is too slow for my use case, so it’s not something I can switch to in general.

Govind_Keshari · May 7, 2025, 7:22am

That’s good to hear. Yes, the Pro models are a little heavier, so you can expect some latency.
Yes, it will definitely be resolved in 2.5 Flash. We gather feedback to improve things in the next version.

Xavier_Van_Elsacker · May 21, 2025, 2:45pm

I’ve now also tried with gemini-2.5-flash-preview-05-20. While I did not see the strange encodings I got with 2.0, I do see the repetitions that are also mentioned elsewhere. These repetitions start when an accented character should be printed. I’ve seen “\n\t\t\t\t\t…”, “\r\n\r\n\r\n\r\n…”, and “\n\n\n\n\n…”. These repeat until the max token count is reached.
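
As a stopgap on the client side, runs like these can at least be detected and cut off so that they are not stored or parsed as real output. The sketch below is an assumption-laden illustration: the function name `truncate_runaway_whitespace` and the threshold of 20 characters are made up for this example and would need tuning for real data. It does not recover the missing accented character; it only flags the response so it can be retried.

```python
import re

# A run of at least this many consecutive newline, carriage-return, or tab
# characters is treated as runaway output. The threshold of 20 is an
# arbitrary assumption.
_RUNAWAY = re.compile(r"[\n\r\t]{20,}")


def truncate_runaway_whitespace(text: str) -> tuple[str, bool]:
    """Return (text cut at the first runaway run, whether a run was found)."""
    match = _RUNAWAY.search(text)
    if match is None:
        return text, False
    return text[: match.start()], True
```

If the client reads the response as a stream, the same check could be applied to the text accumulated so far, which would allow stopping early instead of waiting for the max token count.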

Govind_Keshari · June 23, 2025, 7:32am

Hey @Xavier_Van_Elsacker,

2.5 Flash and Pro are now stable models and have been improved based on user feedback. Could you please check whether the same issue occurs with these models?

_Rohit · July 24, 2025, 9:12am

Hi @Govind_Keshari ,

I am using Gemini 2.5 Flash and still face this issue.
