I’ve thrown roughly 4,000 comparable requests at the various Gemini models and found that every Gemini model released after Gemini-1.5-pro-exp-0827 has been terrible at producing structured output consistently, and does not follow instructions well. Terrible at timestamps, too.
The following could provide structured output consistently - without fail:
- Gemini-1.5-pro-001
- Gemini-1.5-pro-exp-0827 (since deprecated)
Every single one of the following models was/is unreliable:
- Gemini-exp-1121
- Gemini-exp-1206
- Gemini-2.0-flash-exp
- Gemini-2.0-flash-exp-thinking-1219
- Gemini-2.0-flash-exp-thinking-0121
- Gemini-2.0-pro-exp-0205
The “thinking” models have improved the output, but I’m still regularly getting really poor performance.
E.g. if I ask for a really simple output with XML tags, it will output 20 records correctly, e.g.:
<topictitle>Topic title</topictitle>
00:00-01:23
And then it will randomly alter the closing XML tags on, for example, the 21st record, e.g.:
Topic title - XML closing tag misspelled
00:00-01:23 - XML closing tag mismatch
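Since the failures are this mechanical (a misspelled or truncated closing tag on one record out of dozens), they are at least easy to catch in code. Here's a minimal validation sketch - the `<topictitle>` tag comes from my example above, but the exact record format and the helper name are my own assumptions:

```python
import re

# Hypothetical record format: an XML-tagged title line followed by a bare
# timestamp line, e.g. "<topictitle>Intro</topictitle>" then "00:00-01:23".
TITLE_RE = re.compile(r"^<(\w+)>(.*?)</(\w+)>$")
TIME_RE = re.compile(r"^\d{2}:\d{2}-\d{2}:\d{2}$")

def validate_records(lines):
    """Return a list of (line_number, reason) tuples for malformed lines."""
    errors = []
    for i, line in enumerate(lines, start=1):
        line = line.strip()
        m = TITLE_RE.match(line)
        if m:
            open_tag, _, close_tag = m.groups()
            if open_tag != close_tag:
                # The "21st record" failure mode: <topictitle>...</topictitl>
                errors.append((i, f"closing tag mismatch: <{open_tag}> vs </{close_tag}>"))
        elif line.startswith("<"):
            # Tag never closed properly, e.g. missing the final ">".
            errors.append((i, "malformed or unterminated tag"))
        elif not TIME_RE.match(line):
            errors.append((i, "unrecognized line"))
    return errors
```

An empty return means the batch parsed cleanly; anything else tells you exactly which record the model mangled.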
I have:
- given the model examples of what is correct
- given the model specific examples of what is not correct (including the exact mistakes it currently outputs)
- changed the wording of the prompt, reiterated the rules, and asked the model to think step by step (where it will tell me it will check for common mistakes, but still makes them)
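Since none of the prompting fixes stuck, the pragmatic fallback (not something I'd have to do if the model just followed instructions) is to validate the output programmatically and retry with the concrete errors fed back in. A sketch, where `call_model` and `validate` are placeholders for a real API call and any checker like the one above:

```python
def generate_with_validation(call_model, prompt, validate, max_retries=3):
    """Call the model, validate its output, and retry on failure.

    call_model: any function prompt -> text (stands in for a real API call).
    validate: returns a list of error strings; an empty list means success.
    """
    last_errors = []
    for _ in range(max_retries):
        text = call_model(prompt)
        last_errors = validate(text)
        if not last_errors:
            return text
        # Feed the concrete mistakes back so the retry is targeted,
        # instead of hoping "think step by step" catches them this time.
        prompt = prompt + "\nYour previous output had these problems:\n" \
            + "\n".join(last_errors)
    raise ValueError(f"no valid output after {max_retries} tries: {last_errors}")
```

It's a band-aid, and it multiplies cost per request, but it at least turns "randomly mangles record 21" into a recoverable error instead of silent corruption.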
Probably time for me to move on, I’ve given Google enough loyalty - I doubt there’s anyone reading the feedback or responding to it anyway.