Inflated and misleading benchmarks for 2.5 Pro 0605?

I’ve noticed that the 0605 model tends to think less and generate shorter output than the previous 0506 model.

Has anyone noticed similar issues?


Example: when I asked both the new model and the old model to create a comprehensive reading note for a given article, the old model produced much more detailed content, while the new one tended to skip sections and citations.


In my use cases especially (academic text comprehension, summarization, legal reasoning, etc.), 0605 seems to perform worse than 0506. For example, 0605 loses information from the original text in the summaries it produces and cites less detailed references; when doing legal analysis, it misses legal facts that trigger liability.


Did you check the reasoning tab? You can also adjust the responses, but keep in mind it’s a thinking model, whereas the previous version focused more intently on response generation.

Thank you for the suggestion. I did compare the reasoning tab and reasoning summary for both 0506 and 0605.

They have similar reasoning approaches, but 0605 just gives worse responses.

What confuses me is this: if 0605 truly ‘thinks’ better, why would it miss legal facts in its analysis and omit sections in its text summaries?

For now, I’m just waiting for LegalBench to publish new benchmark results and see…

0605’s performance regression seems to have been confirmed by:
