Urgent Feedback & Call for Correction: A Serious Breach of Developer Trust and Stability (Update: Google Formally Responds - 8 days later)

H_Express · May 17, 2025, 1:39am

Same here @DaveFL, it appears they removed it.

DaveFL · May 17, 2025, 5:13pm

I ran Aider benchmarks on Rust yesterday comparing exp to preview and was disappointed, to say the least. This morning, on a tip, I ran the same set of benchmarks using a temperature of 0.5, and the differences are quite large. Preview performs very well with this adjustment.

Dylan_Rollins · May 18, 2025, 10:47pm

I hadn’t noticed a change until yesterday using https://gemini.google.com/app but the switch finally happened and the difference in model output and general ability is drastic and noticeable. I hadn’t had a single hallucination with 03-25 and its ability to generate functional code was top-of-the-line.

The latest version is abysmal compared to 03-25 and absolutely unusable. I attempted to get it to write a simple Dockerfile to install Tailscale and establish a TCP connection to another Docker container and it entirely failed at this. I did not realize the model had changed (so this was a completely blinded test) and it could not make a working version of this file to save its life. It was riddled with hallucinations and did not bother to look through my actual codebase.

At one point, it suggested that I go back and retry a bash script that it generated that I assured it multiple times did not work. I wholeheartedly believe 2.5 Pro 03-25 is revolutionary and will completely change the way SWE and coding teams operate, but this new version is not the same at all.

DaveFL · May 19, 2025, 3:20pm

@Dylan_Rollins Have you tried using AI Studio or the API and setting the temperature?

I am finding that 0.6 works best with 05-06 and 0.5 with 03-25 but 0.6 also works well with it.
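
For anyone who wants to try this through the API rather than AI Studio, here is a minimal sketch of a request body with an explicit temperature. It assumes the public REST `generateContent` endpoint and its `generationConfig.temperature` field. The model IDs and the `build_request` helper are illustrative assumptions, and the temperatures are just the values mentioned in this thread, not official recommendations.

```python
import json
from typing import Optional, Tuple

# Assumed model IDs, mapped to the temperatures reported in this thread.
SUGGESTED_TEMPERATURE = {
    "gemini-2.5-pro-preview-05-06": 0.6,
    "gemini-2.5-pro-preview-03-25": 0.5,
}

API_ROOT = "https://generativelanguage.googleapis.com/v1beta"


def build_request(model: str, prompt: str,
                  temperature: Optional[float] = None) -> Tuple[str, str]:
    """Hypothetical helper: return (url, json_body) for a generateContent call."""
    if temperature is None:
        # Fall back to 0.6 for models not listed above (an arbitrary choice).
        temperature = SUGGESTED_TEMPERATURE.get(model, 0.6)
    body = {
        "contents": [{"parts": [{"text": prompt}]}],
        "generationConfig": {"temperature": temperature},
    }
    url = f"{API_ROOT}/models/{model}:generateContent"
    return url, json.dumps(body)


url, body = build_request("gemini-2.5-pro-preview-05-06", "Write a Dockerfile.")
```

The sketch only builds the URL and JSON body; actually sending it would need an HTTP client and an API key, which are left out here.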

Fr_L · May 20, 2025, 9:01pm

“Modified by moderator” @Logan_Kilpatrick
Today’s IO is a complete disappointment. No actual groundbreaking model is released apart from a 2.5 flash, and a ‘coming soon’ deep thinking model.

Where is the original 03-25? Google is clearly aware of the regression “Modified by moderator”

At least OpenAI has the guts to admit their update failures and roll back to the old 4o.

H_Express · May 20, 2025, 9:27pm

@Fr_L I get how it seems that way, but I think you’re starting from a mistaken premise. Sundar Pichai made it pretty clear at the beginning of the I/O keynote that they’ve pivoted how they handle events. Big model announcements and releases aren’t being saved for I/O anymore, they’re dropping throughout the year without notice, like the March 25 release of 2.5 Pro, in case you missed that.

The event is more about the surrounding integrations, use cases built around the foundational models, and updates to user-facing products, capability expansions, etc.

Richard_Davey · May 21, 2025, 12:30am

Veo 3 and Imagen 4 are pretty fricking amazing models, and the new Flash model is nothing to be sniffed at. But yeah, I hear you.

I think right now it’s perfectly clear that 03-25 is never coming back.

So it’s time to start building on the current Flash and Pro and wait for them to hit GA in June, by which time we can only hope that one of the 6 (IIRC) new models they’ve currently got cooking is ready to pop out of the oven and shine in the same way that 03-25 did.

Dylan_Rollins · May 21, 2025, 7:37am

I will try it out with these parameters. Thanks for the suggestion!

EC-mb · May 21, 2025, 9:03am

We see that the models require relatively low T values (0.2-0.4) to even function (even if you would not be aiming for “low creativity” for a task). With Temp 1.0, the current Gem 2.5 Pro preview model often produces nonsense or artifacts that I haven’t seen much elsewhere.

Richard_Davey · May 21, 2025, 1:33pm

We use 2.5 Pro with a temperature of zero for any coding-related task, or any task that needs to call tools, or we’ve found it just goes haywire more often than not. Flash, on the other hand, has no such issues.

Jack_Jack · May 24, 2025, 5:25am

Now Google has even eliminated CoT. Not a lick of thought put into recent modifications. Wonder if the decision process in the dev department has changed or something. All of these need to be rolled back.

zoid · May 24, 2025, 1:38pm

I’m more curious about why… What do they gain from doing this?

zoid · May 24, 2025, 1:40pm

You mustn’t have started playing with native audio “Modified by moderator”

wunderlabs · May 28, 2025, 11:38am

Is Vertex 3-25 still usable?
Do you think they will keep the 3-25 preview even after the production release of 2.5 Pro comes out in the next few weeks (which may still be the worse 05-06)?

H_Express · May 28, 2025, 2:43pm

Yes, for now, 03-25 (preview) is still usable on Vertex and is the real snapshot. Few people use it because of the barrier to entry. It performs amazingly well for me. As for whether it will stay up, who knows.

wunderlabs · May 28, 2025, 3:36pm

thank you!
And you are referring to the preview version, not the exp (free) version, right?

Rodney_Long · May 28, 2025, 9:46pm

The new model is atrocious!!! It consistently forgets which files I uploaded (even though they are visible in the interaction window’s file tab) and will focus on previous prompts rather than the current one. It barely functions. It is simply incredible you haven’t rolled back to the previous model while you work out the kinks. Stop being so prideful and listen to the users’ feedback!!!

H_Express · May 28, 2025, 9:49pm

Logan Kilpatrick has effectively admitted Google realizes 05-06 (“I/O edition”) is inferior to 03-25, but still won’t serve it. He also states that the new GA release coming up will “close the gaps” between 03-25 and 05-06. I can only take this to mean they expect it to still underperform 03-25, but this is what we can expect, and “take it or leave it.” Pretty disappointing.

Unless someone else sees this a different way.

“Modified by moderator”

Avi0012 · May 28, 2025, 10:22pm

God, I just… I really hate them. It feels like they’re just mocking us, but I still keep hoping things will actually get better. And the weirdest part is, everyone else seems totally fine with it – people are actually thrilled! There just aren’t enough of us. Our complaints won’t make a dent; people are still going to trust them no matter what. And us? We’re just a small bunch, probably more hassle for them than we’re worth, so… why am I even bothering to write this? I guess I just put too much faith in Google, and this whole thing has been a massive letdown and has really shaken me up.

FIM-43_Redeye · May 28, 2025, 11:59pm

The best possible interpretation of all this is that Google are just terrible, TERRIBLE communicators. Enterprise is where the real money is for AI, not free stuff. This stuff badly impacts enterprise users too. Google WILL listen to user feedback or they will lose a LOT of money.

Or maybe the enterprises like this, I don’t know.
