Urgent Feedback & Call for Correction: A Serious Breach of Developer Trust and Stability (Update: Google Formally Responds - 8 days later)

I’m posting this today because, like many others in the developer community, I’m frankly shocked and deeply frustrated by Google’s unexpected decision to redirect the “gemini-2.5-pro-preview-03-25” endpoint. This endpoint, clearly labeled with a specific date, suddenly and silently points to the newer “gemini-2.5-pro-preview-05-06” model without any prior announcement or communication to our community. This isn’t just a minor inconvenience; it undermines our trust in the platform, disrupts workflows we’ve carefully built around this endpoint, and honestly leaves us questioning the decision-making that led to such a confusing move.

To be clear, we fully acknowledge Google’s stated documentation about “preview” and “experimental” models being subject to change or removal without notice. But there’s an important gap in this policy: your documentation never addresses specifically dated endpoints. When an endpoint explicitly includes a specific date like “03-25,” the natural assumption, based on widespread industry norms set by OpenAI, Anthropic, and others, is that it represents a stable, immutable snapshot. The whole point of assigning a clear date label is to signal stability, predictability, and consistency, even if it’s officially still called a “preview.”

This silent redirection has resulted in widespread disruption. Many developers are noting and reporting clear and tangible differences in model performance - not just subtle tweaks, but significant regressions in reasoning abilities, major shifts in style and tone, and measurable changes across well-tested prompts. Entire prompting strategies, applications, and workflows that used to rely consistently on the March 25 checkpoint now suddenly break or behave unexpectedly. Even worse, public benchmarks and evaluations conducted in good faith are now unintentionally misleading or outright incorrect, since they’re unknowingly comparing completely different model versions than their labels suggest.

The resulting confusion has been immediate and widespread. Just go to Reddit, Discord, X, or other forums and you’ll find countless confused and frustrated developers, benchmark maintainers, and researchers struggling to understand what’s going on. The confusion is real, messily public, and incredibly frustrating. While we appreciated the brief clarifications provided by @Logan_Kilpatrick, they unfortunately didn’t fully clear things up or directly address the core problem: the unexpected redirection of a clearly dated endpoint.

We completely understand that preview models inherently carry some risk; everyone gets that. But using a clearly labeled date in an endpoint’s name establishes an expectation across our industry. If a model has an explicit date, it logically implies it’s preserved as an immutable checkpoint. That’s exactly how everyone expects dated model naming to behave, preview or not.

This silent endpoint swap hasn’t just confused everyone; it’s led to genuine anger and justified feelings that developers and researchers have been disregarded and blindsided.

With that in mind, we’re strongly urging Google to take direct, concrete steps toward restoring the trust that’s been damaged. In practical terms, we specifically ask that you:

  1. Immediately and publicly restore the original March 25 checkpoint to its matching dated endpoint (gemini-2.5-pro-preview-03-25).

  2. If that restoration simply isn’t feasible due to resource concerns, increased confusion, etc., then we urgently need a public clarification and firm commitment moving forward: all clearly dated model checkpoints must permanently remain stable and immutable snapshots representing their stated dates. Only endpoints explicitly labeled “latest” (such as gemini-2.5-pro-preview-latest) should ever redirect or be updated without explicit notice.

Again…we fully accept general risks associated with preview and experimental models. We’re simply asking Google to explicitly adopt and commit to the widely understood and logical industry-standard schema here: dates clearly signal immutability, while only explicitly labeled “latest” endpoints represent versions subject to change or redirection.

That’s it. That’s all we’re asking.

Signed,

A Developer Building on Gemini

86 Likes

The 2.5 Pro Preview API has basically stopped working in my application today. It is returning a message that the output token count has been exceeded, when the output should be nowhere close to that. I think this may be due to today’s update using more thinking tokens, which are being charged to the output tokens. At least for me, I was very happy with the 03-25 output and don’t want the additional thinking time, which is slowing my application down anyway, not to mention that it’s completely broken because of the token limit being exceeded. I would also love for Google to restore the 03-25 identifier back to that model, which was my favorite (once I worked around an automatic caching “feature” that seemed to have been added to the API about two weeks ago).
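For anyone trying to confirm this locally, here’s a rough sketch (plain Python; the usage-metadata field names mirror what the API returns in responses, but treat the exact names and the 8192 default limit as assumptions) that infers how many billed tokens went to internal “thinking” and flags when they exhausted the output budget:

```python
# Sketch: estimate how much of the output budget went to internal "thinking".
# Field names mirror the API's usage metadata; exact names are an assumption.

def thinking_overhead(usage: dict, max_output_tokens: int = 8192) -> dict:
    prompt = usage.get("prompt_token_count", 0)
    total = usage.get("total_token_count", 0)
    visible = usage.get("candidates_token_count", 0)  # text actually returned
    hidden = total - prompt - visible  # billed, but never shown to you
    used = total - prompt              # counts against max_output_tokens
    return {"hidden": hidden,
            "used": used,
            "exhausted": used >= max_output_tokens}

# Illustrative numbers (hypothetical, not from a real call):
report = thinking_overhead({"prompt_token_count": 1000,
                            "total_token_count": 9192,
                            "candidates_token_count": 100})
print(report)
# hidden=8092, used=8192 -> exhausted=True, despite only 100 visible tokens
```

If `exhausted` trips while `visible` is tiny, that matches the failure mode described above: thinking tokens ate the output cap before the model produced much visible text.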

10 Likes

I second this. 03-25 was working beautifully and they pulled it with zero days’ notice.

05-06 is a significant regression: it eats up thinking tokens so badly that requests have gone from 20 seconds to 5+ minutes, which has left my application in a broken state.

There’s also a critical bug where the thinking tokens exceed the max completion tokens and it errors.

  1. How did Google release a model with such glaring issues?
  2. Worse, why have they rug-pulled a versioned checkpoint?

Gemini 2.5 Pro OCR capabilities are nothing short of magical.

21 Likes

@Ed_Godshaw, your points about the 05-06 model’s performance regressions and bugs are being felt by many, and that’s valid feedback for Google on that specific checkpoint.

HOWEVER, and this cannot be stressed enough for everyone reading and for Google, the critical issue at the heart of this discussion, and the one that represents an unprecedented breach of developer trust, is the silent redirection of the gemini-2.5-pro-preview-03-25 endpoint. While the 05-06 model having its own set of problems is certainly frustrating, that’s a separate matter of model quality. The primary, urgent concern here is that a supposedly stable, dated checkpoint was changed underneath us without warning, violating established industry conventions and our fundamental expectations for API stability. This is the dangerous precedent we are focused on, and it’s vital to keep this thread centered on that specific action and the systemic damage it does to the Gemini API as a trustworthy development platform.

Google employees are typically very responsive on these forums; the noticeable silence on this specific thread for 24 hours is telling. This, especially when combined with the similar silence from @OfficialLogan on X and other platforms despite direct inquiries, hasn’t gone unnoticed by the community.

I can only speculate on the team’s rationale here. My guess is they wanted developers on the new checkpoint to do aggressive testing and iteration, but realized they had backed themselves into a corner because everyone was on a clearly dated model name.

So their options were to:
A) Deactivate that endpoint and break everyone’s API calls, or
B) Redirect the model name in an unprecedented breach of longstanding industry protocol and their own protocols, solving their internal problem, but creating a massive external problem.

They chose option B, and the fallout is clear. We’re even seeing knock-on effects like the AIDER benchmark team admitting they made a mistake in calculating the widely publicized cost figures for the original 03-25 Gemini 2.5 Pro (figures Google themselves highly promoted), but they are now unable to correct this because of the redirection issues. This kind of chaos is a direct result of Google’s decision.

Google has a choice here:

  1. Publicly get ahead of the backlash, admit this wasn’t ideal, and give a public commitment going forward to clarify the exact policy around dated checkpoints in Preview and Experimental tiers, and rectify the current situation.
  2. Continue to bury this under the rug, hope it goes away, and go the way of the very recent public OpenAI blunders.

If they go with option 2, they will not have a good outcome. The trust being eroded is a valuable asset.

Even right now, a major public AI influencer and researcher, Simon Willison, co-creator of the Django web framework (someone Google themselves has featured in interviews with @Logan_Kilpatrick, who has notably ignored emails and public calls for action on this, perhaps on direction from above), has just publicly condemned Google’s actions here and their choice to break trust by redirecting the fixed, dated checkpoint. I’ve been in touch with a number of AI influencers on YouTube and X, and most of them plan to run with this story and blow it up.

To any Google employees reading this: this matter needs to be escalated. A public statement, a policy clarification, and ideally, a course correction are needed ASAP. This is the first time a major AI provider has broken this longstanding implicit policy, and the precedent it sets is very concerning. We need to know if we can trust dated model identifiers from Google, or if “preview” now just means “volatile.”

17 Likes

I agree Google shouldn’t do that. In case anyone isn’t sure, here’s the JSON returned when my app called the 03-25 model identifier:

"usage_metadata": {
  "prompt_token_count": 8920,
  "total_token_count": 17112
},
"model_version": "models/gemini-2.5-pro-preview-05-06"

You can see the model returned is the 05-06 preview. The token counts also show that internal processing seems to be what triggered the max output token error: subtract the prompt count from the total and the difference is 8192, exactly the output limit.
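A cheap guard against this kind of silent swap is to compare the `model_version` field in the response against the identifier you requested. A minimal sketch (plain Python, operating on a parsed response dict shaped like the JSON quoted above; the guard itself is an assumption about how you’d want to fail, not an official SDK feature):

```python
# Sketch: fail fast if the API served a different model than the one requested.
# Works on a parsed response dict shaped like the JSON in this thread.

REQUESTED = "models/gemini-2.5-pro-preview-03-25"

def assert_model_pinned(response: dict, requested: str = REQUESTED) -> None:
    served = response.get("model_version", "")
    if served != requested:
        raise RuntimeError(
            f"Model pin violated: requested {requested!r} but got {served!r}"
        )

# The response quoted above trips the guard:
resp = {
    "usage_metadata": {"prompt_token_count": 8920, "total_token_count": 17112},
    "model_version": "models/gemini-2.5-pro-preview-05-06",
}
try:
    assert_model_pinned(resp)
except RuntimeError as e:
    print(e)  # prints the pin-violation message
```

It doesn’t fix the underlying policy problem, but at least your app breaks loudly at the call site instead of silently drifting onto a different model.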

8 Likes

To be clear, we fully acknowledge Google’s stated documentation about “preview” and “experimental” models being subject to change or removal without notice.

Well there you have it. Preview models can be deprecated without warning, because they are… preview models

@synapse_tej , it seems you may have stopped reading after the general statement about “preview models can change.” The core issue here isn’t that preview models can change. It’s that a specifically DATED preview-03-25 checkpoint was SILENTLY REDIRECTED to an entirely different, behaviorally distinct model (05-06).

This action violates long-standing industry conventions (and Google’s own past practices) where a dated identifier implies a fixed snapshot. While we understand deprecations can happen (ideally with notice), a silent redirection of a dated checkpoint is a fundamental breach of trust.

The problem is the lack of a clear policy for dated snapshots within the “preview” tier and the violation of the logical expectation that a model named for a specific date will actually be that model. That’s what needs to be addressed.

6 Likes

preview-03-25 was never a stable model. I fully understand what you’re saying, but the expectation that preview models should conform to stable standards is completely unrealistic. The well-established precedent is that preview models graduate and repoint to newer-dated preview models with no notice, because they are experimental in nature.

This is beyond unacceptable.

We have ACTUAL VICTIMS being harmed now. Rape victims, being told to piss off by the new Gemini update.

7 Likes

05-06 is a disaster. please bring 03-25 back. I NEED IT

21 Likes

@synapse_tej, let’s be clear: no one is arguing that preview-03-25 was a “stable” GA model with full production guarantees. That’s a distraction from the real issue.

The core problem is your assertion that a “well-established precedent” exists for silently repointing specifically dated snapshot identifiers like gemini-2.5-pro-preview-03-25 because they are “experimental in nature.” That claim, applied to dated snapshots, is frankly false. Yes, Google has silently updated undated experimental or preview tags in the past to point to new models, but to our collective knowledge, this has NEVER happened with any Google Generative AI model bearing a specific date in its identifier, whether tagged “preview” OR “experimental.” It certainly hasn’t happened in the broader industry, where OpenAI largely set the standard for respecting the integrity of dated model snapshots.

This action by Google isn’t just a deviation; as far as the community can tell, it’s entirely unprecedented for a dated checkpoint and breaks with Google’s own historical practices for its Gen AI API until now regarding such specific identifiers.

The expectation for a DATED identifier like preview-03-25 is not full GA stability, but that the tag consistently points to the specific historical version of the model as it was on March 25th. Silently changing what that specific, dated endpoint points to (especially to a behaviorally distinct model) is NOT a developer-friendly precedent; no one could seriously argue it is. The trustworthy approach is transparent deprecation of an old dated tag and the introduction of any new model under its own new dated tag. If preview-MM-DD doesn’t guarantee it points to the MM-DD snapshot, the dating convention itself becomes actively misleading and effectively meaningless. If you name it for a specific date, that’s what developers expect to get.

So again…the fundamental problem isn’t that preview models evolve; no one disputes that. It’s that a specific, historical, DATED identifier was effectively commandeered, breaking the implicit contract of what that identifier represented, all without clear, proactive communication. This is what shatters developer trust and that is the problem here.

9 Likes

Bro it’s hit the news. You’re mentioned.

9 Likes

Glad it’s getting attention. Google actively ignoring this thread and all the others is still telling. Like I warned earlier, “RedirectGate” is coming. It seems Google has gone the way of OpenAI and chosen to sweep this under the rug instead of acknowledging the mistake and clarifying the policy. It will backfire the way the recent “sycophancy” issue did for OpenAI. It makes zero sense for Google not to get ahead of the backlash, but instead they are going to wait and react.

9 Likes

Yes so bad, hate the new model, hate the way it was implemented, just a massive failure no matter how you look at it.

Have to switch back to o3 to get clear of this mess

4 Likes

I am not a developer, but a prompt engineer and AI consultant, semi-popular in certain communities, who’s been a faithful Gemini user and fan since August 2024.

I’ve been recommending Gemini to all of my clients, but I can no longer do it with a clear conscience. The new Pro 05-06 is borderline unusable, especially at higher context lengths (and by higher, I mean above 32k). It feels dumber, doesn’t remember what you said two messages ago, and hallucinates a lot.

I hate it. Everyone in the community hates it. Even my developer friends hate it, and this update was introduced with them in mind. The benchmarks aren’t looking too good either.

Updates and experiments are needed. But not all of them will work out, and that’s okay, as long as you aren’t forcing everyone to use the failed outcome.

Also, if anyone from Google is actually reading this — please, don’t forget that coders aren’t the only people using your models. I can guarantee more use it as an assistant model, and I personally use it solely for creative writing and roleplaying (outside work stuff). Thank you.

13 Likes

The new model is purely trash.

It thinks too long, it yaps too much, and it can’t even give a coherent response. It is literally only good for UI design with HTML/CSS.

In an ideal world, this is fine; models come and go. But silently repointing the old endpoint, without any notice, to a model that shows clear regression in every area except UI design is an absolute joke.

Please give developers the option to choose an older model, with clear deprecation notices.

8 Likes

@Sina_Azizi, @Marinara, @Jackten

I appreciate you all adding your voices to this conversation. Clearly, this thread has gotten a lot of attention both on social media and in traditional news outlets, and I completely understand the frustration many of you are feeling about the quality and performance regressions in the new model. I share many of those same concerns myself.

But I want to be 100% clear: the ONLY purpose of this particular thread is to seek policy clarification from Google regarding dated endpoints. While your feedback on the new model’s performance is valuable, and I encourage you to share it (perhaps in a new, dedicated thread), this specific discussion needs to stay laser-focused on the trust and stability implications of the endpoint redirection.

To confidently build on Gemini going forward, developers need clear expectations. We need to know that clearly dated endpoints won’t suddenly change out from under us. If an endpoint is explicitly labeled with a date, developers naturally assume it represents a stable snapshot. If Google intends otherwise, we need that explicitly clarified.

So again, to be crystal clear, what we’re asking Google for here is straightforward policy clarification:

  • Moving forward, are clearly dated endpoints always subject to silent redirection or change, despite the implicit stability communicated by the date?
  • Does this policy apply equally to Experimental, Preview, and GA categories, or are there differences?

This clarification is critical for developers to trust and rely on Gemini as a stable platform.

6 Likes

As a developer, I would rather have the endpoint go offline and immediately find out that my app is down than have to debug for hours trying to understand why it’s behaving differently. Worse yet if users are testing the app.

8 Likes

What did y’all think “not for production” meant? Vibes? Papers? Essays?

@Brett, we all get that ‘preview’ isn’t meant for critical production systems. But the real frustration here is specifically about endpoints clearly labeled with dates.

If ‘03-25’ doesn’t reliably point to the exact model from March 25th, if it can silently become the May 6th model overnight, then logically, what would be the purpose of Google putting a date in the endpoint at all?

Consider public AI benchmarks, which drive community understanding and comparisons. These benchmarks almost EXCLUSIVELY rely on preview or experimental tiers, because GA models often lag months or even a year behind state-of-the-art. Researchers and developers depend on these dated preview checkpoints as stable snapshots for reproducible research and fair comparisons. Until now, the universal industry understanding, including Google’s own past GenAI models, was that a date in the identifier meant EXACTLY that: a stable snapshot as of that date.

Imagine a researcher meticulously benchmarks gemini-2.5-pro-preview-03-25, publishes their findings, and then discovers that Google had silently redirected that same endpoint to the 05-06 model prior to the run. Their work isn’t just invalid; it’s unintentionally misleading, and they’d have no way of knowing unless they stumbled onto this thread. This actively sabotages the entire benchmarking and evaluation process.

Something similar just played out publicly, as AIDER has issued a retraction stating their findings for 03-25 were mistaken, and they now have no way to go back and correct their results because of this silent change. (Results Google had widely publicized and promoted themselves.)
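One partial mitigation for benchmark maintainers (a sketch only, and no substitute for a clear policy from Google): stamp every result with the `model_version` the API actually served, so a later redirect is at least detectable after the fact. The record schema and score below are illustrative, not any benchmark’s real format:

```python
# Sketch: stamp each benchmark record with the model the API actually served,
# so silently-redirected runs can be detected after the fact.
import datetime
import json

def record_result(requested_model: str, response: dict, score: float) -> str:
    served = response.get("model_version", "")
    entry = {
        "requested_model": requested_model,
        "served_model": served,  # what the API really ran
        # model_version comes back as "models/<id>"; compare the bare id
        "mismatch": served.split("/")[-1] != requested_model,
        "score": score,  # illustrative metric, schema is hypothetical
        "timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
    }
    return json.dumps(entry)

line = record_result(
    "gemini-2.5-pro-preview-03-25",
    {"model_version": "models/gemini-2.5-pro-preview-05-06"},
    0.71,  # illustrative score
)
print(line)  # the record shows "mismatch": true for this redirected response
```

Had published runs carried a field like this, the AIDER-style retraction above would have been a one-line query instead of an unrecoverable loss.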

Given all this, I’m asking you now to answer this question candidly:

Do you, @Brett, genuinely see no value whatsoever in Google clarifying its policy on whether dated preview endpoints represent stable snapshots, or if they’re now subject to silent redirection?

Wouldn’t that clarity benefit everyone trying to build, test, or even just understand these models?

Could you honestly say NO to that?

1 Like