Change in Gemini 1.5 Pro behavior?

Not sure if this is worth posting, but I swear I’ve seen a significant behavioral change in 1.5 Pro this past week. We use one model call to lay out an outline for something, then a second call to 1.5 Pro to add content around the outline. In the past week it has only been completing about half of the task, and it is not hitting max_tokens.
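Roughly, the pipeline looks like the sketch below (heavily simplified, assuming the google.generativeai Python SDK; the prompts and topic here are placeholders, the real ones are more elaborate):

```python
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")
model = genai.GenerativeModel("gemini-1.5-pro")

# Call 1: lay out the outline.
outline = model.generate_content(
    "Create a detailed outline for an article about solar batteries."
).text

# Call 2: add content around the outline.
draft = model.generate_content(
    "Write complete content for every section of this outline:\n\n" + outline,
    generation_config={"max_output_tokens": 8192},
)
print(draft.text)
```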

It brings me back to the reports of OpenAI’s GPT models suddenly getting “lazy” about a year ago.

I can’t find anything ANYWHERE about this. Maybe it’s just me.


Can you confirm exactly which model you’re using? An example that demonstrates it would help as well.


gemini-1.5-pro (though I’ve also experimented with Flash to troubleshoot). We’d been using it for a couple of months, then all of a sudden this week it started half-a$$ing it.

Here is an outline created by gemini-1.5-pro:

When I pump it back into a second call (using JSON mode, btw, to structure the output), it only used about 1,400 tokens of the 8,192 I set for max_output_tokens, and it only created content for a handful of the sections.
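For reference, this is how I’m ruling out a token-limit stop, continuing the sketch from my earlier post: check the finish reason and the reported token usage on the second call.

```python
# Second call, now with JSON mode enabled via response_mime_type.
response = model.generate_content(
    "Write complete content for every section of this outline:\n\n" + outline,
    generation_config={
        "max_output_tokens": 8192,
        "response_mime_type": "application/json",  # JSON mode
    },
)
print(response.candidates[0].finish_reason)            # STOP, not MAX_TOKENS
print(response.usage_metadata.candidates_token_count)  # ~1,400 of the 8,192
```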

I even have a “reviewer” persona where I put the content back into the LLM and ask it to reconcile the outline against the content, and this example summarizes nicely how it “quit”:

We are changing the code to make it an iterative process, BUT that is beside the point. This had been working fine for a couple of months!
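The iterative version we’re moving to looks roughly like this (again just a sketch, one call per outline section so no single call has to cover everything):

```python
import json
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")
model = genai.GenerativeModel("gemini-1.5-pro")

# Ask for the outline as a JSON array of section titles, then expand
# each section in its own call.
outline_json = model.generate_content(
    "Return a JSON array of section titles for an article about solar batteries.",
    generation_config={"response_mime_type": "application/json"},
).text

content = {
    title: model.generate_content(
        f"Write the full content for a section titled '{title}'."
    ).text
    for title in json.loads(outline_json)
}
```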


I’m not doing much with Gemini right now so I can’t add any real value, but if it makes you feel any better, I know exactly what you mean. You can’t actually say anything 100% for sure, but you can feel that the model has changed. I tried desperately to work out what it was in that specific OAI case; I thought it might be offloading excess capacity to a different model entirely (the results were that different). I have a term, “texture”, and for me it refers to what feels like the “personality” of the model. The texture changes, and that’s as obvious as your best bud being “invaded by a body snatcher” :wink:.

So no assist, sorry, but you’re not crazy, okay :sweat_smile:.

On the OAI side, the seed parameter and some version metadata in the response help you not go crazy. Do we know if there is anything similar in Gemini?
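For reference, the OAI-side pattern I mean is roughly this (sketch with the openai Python client; model and prompt are placeholders):

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

resp = client.chat.completions.create(
    model="gpt-4o",
    seed=42,  # best-effort determinism across repeated calls
    messages=[{"role": "user", "content": "Say hello."}],
)
# system_fingerprint identifies the backend configuration; if it changes
# between runs, something changed server-side even though the model name didn't.
print(resp.system_fingerprint)
```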

This is also a reason why I think full transparency re: the prompt would be helpful; it bugged me that it could have just been the overarching system prompt that changed.

But if it’s like the OAI incident, others will be feeling it too, and it will be tightened up at the model level: a fix for “lazy”. Very data-science-sounding, that.

Sorry for so much blah-blah, but I feel your pain. Good luck; it does come back, IMO.
