Hello. I’m using gemini-1.5-pro-latest for a data-labeling task. The input contains images + text, and the output is a classification. I carefully revised my prompt, and accuracy was around 0.78 before yesterday.
However, after yesterday’s big update, when I reran everything with the same settings, the accuracy suddenly dropped to around 0.44. I suspect this is due to the model change yesterday.
How can I use the previous version of the model “gemini-1.5-pro-latest”?
The GPT API lets us choose from different versions of the same model, but Gemini 1.5 Pro only has “latest”. I would like to go back to the previous one, which gave me much higher accuracy.
The prompt asks Gemini to act as an expert in analyzing human gestures. It provides detailed descriptions and a few examples for each of three major gesture types. The task is to classify each gesture (based on video frames and speech text) with a type and reasoning.
Please see the attached results comparison. It performed well before (replicated multiple times while tweaking the prompt), but accuracy has decreased significantly (from 0.78 to 0.43) since yesterday.
Is there any way to use the previous version of gemini-1.5-pro-latest before May 14?
Yeah, I’ll admit, this one is stumping me. It’s also hard for me to read what’s in the actual cells, but I know that’s on me.
The only thing I could see being an issue here is that the prompt tweaks might be causing problems, but still, a 35-point drop is one hell of a performance dip, even for a messed-up prompt.
OpenAI has been infamous for stealth tweaks to their models, and this might be evidence of Google doing the same here, but that’s not really a solution, unfortunately.
And it has no issues actually seeing the video frames? You’ve proven that it’s seeing more than nothing, I’m guessing?
There is still a lot to learn when it comes to interacting with multimodal models because these things are so new. I wish I could provide more help. I really don’t understand why this occurred, but the data is right there proving it.
Working with frontier tech is both a blessing and a curse.
EDIT: Since the docs state you’ll eventually be able to specify stable versions, rest assured you’ll be able to pin the model that best fits your use case over time.
One thing I noticed is that you mentioned using gemini-1.5-pro-latest. Try gemini-1.5-pro instead. That is the stable version; the “-latest” alias is going to include a lot of hotfixes. That is unfortunately the best I can suggest at the moment.
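For reference, here’s a minimal sketch of switching the model name with the google-generativeai Python SDK (the API key and prompt string are placeholders, not from this thread):

```python
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # placeholder key

# Use the stable alias instead of "-latest", which tracks the newest
# build (including hotfixes) and can change out from under you.
model = genai.GenerativeModel("gemini-1.5-pro")
response = model.generate_content("Classify this gesture: ...")  # placeholder prompt
print(response.text)
```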
Thanks so much, Macha! I’ll definitely try gemini-1.5-pro to see if the stable version gives me a better result.
Re: the prompt, it stays exactly the same across replications. I always run (1) a “stable” prompt and (2) a “new” prompt with some tweaks. The 35-point drop is based on the same “stable” prompt and the same settings for everything.
Re: seeing the video frames, I’m still working on this. I asked Gemini to output a “frame description” (“briefly describe your observation for each frame, e.g., img 1: …; img 2: …”), but it seems like some descriptions are completely off. Do you have any better suggestions for testing this? I’m not sure whether asking for too many outputs at once causes problems.
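One way to isolate the vision path, a rough sketch assuming the Python SDK and locally extracted frame files with hypothetical names, is to send each frame alone with a single narrow question, so a bad description can’t be blamed on an overloaded multi-part prompt:

```python
import google.generativeai as genai
from PIL import Image

genai.configure(api_key="YOUR_API_KEY")  # placeholder key
model = genai.GenerativeModel("gemini-1.5-pro-latest")

# Hypothetical frame files extracted from the video.
frame_paths = ["frame_01.png", "frame_02.png", "frame_03.png"]

for path in frame_paths:
    frame = Image.open(path)
    # One image, one narrow question per request.
    response = model.generate_content(
        [frame, "In one sentence, describe what you see in this image."]
    )
    print(f"{path}: {response.text.strip()}")
```

If the per-frame descriptions look accurate in isolation but go wrong in the combined prompt, the problem is more likely prompt overload than the model failing to see the frames at all.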
Thank you for the additional edits! It would be great if we could specify stable versions in the future. I was hoping the stable version would be released on May 14 (they said they would start billing for the paid API on May 14), but I was surprised they delayed it (again) to May 30. Do you happen to know why?
I don’t just assume, I am sure that the model has become much dumber, but I still think this is a bug that will be fixed within 1–2 weeks. If they don’t fix it, though, that would be very strange.
On mathematical capabilities, I can confirm that the newly released 1.5 model has degraded compared to its predecessor:
And as to whether our concerns are getting attention, I am worried that a flood of questions about the developer competition is sucking up all the oxygen, and nobody from Google is listening to us.
Thank you for sharing this! It is really disappointing. I switched from GPT to Gemini because of its better visual understanding, but this recent upgrade is really a bummer. I’ll keep posting, and hopefully someone will pay attention. Maybe they will finally catch up when they start billing on May 30…
I just tried it, and it seems like only gemini-1.5-pro-latest is available, not gemini-1.5-pro. Maybe I missed something; is there any way to use gemini-1.5-pro?
Thank you! It seems gemini-1.5-pro only comes in the “latest” version… Google should at least provide both the “old” and “new” versions if it’s really a big update (e.g., GPT-4 has many different versions to choose from, with dates in the names).
I was looking at these docs when trying to answer your question:
In there, they state that the latest stable version is gemini-1.5-pro, but as @OrangiaNebula points out, that name isn’t on the list returned by list_models(). So it looks like we’ve found a discrepancy between what’s in the docs and what’s actually available.
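You can verify this yourself; here’s a quick sketch using the SDK’s list_models() call to print every model name your key can see that supports content generation:

```python
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # placeholder key

# Print every model that supports generateContent, so you can check
# whether "gemini-1.5-pro" (without "-latest") actually appears.
for m in genai.list_models():
    if "generateContent" in m.supported_generation_methods:
        print(m.name)
```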
The docs say: “Gemini models are available in either preview or stable versions.”
In typically confusing Google-documentation fashion, this suggests that there is no stable version of Gemini 1.5, so only the model with the -latest suffix is valid, even though the documentation gives what the stable version’s shortcut name would be.
Note that they don’t list a version with a version-number suffix, which is what would indicate a pinned stable version.
That also explains the unexpected change: they never said this model was stable.
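To summarize the naming pattern (the -001 form follows Google’s general model-versioning convention; no such pinned Gemini 1.5 version existed at the time of this thread, so that name is hypothetical):

```python
# Model-name forms under Google's versioning scheme (illustrative only):
MODEL_LATEST = "gemini-1.5-pro-latest"  # newest build; can change without notice
MODEL_STABLE = "gemini-1.5-pro"         # stable alias named in the docs, but not
                                        # returned by list_models() per this thread
MODEL_PINNED = "gemini-1.5-pro-001"     # hypothetical pinned version; none published yet

model_name = MODEL_LATEST  # the only alias that worked at the time
```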
Thank you for helping me connect with the right person!
I really hope they can fix the problem soon. I’ve preferred Gemini over GPT for the last couple of weeks due to its better performance in visual understanding/reasoning. This recent major update on May 14 is really disappointing.
@Logan_Kilpatrick Hi Logan! Hope all is well! I just saw your reply in another post, which was super helpful, and I thought you might be the right person to ask. I’m wondering if you could help me with this question about Gemini when you get a chance.