I’ve been using the Gemini Pro model for audio transcription tasks, and while the transcription quality is generally impressive, I’ve encountered significant issues with the accuracy of the generated timestamps.
However, we found that the new experimental LearnLM model is able to produce proper timestamps for transcriptions.
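For anyone who wants to quantify the problem, here is a minimal sketch of a sanity check for transcript timestamps. It assumes each transcript line starts with a `[MM:SS]` or `[HH:MM:SS]` marker, which is only one possible output format, and the function names are hypothetical. It can only flag impossible timestamps (going backwards, or past the end of the audio); it cannot tell whether a plausible-looking timestamp actually lines up with the speech.

```python
import re

# Assumed format: a line begins with [MM:SS] or [HH:MM:SS].
TIMESTAMP = re.compile(r"^\[(?:(\d+):)?(\d{1,2}):(\d{2})\]")


def to_seconds(line):
    """Return the leading timestamp of a line in seconds, or None if absent."""
    match = TIMESTAMP.match(line.strip())
    if not match:
        return None
    hours, minutes, seconds = match.groups()
    return int(hours or 0) * 3600 + int(minutes) * 60 + int(seconds)


def find_timestamp_problems(transcript, audio_duration_s):
    """List (line_number, reason) pairs for timestamps that cannot be right."""
    problems = []
    latest = 0  # largest timestamp seen so far, in seconds
    for number, line in enumerate(transcript.splitlines(), start=1):
        current = to_seconds(line)
        if current is None:
            continue  # lines without a timestamp are skipped
        if current < latest:
            problems.append((number, "goes backwards"))
        if current > audio_duration_s:
            problems.append((number, "past end of audio"))
        latest = max(latest, current)
    return problems
```

Running this over the output of each model for the same audio file, with the real audio length passed as `audio_duration_s`, gives a rough count of obviously wrong timestamps to compare.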
Hi @Sahil_Alam, welcome to the forum!
Which Gemini Pro model version were you using earlier, 001 or 002?
Hi @Govind_Keshari, we are using the 002 version of the Gemini Pro model.
Hey @Sahil_Alam,
There is an issue with the 002 version on audio transcription tasks, and it has already been escalated to the internal team. I recommend using the 001 version for now. If this is not for a production task, you can keep using the experimental LearnLM model.
Thanks for the suggestion, it seems to be doing well with 001. Is there a tracker where we can follow the progress on this issue for the latest Pro model?
Hey @Sahil_Alam,
There is no tracker as such, but you can follow the release notes for updates here
Sure, thank you @Govind_Keshari for your help
Hi @Govind_Keshari, can you confirm whether there is still an issue with Gemini Pro 002 for audio transcription? If so, can you give more details about the issue?
Hey @Moe_K, welcome to the forum!
Whatever the issue was in the previous model, our team tries to resolve it in the upcoming model. So you can try the recently released “2.0-flash-exp”.