I want to use Gemini to identify the exact speaking time of each speaker in an audio file, but the timestamps it returns are incorrect, even though the speaker IDs and the recognized text are correct. Am I doing something wrong? Could anyone help? Thank you.
Hello,
Welcome to the Forum!!
Could you please share your sample audio file, prompt, and code so that we can try to reproduce the issue and analyze it further?