I’m testing out the audio classification using the sample app here:
However, when I test the audio recording, it doesn’t classify the audio as Conversation, even though there is a Conversation class. It always comes back as Speech, and sometimes the score is very low even though I recorded an actual conversation while running the app. It also returns other classifications with low scores, like “Burping” or “Breathing”, that have no relation to the audio I recorded.
Does this mean the YAMNet model is not very accurate? I see that it was trained on AudioSet, which contains YouTube videos covering the various audio classes.
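For reference, here is roughly how I understand the scores I’m seeing: YAMNet emits an independent score for each of its 521 AudioSet classes per ~0.96 s frame, and the sample apps average over frames and show the top few. This is only a sketch with a dummy score matrix (the indices and values are made up for illustration, except that “Speech” really is class 0 in the YAMNet class map), but it mimics why a broad class like Speech can dominate while narrower ones like Conversation stay low:

```python
import numpy as np

NUM_CLASSES = 521  # YAMNet predicts 521 AudioSet classes per frame

# Dummy stand-in for model output: 3 frames x 521 scores (assumption for illustration).
rng = np.random.default_rng(0)
scores = rng.random((3, NUM_CLASSES)) * 0.1

# Simulate the broad "Speech" class (index 0 in the YAMNet class map)
# scoring high on every frame, while narrower classes stay low.
scores[:, 0] = 0.9

# Average across frames, then rank classes, as the sample apps do.
mean_scores = scores.mean(axis=0)
top5 = np.argsort(mean_scores)[::-1][:5]
print(top5[0])  # index 0 ("Speech") ranks first
```

So even on a real conversation recording, I would expect Speech to outrank Conversation, with unrelated classes occasionally surfacing at low scores.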
Thanks for any help.