Audio classification using TensorFlow Lite YAMNet model

I’m testing audio classification with the sample app here:

However, when I test with recorded audio, it doesn’t seem to classify the recording as Conversation, even though there is a Conversation class. It is always labeled Speech, and sometimes the score is quite low even though I recorded an actual conversation while running the app. It also returns other classifications with low scores, like “Burping” or “Breathing”, that have no relation to the audio I recorded.

Does this mean that the YAMNet model is not very accurate? I see that it was trained on AudioSet, which contains YouTube clips of the various audio classes.
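For context, my understanding is that YAMNet emits a score for every AudioSet class on each audio frame, and the sample apps average those per-frame scores over time before ranking labels, so broad classes like Speech and narrower ones like Conversation both get scores on the same clip. Here is a minimal NumPy sketch of that ranking step (the class names and score values below are made up for illustration, not real model output):

```python
import numpy as np

# Hypothetical per-frame scores, shape (num_frames, num_classes).
# These class names are just a tiny subset of the 521 AudioSet labels.
class_names = ["Speech", "Conversation", "Breathing", "Burping, eructation"]
frame_scores = np.array([
    [0.90, 0.15, 0.05, 0.02],
    [0.85, 0.20, 0.08, 0.01],
    [0.92, 0.10, 0.04, 0.03],
])

# Average over time, then rank classes by mean score.
mean_scores = frame_scores.mean(axis=0)
ranked = np.argsort(mean_scores)[::-1]
for i in ranked:
    print(f"{class_names[i]}: {mean_scores[i]:.2f}")
```

With made-up numbers like these, Speech dominates the ranking while Conversation and the unrelated classes trail with low mean scores, which matches what I see in the app.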

Thanks for any help.