I’m new here, but I’m working with a group trying to detect specific animal sounds in the wild using ML on ESP32 devices.
Initially I thought that repurposing the micro_speech TFLite Micro example might be a good approach. It has an example optimized for the ESP32, and it uses a trained model to recognize specific sounds. However, the more I’ve researched, the more it appears that the micro_speech example is (naturally enough) very specific to keyword detection, and that the FFT code in the example contains optimizations for human speech.
I’d appreciate it if anyone could suggest a better starting point or approach for this particular application.
Hello,
May I ask how you solved the animal sound classification problem in the end? I am trying to use micro_speech to detect environmental sound events in smart homes.
Thank you very much.
Hello, it’s been a while since I looked at this, but it seemed that the micro_speech model was insufficiently complex for animal sound recognition (at least in my use case). Depending on what device you intend to deploy your model to, you may wish to look at Edge Impulse as a solution, or perhaps apply transfer learning to a suitable model from Hugging Face. Best of luck!
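In case it helps anyone experimenting offline before choosing a platform, here is a minimal sketch of a feature front end that makes no speech-specific assumptions. It is not the micro_speech front end and not anything Edge Impulse or Hugging Face provides; the function name `band_log_spectrogram` and all parameter defaults (frame length, hop, the 2 to 8 kHz band, 16 bands) are hypothetical placeholders you would tune to the target animal's call. It only uses NumPy.

```python
import numpy as np

def band_log_spectrogram(signal, sample_rate, frame_len=512, hop=256,
                         f_min=2000.0, f_max=8000.0, n_bands=16):
    """Log energy in n_bands linearly spaced bands between f_min and f_max.

    All defaults are illustrative assumptions, not values from micro_speech.
    The signal must contain at least frame_len samples.
    """
    signal = np.asarray(signal, dtype=np.float64)
    window = np.hanning(frame_len)                        # taper each frame
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / sample_rate)
    edges = np.linspace(f_min, f_max, n_bands + 1)        # band boundaries in Hz
    n_frames = 1 + (len(signal) - frame_len) // hop
    features = np.empty((n_frames, n_bands))
    for i in range(n_frames):
        frame = signal[i * hop : i * hop + frame_len] * window
        power = np.abs(np.fft.rfft(frame)) ** 2           # power spectrum
        for b in range(n_bands):
            mask = (freqs >= edges[b]) & (freqs < edges[b + 1])
            # Small constant avoids log(0) on silent bands.
            features[i, b] = np.log(power[mask].sum() + 1e-10)
    return features

# Usage: one second of a synthetic 5.2 kHz tone sampled at 16 kHz.
t = np.arange(16000) / 16000.0
tone = np.sin(2 * np.pi * 5200.0 * t)
feats = band_log_spectrogram(tone, 16000)
print(feats.shape)  # (frames, bands)
```

The resulting frames-by-bands array is the kind of input a small classifier could be trained on; whether linear bands or a mel-style spacing works better would depend on the species and would need testing on real recordings.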