Detecting specific animal sounds - repurpose micro_speech?

Hello all,

I’m new here, but I’m working with a group trying to detect specific animal sounds in the wild using ML on ESP32 devices.

Initially I thought that repurposing the micro_speech TFLite Micro example might be a good approach. It has an example optimized for the ESP32, and it uses a trained model to recognize specific sounds. However, the more I’ve researched, the more it appears that the micro_speech example is (naturally enough) very specific to keyword detection, and that the FFT code in the example contains optimizations for human speech.

I’d appreciate it if anyone could suggest a better starting point or approach for this particular application.

Many thanks,

Owen

Hi Eletronic_Consult, welcome to the TF Forum!

For what you are trying to do, I’d start from here:

  1. Sound classification with YAMNet | TensorFlow Hub
  2. Transfer learning with YAMNet for environmental sound classification

and if you want to deploy it on the edge (phones, microcontrollers):

  3. Audio Classification | On-Device ML | Google for Developers

These links have a lot of great information to help you achieve what you want.
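To give a rough idea of what the transfer-learning approach looks like, here is a minimal sketch. It assumes you have already run each audio clip through YAMNet and kept its per-frame embeddings (1024 values per frame). To keep the example self-contained, the embeddings below are faked with random numbers, and the helper names (`pool_clip`, `train_head`, `make_fake_clip`) are made up for illustration; they are not part of any TensorFlow API. The idea is only to show the shape of the workflow: pool the frames of a clip into one vector, then train a small classifier head on top.

```python
import numpy as np

EMBEDDING_DIM = 1024  # size of one YAMNet frame embedding


def pool_clip(frame_embeddings):
    """Average per-frame embeddings (frames, dim) into one clip vector (dim,)."""
    return np.asarray(frame_embeddings).mean(axis=0)


def train_head(features, labels, epochs=200, lr=0.1):
    """Fit a logistic-regression head with plain gradient descent."""
    n, d = features.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(epochs):
        logits = features @ w + b
        probs = 1.0 / (1.0 + np.exp(-logits))
        grad = probs - labels  # gradient of the log loss w.r.t. the logits
        w -= lr * (features.T @ grad) / n
        b -= lr * grad.mean()
    return w, b


def predict(features, w, b):
    """Return 1 for 'target animal', 0 for 'anything else'."""
    return (features @ w + b > 0).astype(int)


def make_fake_clip(rng, is_target, frames=4):
    """Hypothetical stand-in for real embeddings: random frames, shifted per class."""
    shift = 0.5 if is_target else -0.5
    return rng.normal(loc=shift, scale=1.0, size=(frames, EMBEDDING_DIM))


rng = np.random.default_rng(0)
labels = np.array([0, 1] * 20)  # 40 synthetic clips, two classes
features = np.stack([pool_clip(make_fake_clip(rng, y == 1)) for y in labels])

w, b = train_head(features, labels)
accuracy = (predict(features, w, b) == labels).mean()
print(f"training accuracy on synthetic data: {accuracy:.2f}")
```

With real data you would replace `make_fake_clip` with embeddings computed from your own recordings, and hold out some clips for validation. The tutorial in link 2 does the same thing with a small Keras model instead of hand-written gradient descent.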

Hello,
May I ask how you solved the animal audio classification problem in the end? I am studying the use of micro_speech to detect environmental sound events in smart homes.
Thank you very much.

Hello, it’s been a while since I looked at this, but it seemed that the micro_speech model was insufficiently complex for animal sound recognition (at least in my use case). Depending on what device you intend to deploy your model to, you may wish to look at Edge Impulse as a solution, or perhaps apply transfer learning to a suitable model from Hugging Face. Best of luck!