How to Train Micro Speech Model with DepthwiseConv2D Instead of Conv2D?

Hi everyone,

I’m working with the micro speech example from the tflite-micro GitHub repository and have run into an unexpected difference in the model architecture when retraining.

Observation:

  • The pre-trained micro_speech_quantized.tflite model (included in the micro_speech example) uses DepthwiseConv2D layers, as documented in train/README.md.

  • However, when I train a new model using train_micro_speech_model.ipynb in Google Colab, the resulting model contains Conv2D layers instead of DepthwiseConv2D (see the snippet below for how I checked the ops in each file).
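For reference, here’s roughly how I’m checking which ops end up in each .tflite file. This assumes a TensorFlow version recent enough to include `tf.lite.experimental.Analyzer` (2.9+, I believe); the file names are just placeholders for the pre-trained model and my retrained one:

```python
import tensorflow as tf

# Print an op-by-op breakdown of each converted model; the listing shows
# whether the graph contains CONV_2D or DEPTHWISE_CONV_2D operators.
tf.lite.experimental.Analyzer.analyze(model_path="micro_speech_quantized.tflite")
tf.lite.experimental.Analyzer.analyze(model_path="my_retrained_model.tflite")
```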

Context: I suspect this is related to the TensorFlow version used during training. Since DepthwiseConv2D is more efficient for microcontroller deployment, I’d prefer to train with that architecture.
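For concreteness, the kind of model I’d like training to produce would look roughly like the sketch below. The (49, 40, 1) input shape matches the micro speech spectrogram features; the kernel size, depth_multiplier, and dense head are my own guesses for illustration, not the actual topology of the pre-trained model:

```python
import tensorflow as tf
from tensorflow.keras import layers

def build_depthwise_model(num_labels=4):
    """Illustrative sketch: DepthwiseConv2D front end plus a small dense classifier."""
    return tf.keras.Sequential([
        layers.Input(shape=(49, 40, 1)),            # 49 time slices x 40 frequency bins
        layers.DepthwiseConv2D(kernel_size=(10, 8),
                               strides=(2, 2),
                               depth_multiplier=8,  # guessed; expands the single input channel
                               padding="same",
                               activation="relu"),
        layers.Dropout(0.5),
        layers.Flatten(),
        layers.Dense(num_labels, activation="softmax"),
    ])
```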

Questions:

  1. What’s the recommended approach to ensure the trained model uses DepthwiseConv2D layers?

  2. Is there a specific TensorFlow version I should use with this notebook?

  3. Are there any configuration parameters or model architecture flags I should set? (The training cell I’m currently using is quoted after this list for reference.)
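Regarding question 3: the only architecture-related setting I’ve found so far is the training cell in the notebook, which invokes the speech_commands training script. Quoting from memory (so the exact flags and values in the current notebook may differ), the relevant part of the Colab cell looks something like this:

```python
# Approximate Colab cell -- other flags (data paths, wanted words,
# training steps, learning rates, etc.) omitted here.
MODEL_ARCHITECTURE = 'tiny_conv'

!python tensorflow/tensorflow/examples/speech_commands/train.py \
  --model_architecture={MODEL_ARCHITECTURE} \
  --preprocess=micro
```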

Environment:

  • Notebook: train_micro_speech_model.ipynb from tflite-micro repo

  • Platform: Google Colab

  • TensorFlow version: [I can provide this if it helps]

I’d really appreciate any guidance on this. Thank you in advance for your help!