Sub-mW Keyword Spotting on an MCU: Analog Binary Feature Extraction and Binary Neural Networks
Creators
- 1. CT-irst Fondazione Bruno Kessler, Trento, Italy
- 2. Huawei Technologies, Computing Systems Laboratory, Zurich Research Center, Switzerland
- 3. ETH Zurich, Switzerland
- 4. ETH Zurich, Switzerland & University of Bologna, Italy
Description
Keyword spotting (KWS) is a crucial function enabling interaction with the many ubiquitous smart devices in our surroundings, either by activating them through a wake word or by serving directly as a human-computer interface. For many applications, KWS is the entry point for our interactions with the device and is thus an always-on workload. Many smart devices are mobile, and their battery lifetime is heavily impacted by continuously running services. KWS and similar always-on services are therefore a primary focus when optimizing overall power consumption.
This work addresses the energy efficiency of KWS on low-cost microcontroller units (MCUs). We combine analog binary feature extraction with binary neural networks. By replacing the digital preprocessing with the proposed analog front-end, we show that the energy required for data acquisition and preprocessing can be reduced by 29x, cutting its share from a dominating 85% to a mere 16% of the overall energy consumption for our reference KWS application.
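The two figures quoted above (a 29x reduction and a share falling from 85% to 16%) can be cross-checked with a few lines of arithmetic. The sketch below is only a sanity check, not the authors' energy model: it assumes that the energy of the rest of the pipeline (for example, inference) is unchanged when the front-end is replaced, and the function name `share_after_reduction` is hypothetical.

```python
def share_after_reduction(share_before: float, reduction: float) -> float:
    """Energy share of one stage after its energy is divided by `reduction`.

    Assumption (not stated in the source): the energy of all other
    stages stays the same. Energies are normalized so that the original
    total equals 1.0.
    """
    stage_after = share_before / reduction   # reduced stage energy
    rest = 1.0 - share_before                # unchanged remainder
    return stage_after / (stage_after + rest)


# Figures taken from the abstract: 85% share before, 29x reduction.
new_share = share_after_reduction(0.85, 29.0)
print(f"Acquisition + preprocessing share after reduction: {new_share:.1%}")
```

Under that assumption the result is about 16.3%, which agrees with the 16% stated in the abstract.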
Experimental evaluations on the Speech Commands Dataset show that the proposed system outperforms the state of the art in accuracy and energy efficiency by 1% and 4.3x, respectively, on a 10-class dataset, while providing a compelling accuracy-energy trade-off, including a 2% accuracy drop for a 71x energy reduction.
Files
Keyword_Spotting_Cerutti.pdf
Name | Size |
---|---|
Keyword_Spotting_Cerutti.pdf (md5:816bddc869c9879164c669bb64027c50) | 4.3 MB |
Additional details
Related works
- Is published in
- Journal article: 10.1109/TCSI.2022.3142525 (DOI)
- Is supplemented by
- Dataset: https://www.tensorflow.org/datasets/catalog/speech_commands (URL)