Scalable neural architectures for end-to-end environmental sound classification

Francesco Paissan; Alberto Ancilotto; Alessio Brutti; Elisabetta Farella

doi:10.5281/zenodo.6351853

Published March 14, 2022 | Version v1

Conference paper | Open access

1. Digital Society (DiGis) center - Fondazione Bruno Kessler

Sound Event Detection (SED) is a complex task that emulates the human ability to recognize what is happening in the surroundings from auditory signals alone. This technology is a crucial asset in many applications, such as smart cities, where urban sounds can be detected and processed by embedded devices in an Internet of Things (IoT) network to identify events that are meaningful for municipalities or law enforcement. However, while current deep learning techniques for SED are effective, they are also resource- and power-hungry, and thus not suited to pervasive battery-powered devices. In this paper, we propose novel neural architectures based on PhiNets for real-time acoustic event detection on microcontroller units. The proposed models scale easily to fit the hardware requirements and can operate on both spectrograms and waveforms. In particular, our architectures achieve state-of-the-art performance on UrbanSound8K in spectrogram classification (around 77%) with extreme compression factors (99.8%) relative to current state-of-the-art architectures.
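To make the quoted compression figure concrete, the following is a minimal sketch of how such a percentage could be computed. It assumes the compression factor is the fraction of a baseline model's parameters removed by the compact model; the abstract does not state the exact definition. The function name and both parameter counts are hypothetical, chosen only to illustrate the arithmetic, and are not taken from the paper.

```python
def compression_factor(baseline_params: int, compact_params: int) -> float:
    """Fraction of the baseline's parameters removed by the compact model.

    Assumed definition for illustration: 1 - compact / baseline.
    """
    if baseline_params <= 0:
        raise ValueError("baseline_params must be positive")
    if compact_params < 0:
        raise ValueError("compact_params must be non-negative")
    return 1.0 - compact_params / baseline_params


if __name__ == "__main__":
    # Hypothetical parameter counts, NOT values reported in the paper.
    baseline = 10_000_000
    compact = 20_000
    print(f"compression factor: {compression_factor(baseline, compact):.1%}")
```

Under this assumed definition, a factor of 99.8% means the compact model keeps roughly one parameter for every 500 in the baseline.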

Files (364.3 kB)

Name	Checksum	Size
ICASSP2022_SEDwithPhiNets.pdf	md5:b2b16e435b68a9c1fdfad2cf6a9f1ba3	364.3 kB
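The MD5 checksum listed with the file can be used to confirm that a downloaded copy arrived intact. The sketch below uses only Python's standard `hashlib` module; the function names and the local file path are assumptions made for illustration, and the expected digest is the one shown in the listing above. MD5 is adequate for detecting accidental corruption but is not a safeguard against deliberate tampering.

```python
import hashlib
from pathlib import Path

# Checksum copied from the file listing in this record.
EXPECTED_MD5 = "b2b16e435b68a9c1fdfad2cf6a9f1ba3"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_record(path, expected=EXPECTED_MD5):
    """True if the file's MD5 equals the expected hex digest (case-insensitive)."""
    return md5_of_file(path) == expected.strip().lower()


if __name__ == "__main__":
    # Assumed local path: wherever the PDF was saved after download.
    pdf = Path("ICASSP2022_SEDwithPhiNets.pdf")
    if pdf.exists():
        print("checksum OK" if matches_record(pdf) else "checksum mismatch")
    else:
        print(f"{pdf} not found; download it first")
```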

Additional details

Related works

Is published in: Conference paper: 10.1109/ICASSP43922.2022.9746093 (DOI)
Is supplemented by: Software: https://github.com/fpaissan/phinet_pl (URL); Dataset: https://urbansounddataset.weebly.com (URL)

Funding

European Commission
MARVEL – Multimodal Extreme Scale Data Analytics for Smart Cities Environments (grant 957337)

	All versions	This version
Views	198	198
Downloads	231	229
Data volume	88.9 MB	88.2 MB
