Audio Content Analysis for Unobtrusive Event Detection in Smart Homes
Creators
- 1. Center for Research and Technology Hellas
- 2. De Montfort University
Description
Environmental sound signals are multi-source, heterogeneous, and time-varying. Many systems have been proposed to process such signals for event detection in ambient assisted living applications. Typically, these systems use feature extraction, selection, and classification. However, despite major advances, several important questions remain unanswered, especially in real-world settings. This paper contributes to the body of knowledge in the field by addressing the following problems for ambient sounds recorded in various real-world kitchen environments: (1) which features and which classifiers are most suitable in the presence of background noise? (2) what is the effect of signal duration on recognition accuracy? (3) how do the signal-to-noise ratio and the distance between the microphone and the audio source affect the recognition accuracy in an environment in which the system was not trained? We show that for systems that use traditional classifiers, it is beneficial to combine gammatone frequency cepstral coefficients and discrete wavelet transform coefficients and to use a gradient boosting classifier. For systems based on deep learning, we consider 1D and 2D Convolutional Neural Networks (CNN) using mel-spectrogram energies and mel-spectrogram images as inputs, respectively, and show that the 2D CNN outperforms the 1D CNN. We obtained competitive classification results for two such systems. The first one, which uses a gradient boosting classifier, achieved an F1-score of 90.2% and a recognition accuracy of 91.7%. The second one, which uses a 2D CNN with mel-spectrogram images, achieved an F1-score of 92.7% and a recognition accuracy of 96%.
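The first system described above combines cepstral and wavelet features with a gradient boosting classifier. The sketch below is not the authors' code; it is a minimal illustration of that kind of pipeline. The GFCC extractor is a hypothetical placeholder (`gfcc_fn`), since gammatone front ends are not part of the common audio libraries, while the DWT statistics and the classifier use `pywt` and `scikit-learn`. Feature statistics, wavelet choice, and decomposition level are illustrative assumptions.

```python
# Sketch only: combine GFCC statistics and DWT statistics, then train
# a gradient boosting classifier. Not the paper's exact feature set.
import numpy as np
import pywt
from sklearn.ensemble import GradientBoostingClassifier

def dwt_features(signal, wavelet="db4", level=4):
    """Summarise discrete wavelet transform coefficients per decomposition band."""
    coeffs = pywt.wavedec(signal, wavelet, level=level)
    feats = []
    for band in coeffs:
        # Simple per-band statistics (assumed, for illustration)
        feats += [np.mean(band), np.std(band), np.sum(band ** 2)]
    return np.array(feats)

def extract_features(signal, sr, gfcc_fn):
    """Concatenate GFCC statistics and DWT statistics into one feature vector.

    `gfcc_fn` is a hypothetical callable returning a (frames x coefficients)
    GFCC matrix for the given signal and sample rate.
    """
    gfcc = gfcc_fn(signal, sr)
    gfcc_stats = np.hstack([gfcc.mean(axis=0), gfcc.std(axis=0)])
    return np.hstack([gfcc_stats, dwt_features(signal)])

# Usage (placeholders): X is a list of 1-D audio signals, y the event labels.
# X_feat = np.vstack([extract_features(x, 16000, my_gfcc) for x in X])
# clf = GradientBoostingClassifier().fit(X_feat, y)
```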
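The second system feeds mel-spectrogram images to a 2D CNN. Again as a rough sketch rather than the paper's architecture, the example below computes a log-scaled mel spectrogram with `librosa` and builds a small convolutional classifier with Keras; the layer sizes, input shape, and number of classes are assumptions for illustration.

```python
# Sketch only: a small 2-D CNN over log-mel-spectrogram "images".
import numpy as np
import librosa
import tensorflow as tf

def log_mel_image(signal, sr=16000, n_mels=64):
    """Compute a log-scaled mel spectrogram, shaped as a single-channel image."""
    mel = librosa.feature.melspectrogram(y=signal, sr=sr, n_mels=n_mels)
    return librosa.power_to_db(mel, ref=np.max)[..., np.newaxis]

def build_cnn(input_shape, n_classes):
    """Illustrative 2-D CNN; the paper's architecture may differ."""
    return tf.keras.Sequential([
        tf.keras.layers.Conv2D(16, 3, activation="relu", input_shape=input_shape),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Conv2D(32, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(n_classes, activation="softmax"),
    ])

# Usage (placeholders): fixed-size spectrogram images and integer labels.
# model = build_cnn(input_shape=(64, 128, 1), n_classes=10)
# model.compile(optimizer="adam",
#               loss="sparse_categorical_crossentropy",
#               metrics=["accuracy"])
```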
Files
Engineering_Applications_Elsevier.pdf (2.5 MB, md5:a9e8ab549829da1b5b66f6bc2fad8f8a)