Published November 20, 2017 | Version v2
Conference paper Open

Acoustic Scene Classification: From A Hybrid Classifier to Deep Learning

  • 1. Center for Research and Technology Hellas
  • 2. De Montfort University

Description

This report describes our contribution to the 2017 Detection and Classification of Acoustic Scenes and Events (DCASE) challenge. We investigated two approaches to the acoustic scene classification task. First, we combined time-domain and frequency-domain features with a hybrid Support Vector Machine-Hidden Markov Model (SVM-HMM) classifier, achieving an average accuracy over four folds of 80.9% on the development dataset and 61.0% on the evaluation dataset. Second, by exploiting data-augmentation techniques and using the whole segment as input (rather than splitting it into sub-sequences), we boosted the accuracy of our CNN system to 95.9%. However, because of the small number of kernels used in the CNN and its failure to capture the global information of the audio signals, it achieved an accuracy of 49.5% on the evaluation dataset. Both approaches outperformed the DCASE baseline method, which uses log-mel band energies for feature extraction and a Multi-Layer Perceptron (MLP) to achieve an average accuracy over four folds of 74.8%.
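
For readers unfamiliar with the log-mel band energies mentioned for the baseline, the following is a minimal NumPy sketch of how such features are commonly computed. It is not the baseline's actual implementation: the function names, the Hann window, the frame length, the hop size, and the number of mel bands are all illustrative assumptions, since this record does not state them.

```python
import numpy as np

def hz_to_mel(f):
    # Common HTK-style mel scale (an assumption; other variants exist).
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(sr, n_fft, n_mels):
    # Triangular filters whose edges are evenly spaced on the mel scale.
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    hz_pts = mel_to_hz(mel_pts)
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(n_mels):
        lo, ctr, hi = hz_pts[i], hz_pts[i + 1], hz_pts[i + 2]
        rising = (fft_freqs - lo) / (ctr - lo)
        falling = (hi - fft_freqs) / (hi - ctr)
        fb[i] = np.maximum(0.0, np.minimum(rising, falling))
    return fb

def log_mel_energies(signal, sr, n_fft=1024, hop=512, n_mels=40):
    # Hypothetical defaults; returns an array of shape (n_frames, n_mels).
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack(
        [signal[t * hop : t * hop + n_fft] * window for t in range(n_frames)]
    )
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2  # power spectrum per frame
    mel = power @ mel_filterbank(sr, n_fft, n_mels).T  # energy in each mel band
    return np.log(mel + 1e-10)  # small offset avoids log(0)
```

Calling `log_mel_energies(x, sr)` on a mono signal `x` yields one feature vector per frame, which a frame-level classifier such as an MLP could take as input.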

Files

DCASE2017Workshop_Vafeiadis_135.pdf

Size: 317.8 kB
md5:018d2ec24d0d161b19654c9ce96b5974

Additional details

Funding

European Commission
ACROSSING - Advanced TeChnologies and PlatfoRm fOr Smarter ASsisted LivING 676157