Acoustic Scene Classification: From A Hybrid Classifier to Deep Learning
Authors/Creators
- 1. Center for Research and Technology Hellas
- 2. De Montfort University
Description
This report describes our contribution to the 2017 Detection and
Classification of Acoustic Scenes and Events (DCASE) challenge.
We investigated two approaches to the acoustic scene classification task.
First, we combined time-domain and frequency-domain features with a hybrid
Support Vector Machine-Hidden Markov Model (SVM-HMM) classifier, achieving
an average accuracy over four folds of 80.9% on the development dataset
and 61.0% on the evaluation dataset. Second, by exploiting data
augmentation techniques and using the whole segment as input (as opposed
to splitting it into sub-sequences), we boosted the accuracy of our
convolutional neural network (CNN) system to 95.9%. However, owing to the
small number of kernels used in the CNN and its failure to capture the
global information of the audio signals, it achieved an accuracy of
49.5% on the evaluation dataset. Both approaches outperformed
the DCASE baseline method, which uses log-mel band energies for
feature extraction and a Multi-Layer Perceptron (MLP) to achieve
an average accuracy over four folds of 74.8%.
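
The "average accuracy over four folds" quoted above can be read as the mean of the per-fold accuracies. The following is a minimal sketch of that calculation, not the authors' evaluation code: the function names `accuracy` and `fold_averaged_accuracy` and the toy scene labels are assumptions made for illustration, and the official challenge scoring may aggregate results differently.

```python
from statistics import mean


def accuracy(y_true, y_pred):
    """Fraction of segments whose predicted scene matches the reference label."""
    if len(y_true) != len(y_pred):
        raise ValueError("label and prediction lists must have equal length")
    if not y_true:
        raise ValueError("cannot score an empty fold")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def fold_averaged_accuracy(folds):
    """Mean of the per-fold accuracies; each fold is a (y_true, y_pred) pair."""
    return mean(accuracy(y_true, y_pred) for y_true, y_pred in folds)


# Toy labels, purely illustrative (hypothetical, not data from the report).
toy_folds = [
    (["park", "bus", "home", "park"], ["park", "bus", "home", "bus"]),  # 3 of 4
    (["bus", "home", "park", "bus"], ["bus", "home", "park", "bus"]),   # 4 of 4
    (["home", "park", "bus", "home"], ["home", "bus", "bus", "home"]),  # 3 of 4
    (["park", "home", "bus", "park"], ["home", "home", "bus", "bus"]),  # 2 of 4
]

print(fold_averaged_accuracy(toy_folds))
```

Averaging per-fold scores weights every fold equally, which matches pooling all predictions only when the folds contain the same number of segments.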
Files
DCASE2017Workshop_Vafeiadis_135.pdf (317.8 kB, md5:018d2ec24d0d161b19654c9ce96b5974)