Published August 21, 2022 | Version v1
Journal article Open

MA-CNN: Multi-augmented data classification using 2D-CNN with kaiming initialization for environmental sound classification

  • 1. School of Computer Science and Technology, Dalian University of Technology, Dalian 116024, China

Description

Deep learning based systems have greatly improved performance in audio classification tasks such as speech recognition and environmental sound classification. Training on small-scale corpora remains a challenge, however, because these systems need a large amount of training data and such data is not easy to obtain. To address this problem, this paper proposes multi-augmented data classification using a convolutional neural network with Kaiming initialization for limited resources (MA-CNN). The contributions of this work are twofold. First, we propose a 2-dimensional convolutional neural network with Kaiming initialization, using only convolutional and fully connected layers. This initialization prevents the network's activations from exploding during the forward pass. Second, to mitigate data scarcity and over-fitting and to improve model robustness, we use multiple data augmentation techniques that increase the quantity of training data. The proposed method outperformed a CNN trained without data augmentation.
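The two ideas in the description can be illustrated with a minimal sketch. It is not the authors' implementation. The record does not say which augmentation techniques or layer sizes the paper uses, so the circular time shift, the additive Gaussian noise, and all function names and parameters below are assumptions chosen for illustration. The Kaiming (He) normal rule itself, a standard deviation of sqrt(2 / fan_in) for layers followed by ReLU, is the standard formulation.

```python
import math
import random


def kaiming_std(in_channels, kernel_h, kernel_w):
    """Standard deviation of Kaiming (He) normal init: sqrt(2 / fan_in)."""
    fan_in = in_channels * kernel_h * kernel_w
    return math.sqrt(2.0 / fan_in)


def kaiming_normal_kernel(out_channels, in_channels, kernel_h, kernel_w, rng):
    """Draw a 2D-conv weight tensor (nested lists) from N(0, 2 / fan_in)."""
    std = kaiming_std(in_channels, kernel_h, kernel_w)
    return [
        [
            [[rng.gauss(0.0, std) for _ in range(kernel_w)] for _ in range(kernel_h)]
            for _ in range(in_channels)
        ]
        for _ in range(out_channels)
    ]


def time_shift(signal, shift):
    """Circularly shift a 1-D signal to the right by `shift` samples (assumed augmentation)."""
    if not signal:
        return []
    shift %= len(signal)
    return signal[-shift:] + signal[:-shift]


def add_noise(signal, noise_std, rng):
    """Add zero-mean Gaussian noise to every sample (assumed augmentation)."""
    return [x + rng.gauss(0.0, noise_std) for x in signal]


def augment(signal, rng, max_shift=2, noise_std=0.01):
    """Return the original clip plus one shifted and one noisy copy."""
    shift = rng.randint(1, max_shift)
    return [list(signal), time_shift(signal, shift), add_noise(signal, noise_std, rng)]


rng = random.Random(0)
kernel = kaiming_normal_kernel(out_channels=8, in_channels=4, kernel_h=3, kernel_w=3, rng=rng)
clips = augment([0.1, 0.2, 0.3, 0.4], rng)
```

In this sketch each input clip yields three training examples, which is the sense in which augmentation increases the quantity of training data. Scaling the weight variance by 2 / fan_in is what keeps the activation scale roughly constant from layer to layer under ReLU.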

Files

5 (7) 167-174.pdf (506.6 kB, md5:b3e8f9818c7e7734d96341c1a132759a)