Published November 7, 2021 | Version v1
Conference paper Open

Deeper Convolutional Neural Networks and Broad Augmentation Policies Improve Performance in Musical Key Estimation

Description

In recent years, complex convolutional neural network architectures such as the Inception architecture have been shown to offer significant improvements over previous architectures in image classification. So far, little work has been done on applying these architectures to music information retrieval tasks, and most models still rely on sequential neural network architectures. In this paper, we adapt the Inception architecture to the specific needs of harmonic music analysis and use it to create a model (InceptionKeyNet) for the task of key estimation. We then show that the resulting model can significantly outperform state-of-the-art single-task models when trained on the same datasets. Additionally, we evaluate a broad range of augmentation methods and find that extending augmentation policies to include a more diverse set of methods further improves accuracy. Finally, we train both the proposed model and state-of-the-art single-task models on training datasets of different sizes and with different augmentation policies, and compare their generalization performance.

Files

000004.pdf (578.1 kB)
md5:3828b70f5e0c91b4a99a77dbc7972eb0