Deeper Convolutional Neural Networks and Broad Augmentation Policies Improve Performance in Musical Key Estimation

doi:10.5281/zenodo.5624477

International Society for Music Information Retrieval

Published November 7, 2021 | Version v1

Conference paper Open

Stefan A Baumann

In recent years, complex convolutional neural network architectures such as the Inception architecture have been shown to offer significant improvements over previous architectures in image classification. So far, little work has been done on applying these architectures to music information retrieval tasks, with most models still relying on sequential neural network architectures. In this paper, we adapt the Inception architecture to the specific needs of harmonic music analysis and use it to create a model (InceptionKeyNet) for the task of key estimation. We then show that the resulting model can significantly outperform state-of-the-art single-task models when trained on the same datasets. Additionally, we evaluate a broad range of augmentation methods and find that extending augmentation policies to include a more diverse set of methods further improves accuracy. Finally, we train both the proposed and state-of-the-art single-task models on training datasets of different sizes and with different augmentation policies, and compare their generalization performance.

Files

Name	Size
000004.pdf md5:3828b70f5e0c91b4a99a77dbc7972eb0	578.1 kB

Resource type

Conference paper

Publisher

ISMIR

Imprint

Proceedings of the 22nd International Society for Music Information Retrieval Conference, 42-49. Online.

Conference

International Society for Music Information Retrieval Conference (ISMIR 2021), Online, November 7-12, 2021

Creative Commons Attribution 4.0 International

The Creative Commons Attribution license allows re-distribution and re-use of a licensed work on the condition that the creator is appropriately credited.

Technical metadata

Created: October 30, 2021
Modified: November 3, 2021
