AugmentedNet: A Roman Numeral Analysis Network with Synthetic Training Examples and Additional Tonal Tasks

Néstor Nápoles López; Mark R H Gotham; Ichiro Fujinaga

doi:10.5281/zenodo.5624533

Published November 7, 2021 | Version v1

Conference paper Open

AugmentedNet is a new convolutional recurrent neural network for predicting Roman numeral labels. The network architecture is characterized by separate convolutional blocks for the bass and chromagram inputs. This layout is further enhanced by using synthetic training examples for data augmentation, and a greater number of tonal tasks to solve simultaneously via multitask learning. This paper reports the improved performance achieved by combining these ideas. The additional tonal tasks strengthen the shared representation learned through multitask learning. The synthetic examples, in turn, complement key transposition, which is often the only technique used for data augmentation in similar problems related to tonal music. The name "AugmentedNet" speaks to the increased number of both training examples and tonal tasks. We report on tests across six relevant and publicly available datasets: ABC, BPS, HaydnSun, TAVERN, When-in-Rome, and WTC. In our tests, our model outperforms recent methods for functional harmony analysis, such as other convolutional neural networks and Transformer-based models. Finally, we show a new method for reconstructing the full Roman numeral label, based on common Roman numeral classes, which leads to better results compared to previous methods.
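
The key transposition mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a simplified 12-bin pitch-class (chroma) vector per frame and ignores enharmonic spelling, and the names `transpose_chroma`, `transpose_key`, and `augment` are hypothetical. It relies on the fact that Roman numerals are relative to the key, so transposing the input and the key label together leaves the Roman numeral annotation unchanged.

```python
from typing import List, Tuple

PITCH_CLASSES = 12  # C, C#, D, ..., B (enharmonic spelling is ignored here)


def transpose_chroma(frame: List[float], semitones: int) -> List[float]:
    """Rotate a 12-bin pitch-class vector upward by `semitones`."""
    if len(frame) != PITCH_CLASSES:
        raise ValueError("expected a 12-bin pitch-class vector")
    shift = semitones % PITCH_CLASSES
    # The value stored at index i moves to index (i + shift) % 12.
    return [frame[(i - shift) % PITCH_CLASSES] for i in range(PITCH_CLASSES)]


def transpose_key(tonic_pc: int, semitones: int) -> int:
    """Shift the tonic pitch class of the key label by the same interval."""
    return (tonic_pc + semitones) % PITCH_CLASSES


def augment(
    frames: List[List[float]], tonic_pc: int
) -> List[Tuple[List[List[float]], int]]:
    """Return the example in all 12 transpositions (hypothetical helper).

    The Roman numeral labels are not touched: they are relative to the key,
    so they stay valid once input and key label move together.
    """
    return [
        ([transpose_chroma(f, s) for f in frames], transpose_key(tonic_pc, s))
        for s in range(PITCH_CLASSES)
    ]


# Usage: a C major triad (C, E, G) transposed up two semitones becomes D major.
c_major = [0.0] * PITCH_CLASSES
for pc in (0, 4, 7):
    c_major[pc] = 1.0
d_major = transpose_chroma(c_major, 2)
```

Transposition only relocates existing examples among keys, which is why the abstract describes the synthetic examples as a complement to it rather than a replacement.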

Files

000050.pdf (773.1 kB), md5:a229f85d69b203f1c81667117a10ef5b

Resource type

Conference paper

Publisher

ISMIR

Imprint

Proceedings of the 22nd International Society for Music Information Retrieval Conference, 404-411. Online.

Conference

International Society for Music Information Retrieval Conference (ISMIR 2021), Online, November 7-12, 2021

Creative Commons Attribution 4.0 International

The Creative Commons Attribution license allows re-distribution and re-use of a licensed work on the condition that the creator is appropriately credited.

Technical metadata

Created: October 30, 2021
Modified: July 17, 2024
