Published November 10, 2024 | Version v1
Conference paper | Open Access

RNBert: Fine-Tuning a Masked Language Model for Roman Numeral Analysis

Creators

Description

Music is plentiful, but labeled data for music theory tasks like Roman numeral analysis is scarce. Self-supervised pretraining is therefore a promising avenue for improving performance on these tasks, especially because, in learning a task like predicting masked notes, a model may acquire latent representations of music theory concepts like keys and chords. However, existing models for Roman numeral analysis have not used pretraining, instead training from scratch on labeled data, while conversely, pretrained models for music understanding have generally been applied to sequence-level tasks requiring little explicit music theory, such as composer classification. In contrast, this paper applies pretraining methods to a music theory task by fine-tuning a masked language model, MusicBERT, for Roman numeral analysis. We apply token classification to get a chord label for each note and then aggregate the predictions of simultaneous notes to obtain a single label at each time step. The resulting model substantially outperforms previous Roman numeral analysis models. Our approach can readily be extended to other note- and/or chord-level music theory tasks (e.g., nonharmonic tone analysis, melody harmonization).
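To make the aggregation step concrete, the sketch below shows one plausible way to combine per-note predictions into a single chord label per time step. It assumes the fine-tuned token-classification head has already produced a row of logits for each note; averaging the logits of notes that share an onset and taking the argmax is an illustrative assumption, not necessarily the authors' exact procedure.

```python
# Minimal sketch (illustrative assumptions, not the paper's exact pipeline):
# aggregate per-note chord logits into one label per onset time.
from collections import defaultdict
import numpy as np

def aggregate_by_onset(onsets, logits):
    """onsets: one onset time per note token.
    logits: array of shape (num_notes, num_chord_labels).
    Returns {onset_time: predicted_label_index}."""
    grouped = defaultdict(list)
    for onset, row in zip(onsets, logits):
        grouped[onset].append(row)
    # Average the logits of simultaneous notes, then pick the best label.
    return {
        onset: int(np.mean(rows, axis=0).argmax())
        for onset, rows in sorted(grouped.items())
    }

# Hypothetical usage: three notes, the first two sounding simultaneously.
example_onsets = [0.0, 0.0, 1.0]
example_logits = np.array([
    [2.0, 0.1, -1.0],   # note 1 at t=0.0
    [1.5, 0.3, -0.5],   # note 2 at t=0.0
    [-0.2, 1.8, 0.0],   # note 3 at t=1.0
])
print(aggregate_by_onset(example_onsets, example_logits))
# {0.0: 0, 1.0: 1}
```

Other aggregation rules (e.g., majority vote over per-note argmax labels) would fit the same description; the key point is that note-level token classification is reduced to a single chord label at each time step.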

Files (295.5 kB)

000089.pdf (295.5 kB)
md5:be3e20cb65d01feb7afd91016302ed5d