Published September 21, 2025 | Version v1
Conference paper
Open
Improving BERT for Symbolic Music Understanding Using Token Denoising and Pianoroll Prediction
Authors/Creators
Description
In this work, we propose a pre-trained BERT-like model for symbolic music understanding that achieves competitive performance across a wide range of downstream tasks. To this end, we design two novel pre-training objectives: token correction and pianoroll prediction. First, we sample a portion of the note tokens, corrupt them with a limited amount of noise, and train the model to denoise the corrupted tokens. Second, we train the model to predict, from each token, the corresponding bar-level and tatum-level pianoroll-derived representations. We argue that these objectives guide the model to better learn specific musical knowledge such as pitch intervals. For evaluation, we propose a benchmark of 12 downstream tasks ranging from chord estimation to symbolic genre classification. Results demonstrate the effectiveness of the proposed pre-training objectives on the majority of the downstream tasks.
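To make the two objectives concrete, the following is a minimal sketch and not the authors' implementation. It assumes note tokens are plain MIDI pitch numbers, that "a limited amount of noise" means a small bounded semitone shift, and that a bar-level pianoroll-derived target can be approximated by a 12-dimensional pitch-class vector. The function names, the 15% sampling ratio, and the shift bound are all hypothetical choices made for illustration.

```python
import random


def corrupt_pitches(pitches, corrupt_ratio=0.15, max_shift=2, seed=None):
    """Corrupt a sampled portion of pitch tokens with bounded noise.

    Hypothetical illustration: returns the corrupted sequence and the
    sorted indices of the tokens that were altered. A denoising model
    would be trained to recover the original pitch at those indices.
    """
    rng = random.Random(seed)
    if not pitches:
        return [], []
    n_select = max(1, round(len(pitches) * corrupt_ratio))
    selected = sorted(rng.sample(range(len(pitches)), n_select))
    corrupted = list(pitches)
    for i in selected:
        # Only nonzero shifts that keep the pitch in the MIDI range 0..127,
        # so every selected token really changes.
        candidates = [
            s
            for s in range(-max_shift, max_shift + 1)
            if s != 0 and 0 <= pitches[i] + s <= 127
        ]
        corrupted[i] = pitches[i] + rng.choice(candidates)
    return corrupted, selected


def bar_pitch_class_targets(notes):
    """Build one 12-dimensional multi-hot pitch-class vector per bar.

    Assumed stand-in for a bar-level pianoroll-derived target: `notes`
    is a list of (bar_index, midi_pitch) pairs.
    """
    targets = {}
    for bar, pitch in notes:
        targets.setdefault(bar, [0] * 12)[pitch % 12] = 1
    return targets


if __name__ == "__main__":
    melody = [60, 62, 64, 65, 67, 69, 71, 72]
    noisy, altered = corrupt_pitches(melody, corrupt_ratio=0.25, seed=0)
    print("corrupted:", noisy, "altered indices:", altered)
    print(bar_pitch_class_targets([(0, 60), (0, 64), (1, 67)]))
```

Bounding the shift keeps each corrupted token close to the original, so recovering it depends on the interval relationships with neighbouring notes. This matches the abstract's stated motivation only loosely; the actual noise model and target representations are defined in the paper itself.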
Files
| Name | Size |
|---|---|
| 000051.pdf (md5:0ae07cb62934cd682ed09e44cf9aea21) | 428.8 kB |