Published September 21, 2025 | Version v1
Conference paper
Open
Scaling Self-Supervised Representation Learning for Symbolic Piano Performance
Authors/Creators
Description
We study the capabilities of generative autoregressive transformer models trained on large amounts of symbolic solo-piano transcriptions. After first pre-training on approximately 60,000 hours of music, we use a comparatively smaller, high-quality subset to fine-tune models to produce coherent musical generations, to perform symbolic classification tasks, and, by adapting the SimCLR framework to symbolic music, to produce general-purpose contrastive MIDI embeddings. The resulting models perform well on a variety of standard benchmarks, demonstrating the generalizability of the autoregressive representations learned during pre-training; they often require only a few hundred gradient updates to fully specialize to different generative and MIR tasks.
Files
000052.pdf
549.1 kB
md5:a0a5568fe45a0c52579b5a8b4cd0f5a2