A Data-Driven Analysis of Robust Automatic Piano Transcription

Edwards, Drew; Dixon, Simon; Benetos, Emmanouil; Maezawa, Akira; Kusaka, Yuta

doi:10.5281/zenodo.10610212

Published February 2, 2024 | Version 1.0.0

Model Open

1. Queen Mary University of London
2. Yamaha (Japan)

This is a retrained version of the model from [1], using the data augmentation techniques described in our pending IEEE Signal Processing Letters publication "A Data-Driven Analysis of Robust Automatic Piano Transcription".

MAPS test set out-of-dataset evaluation:

Model	Precision	Recall	F1
Hawthorne et al. [2]	87.5	85.6	86.4
Kong et al. [1]	78.3	87.2	82.4
Maman and Bermano [3]	88.2	86.5	87.3
Toyama et al. [4]	84.6	85.7	85.1
Ours	89.5	87.4	88.4

On the MAESTRO test set, we achieve a note onset F1 score of 96.6, compared to 96.7 for Kong et al. [1]. Previous publications ([1], [2], [4]) train without any data augmentation, as it has been shown to slightly hurt test set performance. We view this as plain overfitting and encourage future research to focus more on generalization than on in-distribution test set metrics.

Note: this checkpoint does not include pedal predictions, so you should use the Regress_onset_offset_frame_velocity_CRNN module when loading the weights.
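A minimal loading sketch for the note above, under several assumptions that are not confirmed by this record: that `pytorch/models.py` from the original repository is on the import path, that the class takes `frames_per_second` and `classes_num` arguments (100 and 88 are assumed values), and that the checkpoint is either a bare state dict or wraps one under a `"model"` key. The helper name `extract_state_dict` is hypothetical.

```python
from typing import Any, Mapping


def extract_state_dict(checkpoint: Mapping[str, Any]) -> Mapping[str, Any]:
    """Return the weights mapping from a loaded checkpoint.

    Assumption: the file is either a bare state dict or stores it under a
    "model" key. Inspect the checkpoint's keys if neither holds.
    """
    if "model" in checkpoint and isinstance(checkpoint["model"], Mapping):
        return checkpoint["model"]
    return checkpoint


def load_note_model(checkpoint_path: str):
    """Build the note-only CRNN (no pedal branch) and load the weights."""
    # Imported lazily so that extract_state_dict works without PyTorch.
    import torch
    # Assumption: models.py from the original repository is importable.
    from models import Regress_onset_offset_frame_velocity_CRNN

    # Assumption: 100 frames per second and 88 piano keys.
    model = Regress_onset_offset_frame_velocity_CRNN(
        frames_per_second=100, classes_num=88
    )
    checkpoint = torch.load(checkpoint_path, map_location="cpu")
    model.load_state_dict(extract_state_dict(checkpoint))
    model.eval()  # inference mode: disables dropout, freezes batch-norm stats
    return model
```

If `load_state_dict` reports missing or unexpected keys, print `checkpoint.keys()` first to see how the weights are actually stored.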

Source code of the original model implementation is available at: https://github.com/bytedance/piano_transcription

[1] Qiuqiang Kong, Bochen Li, Xuchen Song, Yuan Wan, and Yuxuan Wang, “High-resolution piano transcription with pedals by regressing onset and offset times,” IEEE/ACM Transactions on Audio, Speech and Language Processing, vol. 29, pp. 3707–3717, 2021.
[2] Curtis Hawthorne, Andriy Stasyuk, Adam Roberts, Ian Simon, Cheng-Zhi Anna Huang, Sander Dieleman, Erich Elsen, Jesse Engel, and Douglas Eck, “Enabling factorized piano music modeling and generation with the MAESTRO dataset,” in International Conference on Learning Representations, 2019.
[3] Ben Maman and Amit H. Bermano, “Unaligned supervision for automatic music transcription in the wild,” in International Conference on Machine Learning, ICML 2022, 17-23 July 2022, Baltimore, Maryland, USA. 2022, vol. 162 of Proceedings of Machine Learning Research, pp. 14918–14934, PMLR.
[4] Keisuke Toyama, Taketo Akama, Yukara Ikemiya, Yuhta Takida, Wei-Hsiang Liao, and Yuki Mitsufuji, “Automatic piano transcription with hierarchical frequency-time transformer,” in Proceedings of the 24th International Society for Music Information Retrieval Conference, Milan, Italy, 2023, pp. 215–222.

Files (103.8 MB)

Name	Size
high_resolution_MAESTRO_augmentations.pth (md5: cd449c03d690e97dfe5d7c311ac8fa8c)	103.8 MB
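
The MD5 checksum listed for the checkpoint can be used to confirm that a download is complete and uncorrupted. The following is a small sketch using only the Python standard library; the function names `md5_of_file` and `checkpoint_is_intact` are illustrative, and the local path is whatever you saved the file as.

```python
import hashlib

# Checksum published alongside high_resolution_MAESTRO_augmentations.pth
EXPECTED_MD5 = "cd449c03d690e97dfe5d7c311ac8fa8c"


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Read fixed-size chunks so the ~100 MB file is never fully in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checkpoint_is_intact(path: str) -> bool:
    """Return True if the file at `path` matches the published checksum."""
    return md5_of_file(path) == EXPECTED_MD5
```

For example, `checkpoint_is_intact("high_resolution_MAESTRO_augmentations.pth")` should return `True` for a correct download.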
