MuseSyn: A dataset for complete automatic piano music transcription research
Description
MuseSyn (v1.0) is a dataset created for complete automatic music transcription, consisting of 210 pieces of piano music. Music scores in MusicXML format were collected from the MuseScore website, converted into MIDI format, and synthesized to audio files using four different piano models provided in the Native Instruments Kontakt Player. The collected scores cover a variety of key signatures, time signatures, tempos, modes, and polyphony levels, but do not contain elements such as grace notes, triplets, and trios.
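As a rough illustration of the constraints described above, the following sketch counts grace notes and tuplet-modified notes in a MusicXML score using only the Python standard library. It assumes uncompressed MusicXML (not zipped .mxl) and relies on the standard MusicXML elements `<grace>` and `<time-modification>`. The function name and the inline sample score are hypothetical and are not part of the dataset or its tooling.

```python
import xml.etree.ElementTree as ET


def count_excluded_elements(musicxml_text: str) -> dict:
    """Count notes, grace notes, and tuplet-modified notes in a MusicXML string."""
    root = ET.fromstring(musicxml_text)
    notes = list(root.iter("note"))
    # A grace note is a <note> that contains a <grace/> child element.
    grace = sum(1 for n in notes if n.find("grace") is not None)
    # Tuplet members (e.g. triplets) carry a <time-modification> child.
    tuplets = sum(1 for n in notes if n.find("time-modification") is not None)
    return {"notes": len(notes), "grace_notes": grace, "tuplet_notes": tuplets}


# Hypothetical miniature score, for demonstration only.
SAMPLE = """
<score-partwise version="3.1">
  <part id="P1">
    <measure number="1">
      <note><pitch><step>C</step><octave>4</octave></pitch><duration>4</duration></note>
      <note><grace/><pitch><step>D</step><octave>4</octave></pitch></note>
      <note>
        <pitch><step>E</step><octave>4</octave></pitch><duration>2</duration>
        <time-modification>
          <actual-notes>3</actual-notes><normal-notes>2</normal-notes>
        </time-modification>
      </note>
    </measure>
  </part>
</score-partwise>
"""

counts = count_excluded_elements(SAMPLE)
print(counts)
```

For a score that follows the description above, both `grace_notes` and `tuplet_notes` would be expected to be zero. The sample here deliberately includes one of each so that the counters can be seen working.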
Music scores are provided in MIDI and MusicXML formats. Synthesized audio files are saved in lossless compressed FLAC (.flac) format at a 44100 Hz sample rate with 16-bit encoding depth. An equal level of reverb is added during synthesis to make the audio files more similar to real recordings.
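The audio properties stated above can be checked by reading the STREAMINFO metadata block at the start of a FLAC file. The sketch below does this with the Python standard library only. The bit layout follows the FLAC format specification, while the function names and the synthetic header used for the demonstration are hypothetical. No file name or directory layout of the dataset is assumed.

```python
def parse_flac_streaminfo(data: bytes) -> dict:
    """Parse sample rate, channel count, and bit depth from the first 42 bytes of a FLAC stream."""
    if data[:4] != b"fLaC":
        raise ValueError("not a FLAC stream (missing 'fLaC' marker)")
    # Metadata block header: 1 bit last-block flag, 7 bits block type, 24 bits length.
    if data[4] & 0x7F != 0:
        raise ValueError("first metadata block is not STREAMINFO")
    body = data[8:8 + 34]
    if len(body) < 34:
        raise ValueError("truncated STREAMINFO block")
    # Bytes 10..17 pack: 20-bit sample rate, 3-bit (channels - 1),
    # 5-bit (bits per sample - 1), 36-bit total sample count.
    packed = int.from_bytes(body[10:18], "big")
    return {
        "sample_rate": packed >> 44,
        "channels": ((packed >> 41) & 0x7) + 1,
        "bits_per_sample": ((packed >> 36) & 0x1F) + 1,
        "total_samples": packed & ((1 << 36) - 1),
    }


def read_flac_info(path: str) -> dict:
    """Read the stream properties of a FLAC file on disk."""
    with open(path, "rb") as f:
        return parse_flac_streaminfo(f.read(42))


def make_demo_header(sample_rate: int, channels: int, bits: int, total: int) -> bytes:
    """Build a synthetic FLAC header (demonstration only, not a playable file)."""
    packed = (sample_rate << 44) | ((channels - 1) << 41) | ((bits - 1) << 36) | total
    streaminfo = (
        (4096).to_bytes(2, "big") * 2   # min and max block size
        + bytes(6)                      # min and max frame size (unknown)
        + packed.to_bytes(8, "big")
        + bytes(16)                     # MD5 signature (unset)
    )
    return b"fLaC" + bytes([0x80]) + (34).to_bytes(3, "big") + streaminfo


info = parse_flac_streaminfo(make_demo_header(44100, 2, 16, 441000))
print(info)
```

To check a downloaded file, call `read_flac_info` with its path and compare `sample_rate` and `bits_per_sample` against 44100 and 16. The channel count in the demo header is an arbitrary choice, since the description does not state whether the audio is mono or stereo.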
Please send any feedback or questions to Lele Liu at lele.liu@qmul.ac.uk.
How to cite: If you use this dataset, please provide the following citation in your work:
- L. Liu, V. Morfi and E. Benetos, "Joint Multi-pitch Detection and Score Transcription for Polyphonic Piano Music", IEEE International Conference on Acoustics, Speech and Signal Processing, 2021.
Funding: L. Liu is a research student at the UKRI Centre for Doctoral Training in Artificial Intelligence and Music, supported jointly by the China Scholarship Council and Queen Mary University of London.
References
- Lele Liu, Veronica Morfi, and Emmanouil Benetos. (2021). MuseSyn: A dataset for complete automatic piano music transcription research (Version 1.0). [Dataset]. Zenodo.