PiJAMA: Piano Jazz with Automatic MIDI Annotations
Description
Release of the automatic MIDI transcriptions that constitute the PiJAMA dataset. See below for the abstract of the publication.
Abstract:
Recent advances in automatic piano transcription have enabled large-scale analysis of piano music in the symbolic domain. However, the research has largely focused on classical piano music. We present PiJAMA (Piano Jazz with Automatic MIDI Annotations): a dataset of over 200 hours of solo jazz piano performances with automatically transcribed MIDI. In total there are 2,777 unique performances by 120 different pianists across 244 recorded albums. The dataset contains a mixture of studio recordings and live performances. We use automatic audio tagging to identify applause, spoken introductions, and other non-piano audio to facilitate downstream music information retrieval tasks. We explore descriptive statistics of the MIDI data, including pitch histograms and chromaticism. We then demonstrate two experimental benchmarks on the data: performer identification and generative modeling. The dataset, including a link to the associated source code, is available at https://almostimplemented.github.io/PiJAMA/.
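As a rough illustration of the kind of pitch histogram mentioned in the abstract, the sketch below computes a normalized pitch-class histogram from a list of MIDI note numbers. It is not the authors' code: the function name `pitch_class_histogram` is hypothetical, and it assumes the note numbers have already been extracted from a transcription with a MIDI parser of your choice.

```python
from collections import Counter

# Pitch-class names indexed by MIDI note number modulo 12 (60 = middle C).
PITCH_CLASS_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def pitch_class_histogram(midi_pitches):
    """Return the relative frequency of each of the 12 pitch classes.

    `midi_pitches` is any iterable of integer MIDI note numbers (0-127).
    Hypothetical helper for illustration; not part of the PiJAMA code.
    """
    counts = Counter(pitch % 12 for pitch in midi_pitches)
    total = sum(counts.values())
    if total == 0:
        # No notes: return an all-zero histogram instead of dividing by zero.
        return {name: 0.0 for name in PITCH_CLASS_NAMES}
    return {
        name: counts.get(index, 0) / total
        for index, name in enumerate(PITCH_CLASS_NAMES)
    }


if __name__ == "__main__":
    # Two Cs an octave apart, one E, one G.
    for name, share in pitch_class_histogram([60, 72, 64, 67]).items():
        print(f"{name:>2}: {share:.2f}")
```

Folding octaves together with `% 12` keeps the histogram comparable across performances that sit in different registers; a full 128-bin pitch histogram would simply skip that step.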
Files
midi_hawthorne.zip
Total size: 60.4 MB

MD5 checksum | Size |
---|---|
md5:29b8ef8500f508abde4e5c8d76895f93 | 24.0 MB |
md5:9920ab88ec11757ca0c138478ef27653 | 36.4 MB |
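The MD5 checksums in the file listing can be used to confirm that a download arrived intact. The following is a minimal sketch using only the Python standard library; the helper names `md5_of_file` and `is_listed_download` are illustrative, and because the listing does not show which file name belongs to which checksum, the check matches on the digest alone.

```python
import hashlib

# Checksums copied from the file listing. The listing does not pair them
# with file names, so a download is accepted if it matches either one.
EXPECTED_MD5 = {
    "29b8ef8500f508abde4e5c8d76895f93",
    "9920ab88ec11757ca0c138478ef27653",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_listed_download(path):
    """True if the file's MD5 digest matches one of the listed checksums."""
    return md5_of_file(path) in EXPECTED_MD5
```

For example, `is_listed_download("midi_hawthorne.zip")` should return `True` for an uncorrupted copy, assuming that archive is one of the two listed files.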
Additional details
Funding
- UK Research and Innovation
- UKRI Centre for Doctoral Training in Artificial Intelligence and Music EP/S022694/1