Published September 15, 2023 | Version 1.0.0
Dataset Open

PiJAMA: Piano Jazz with Automatic MIDI Annotations

  • 1. Queen Mary University of London

Description

Release of the automatic MIDI transcriptions that constitute the PiJAMA dataset. See below for the abstract of the publication.

Abstract:

Recent advances in automatic piano transcription have enabled large-scale analysis of piano music in the symbolic domain. However, the research has largely focused on classical piano music. We present PiJAMA (Piano Jazz with Automatic MIDI Annotations): a dataset of over 200 hours of solo jazz piano performances with automatically transcribed MIDI. In total there are 2,777 unique performances by 120 different pianists across 244 recorded albums. The dataset contains a mixture of studio recordings and live performances. We use automatic audio tagging to identify applause, spoken introductions, and other non-piano audio to facilitate downstream music information retrieval tasks. We explore descriptive statistics of the MIDI data, including pitch histograms and chromaticism. We then demonstrate two experimental benchmarks on the data: performer identification and generative modeling. The dataset, including a link to the associated source code, is available at https://almostimplemented.github.io/PiJAMA/.
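
As an illustration of the kind of descriptive statistic the abstract mentions, the following is a minimal sketch of a normalized pitch-class histogram. It is not the authors' code: the function name, the note-name labels, and the example note numbers are assumptions made for illustration, and it expects MIDI note numbers that have already been read from a transcription with a MIDI library of your choice.

```python
from collections import Counter

# Note names for the 12 pitch classes, starting at C (MIDI note 60 is middle C).
PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def pitch_class_histogram(midi_pitches):
    """Return the relative frequency of each pitch class in a list of MIDI note numbers."""
    counts = Counter(pitch % 12 for pitch in midi_pitches)
    total = sum(counts.values())
    if total == 0:
        # No notes at all: report an all-zero histogram instead of dividing by zero.
        return {name: 0.0 for name in PITCH_CLASSES}
    return {name: counts.get(index, 0) / total for index, name in enumerate(PITCH_CLASSES)}


# Hypothetical note numbers, not taken from the dataset.
example_pitches = [60, 64, 67, 72, 63, 70]
histogram = pitch_class_histogram(example_pitches)
```

Folding all octaves onto 12 pitch classes makes performances in different registers comparable; a full 128-bin pitch histogram would use the note numbers directly instead of `pitch % 12`.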

Files

midi_hawthorne.zip

Files (60.4 MB)

Checksum                               Size
md5:29b8ef8500f508abde4e5c8d76895f93   24.0 MB
md5:9920ab88ec11757ca0c138478ef27653   36.4 MB
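
The MD5 checksums listed above can be used to confirm that a download completed without corruption. The sketch below uses only Python's standard `hashlib` module. The helper names are hypothetical, and because this record does not state which checksum belongs to which archive, the check only tests whether a file matches either of the two listed values.

```python
import hashlib

# The two checksums listed in this record (without the "md5:" prefix).
LISTED_MD5 = {
    "29b8ef8500f508abde4e5c8d76895f93",
    "9920ab88ec11757ca0c138478ef27653",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path):
    """Return True if the file's MD5 digest equals one of the listed checksums."""
    return md5_of_file(path) in LISTED_MD5
```

For example, after downloading an archive into the working directory, `matches_listed_checksum("midi_hawthorne.zip")` should return `True` for an intact copy, assuming the file was saved under that name.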

Additional details

Funding

UK Research and Innovation
UKRI Centre for Doctoral Training in Artificial Intelligence and Music EP/S022694/1