Published May 19, 2022 | Version 0.1
Dataset Restricted

Can I Play It? (CIPI) Dataset

  • 1. Music Technology Group - Universitat Pompeu Fabra
  • 2. Sogang University

Description

Can I Play It? (CIPI) dataset from the paper "Combining Piano Performance Dimensions for Score Difficulty Classification"

Overview

Predicting how difficult a musical score is to play is central to structuring and exploring score collections, with significant implications for music education. The automatic difficulty classification of piano scores, however, remains an unsolved challenge, largely because annotated data is scarce and the annotation process is inherently subjective. The "Can I Play It?" (CIPI) dataset addresses this gap by providing a machine-readable collection of piano scores paired with difficulty annotations from the publisher Henle Verlag.

Dataset Creation

The CIPI dataset was assembled by aligning public-domain scores with their corresponding difficulty labels from Henle Verlag. An expert pianist then reviewed and refined this initial pairing to ensure accuracy and reliability. The dataset is structured for easy access and interpretation, making it a useful resource for researchers and educators alike.

Contributions and Findings

Our work makes two primary contributions to score difficulty classification. First, we address the issue of data scarcity by introducing the CIPI dataset to the academic community. Second, we explore several input representations derived from score information, using pre-trained machine learning models for piano fingering and expressiveness. These models are inspired by musicological definitions of performance and capture different aspects of score difficulty.

Through extensive experimentation, we demonstrate that an ensemble approach, which combines the outputs of multiple classifiers, yields better results than any individual classifier. This suggests that the different representations capture different facets of difficulty. Our experiments provide a baseline for future work in score difficulty classification: our best-performing model reports a balanced accuracy of 39.5% and a median square error of 1.1 across the nine difficulty levels introduced in this study.
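To make the reported quantities concrete, the following is a minimal sketch in plain Python. It rests on several assumptions that are not confirmed by this record: difficulty levels are encoded as integers, "median square error" is read as the median of the squared differences between true and predicted levels, and the ensemble is illustrated with simple soft voting (averaging class probabilities). The function names are hypothetical and are not taken from the released code.

```python
from collections import defaultdict
from statistics import median


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so rare difficulty levels weigh as much as common ones."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        if t == p:
            hits[t] += 1
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


def median_squared_error(y_true, y_pred):
    """Median of squared differences between true and predicted levels (assumed reading)."""
    return median((t - p) ** 2 for t, p in zip(y_true, y_pred))


def ensemble_predict(prob_lists):
    """Soft voting (an assumption): average class probabilities, then take the argmax."""
    n_classes = len(prob_lists[0])
    avg = [sum(p[c] for p in prob_lists) / len(prob_lists) for c in range(n_classes)]
    return max(range(n_classes), key=avg.__getitem__)


if __name__ == "__main__":
    # Toy labels only; these are not CIPI data or results.
    y_true = [1, 1, 1, 2]
    y_pred = [1, 1, 2, 2]
    print(balanced_accuracy(y_true, y_pred))
    print(median_squared_error(y_true, y_pred))
    print(ensemble_predict([[0.6, 0.4, 0.0], [0.1, 0.7, 0.2]]))
```

Balanced accuracy is a natural choice when some difficulty levels have far fewer scores than others, because each level contributes equally to the final figure regardless of its size.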

Access and Usage

The CIPI dataset, along with the associated code and models, is made publicly available to ensure reproducibility and to encourage further research in this domain. Users are encouraged to reference this resource in their work and to contribute to its ongoing development.

Citation

Ramoneda, P., Jeong, D., Eremenko, V., Tamer, N. C., Miron, M., & Serra, X. (2024). Combining Piano Performance Dimensions for Score Difficulty Classification. Expert Systems with Applications, 238, 121776. DOI: 10.1016/j.eswa.2023.121776

@article{Ramoneda2024,
  author    = {Pedro Ramoneda and Dasaem Jeong and Vsevolod Eremenko and Nazif Can Tamer and Marius Miron and Xavier Serra},
  title     = {Combining Piano Performance Dimensions for Score Difficulty Classification},
  journal   = {Expert Systems with Applications},
  volume    = {238},
  pages     = {121776},
  year      = {2024},
  doi       = {10.1016/j.eswa.2023.121776},
  url       = {https://doi.org/10.1016/j.eswa.2023.121776}
}

Contact

pedro.ramoneda@upf.edu

xavier.serra@upf.edu

Files

Restricted

The record is publicly accessible, but files are restricted to users with access.

Request access

If you would like to request access to these files, please fill out the form below.

You need to satisfy these conditions in order for this request to be accepted:

Disclaimer

The authors and their institutional affiliations bear no responsibility for the uses of the CIPI Dataset, or for interpretations or inferences based on these uses. Their institutional affiliations accept no liability for indirect, consequential, or incidental damages or losses arising from the use of the CIPI Dataset, or from the unavailability of, or break in access to, the Dataset for whatever reason.

The authors' institutions do not accept any responsibility or liability for data or material contained on third-party sites that reference the information in the CIPI Dataset, or for the use any person makes of such third-party information. The authors' institutions do not monitor this third-party information and make no representations in relation to the quality or accuracy of the information on third-party websites or databases.

Use

The data is available for use and download only for non-profit and academic research purposes.

Note: Please include, in the justification field, your academic affiliation, a brief description of your research topics, and why you would like to use this dataset. If you do not include this information, we cannot approve your request.
