Published July 14, 2016 | Version v1

Experiments of the Paper "MORTY: A Toolbox for Mode Recognition and Tonic Identification"

  • Universitat Pompeu Fabra


This package contains the complete experimental data explained in:

Karakurt, A., Şentürk, S., & Serra, X. (In Press). MORTY: A Toolbox for Mode Recognition and Tonic Identification. 3rd International Digital Libraries for Musicology Workshop.

Please cite the paper above if you use the data in your work.

The zip file includes the folds, features, training and testing data, results and the evaluation files. It is part of the experiments hosted in the GitHub experiments repository, in the folder "./data". We host the experimental data on Zenodo separately due to GitHub's file size limitations.

The files generated from audio recordings are labeled with 36-character-long MusicBrainz IDs ("MBID"s for short). Please refer to MusicBrainz for more information about these unique identifiers. The structure of the data in the zip file is explained below. In the paths given below:

  • task is the computational task ("tonic", "mode" or "joint").
  • training_type is either "single" (a single distribution per mode) or "multi" (multiple distributions per mode).
  • distribution is either "pcd" (pitch class distribution) or "pd" (pitch distribution).
  • bin_size is the bin size of the distribution in cents.
  • kernel_width is the standard deviation of the Gaussian kernel used in smoothing the distribution; a kernel_width of 0 implies no smoothing.
  • distance is either the distance or the dissimilarity metric.
  • num_neighbors is the number of neighbors checked in k-nearest-neighbor classification.
  • min_peak is the minimum peak ratio; it always takes the value 0.15.

For a thorough explanation please refer to the companion page and the paper itself.
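To make the distribution parameters concrete, here is a minimal sketch of how a pitch class distribution with a given bin_size and kernel_width could be computed from pitch values in cents. The function name and the circular-smoothing details are our own illustration, not the toolbox's actual code.

```python
import numpy as np

def pitch_class_distribution(pitch_cents, bin_size=7.5, kernel_width=7.5):
    """Sketch of a PCD: fold pitch values in cents into one octave
    (1200 cents), histogram them with the given bin size, and optionally
    smooth with a Gaussian kernel. kernel_width = 0 means no smoothing."""
    folded = np.mod(pitch_cents, 1200.0)
    edges = np.arange(0.0, 1200.0 + bin_size, bin_size)
    hist, _ = np.histogram(folded, bins=edges)
    hist = hist.astype(float)
    if kernel_width > 0:
        # Circular Gaussian smoothing: distances wrap around the octave.
        centers = edges[:-1] + bin_size / 2.0
        dist = np.abs(centers[:, None] - centers[None, :])
        dist = np.minimum(dist, 1200.0 - dist)
        kernel = np.exp(-0.5 * (dist / kernel_width) ** 2)
        hist = kernel @ hist
    return hist / hist.sum()
```

With bin_size = 7.5 this yields a 160-bin distribution over the octave; a pitch distribution ("pd") would simply skip the octave folding.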
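The min_peak ratio governs which peaks of a distribution are kept as tonic candidates. The sketch below shows one plausible reading of that parameter (keep local maxima whose height is at least min_peak times the highest peak); the toolbox's actual peak-picking may differ.

```python
def tonic_candidates(distribution, min_peak=0.15):
    """Illustrative peak picking: return indices of local maxima (treating
    the distribution as circular) whose height is at least min_peak times
    the height of the tallest peak. Our interpretation of min_peak, not
    the toolbox's exact code."""
    n = len(distribution)
    peaks = [i for i in range(n)
             if distribution[i] > distribution[(i - 1) % n]
             and distribution[i] > distribution[(i + 1) % n]]
    threshold = min_peak * max(distribution[i] for i in peaks)
    return [i for i in peaks if distribution[i] >= threshold]
```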

  • folds.json: Divides the dataset into training and testing sets according to a stratified 10-fold scheme. The annotations are also distributed to the sets accordingly. The file is generated by the Jupyter notebook setup_feature_training.ipynb (4th code block) in the GitHub experiments repository.
  • Features: The path is data/features/[distribution--bin_size--kernel_width]/[MBID--(hist or pdf)].json. "pdf" stands for probability density function, which is used to obtain the multi-distribution models in the training step, and "hist" stands for histogram, which is used to obtain the single-distribution models in the training step. The features are extracted using the Jupyter notebook setup_feature_training.ipynb (5th code block) in the GitHub experiments repository.
  • Training: The path is data/training/[training_type--distribution--bin_size--kernel_width]/fold(0:9).json. There are 10 folds in each folder, each of which stores the training model (the file paths of the distributions for the "multi" training_type, or the distributions themselves for the "single" training_type) trained for that fold using the given parameter set. The training files are generated by the Jupyter notebook setup_feature_training.ipynb (6th code block) in the GitHub experiments repository.
  • Testing: The path is data/testing/[task]/[training_type--distribution--bin_size--kernel_width--distance--num_neighbors--min_peak]. Each path has the folders fold(0:9), which hold the evaluation and results files obtained for each fold. The path also has an overall_eval.json file, which stores the overall evaluation of the experiment. The optimal value of min_peak is selected in the 4th code block, testing is carried out in the 6th code block, and the evaluation is done in the 7th code block of the Jupyter notebook testing_evaluation.ipynb in the GitHub experiments repository.
    The data/testing/ folder also contains a summary of all the experiments in the files data/testing/evaluation_overall.json and data/testing/evaluation_perfold.json. These files are created in MATLAB while running the statistical significance scripts. data/testing/evaluation_perfold.mat is the same as the JSON file with the same filename, stored for fast reading.
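The stratified 10-fold scheme behind folds.json can be sketched as follows: recordings are grouped by their mode label, shuffled, and dealt round-robin into folds so each fold preserves the label proportions. This is an illustration of the idea, not the notebook's actual code, and the real folds.json schema should be inspected in the archive.

```python
import random
from collections import defaultdict

def stratified_folds(items, labels, k=10, seed=42):
    """Deal items into k folds so that each fold keeps roughly the same
    proportion of each label (mode), mirroring a stratified k-fold split.
    Illustrative sketch only."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for item, label in zip(items, labels):
        by_label[label].append(item)
    folds = [[] for _ in range(k)]
    for label, members in by_label.items():
        rng.shuffle(members)
        for i, member in enumerate(members):
            folds[i % k].append(member)
    return folds
```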
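The "single" training type described above merges all training distributions of a mode into one model distribution per mode, while "multi" keeps each recording's distribution separately. A minimal sketch of the "single" case, assuming averaging and renormalization as the merge step (the toolbox may aggregate differently):

```python
import numpy as np

def train_single_distribution(distributions_by_mode):
    """'single' training type sketch: collapse all training distributions
    of a mode into one normalized model distribution per mode by averaging.
    Illustrative only; not the toolbox's exact aggregation."""
    models = {}
    for mode, dists in distributions_by_mode.items():
        merged = np.mean(np.vstack(dists), axis=0)
        models[mode] = merged / merged.sum()
    return models
```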
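In the testing step, the distance and num_neighbors parameters drive a k-nearest-neighbor comparison between a test distribution and the training distributions. The sketch below uses a city-block (L1) dissimilarity as one example; the paper evaluates several metrics, and this is not the toolbox's actual implementation.

```python
import numpy as np
from collections import Counter

def classify_mode(test_dist, training_models, num_neighbors=3):
    """k-NN mode recognition sketch: rank all training distributions by
    their L1 distance to the test distribution and return the majority
    mode among the num_neighbors nearest. Illustrative only."""
    dists = [(np.abs(test_dist - dist).sum(), mode)
             for mode, dist_list in training_models.items()
             for dist in dist_list]
    dists.sort(key=lambda pair: pair[0])
    votes = Counter(mode for _, mode in dists[:num_neighbors])
    return votes.most_common(1)[0][0]
```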

For additional information please contact the authors.

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.


Files (1.0 GB)


Additional details


Funding: COMPMUSIC – Computational models for the discovery of the world's music (grant 267583), European Commission