Dataset Open Access

Experiments of the Paper "MORTY: A Toolbox for Mode Recognition and Tonic Identification"

Sertan Şentürk

JSON-LD Export

  "description": "<p>This package contains the complete experimental data described in:</p>\n\n<blockquote>\n<p>Karakurt, A.,\u00a0\u015eent\u00fcrk, S., &amp;\u00a0Serra, X.\u00a0(In Press).\u00a0MORTY: A Toolbox for Mode Recognition and Tonic Identification.\u00a03rd International Digital Libraries for Musicology Workshop.</p>\n</blockquote>\n\n<p>Please cite the paper above if you use the data in your work.</p>\n\n<p>The zip file includes the folds, features, training and testing data, results, and evaluation files. It corresponds to the folder \"<strong>./data</strong>\" of the experiments repository hosted on GitHub. We host the experimental data separately on Zenodo due to the file size limitations of GitHub.</p>\n\n<p>The files generated from audio recordings are labeled with 36-character MusicBrainz IDs (\"MBID\"s for short). Please check the MusicBrainz documentation for more information about these unique identifiers. The structure of the data in the zip file is explained below. In the paths given below, <em>task</em> is the computational task (\"tonic\", \"mode\" or \"joint\"), <em>training_type</em> is either \"single\" (a single distribution per mode) or \"multi\" (multiple distributions per mode), <em>distribution</em> is either \"pcd\" (pitch class distribution) or \"pd\" (pitch distribution), <em>bin_size</em> is the bin size of the distribution in cents, <em>kernel_width</em> is the standard deviation of the Gaussian kernel used in smoothing the distribution, <em>distance</em> is the distance or dissimilarity metric, <em>num_neighbors</em> is the number of neighbors checked in <em>k</em>-nearest neighbor classification, and <em>min_peak</em> is the minimum peak ratio. A <em>kernel_width</em> of 0 implies no smoothing. 
<em>min_peak</em> always takes the value 0.15. For a thorough explanation, please refer to the companion page and the paper itself.</p>\n\n<ul>\n\t<li><strong>folds.json:</strong> Divides the test dataset into training and testing sets according to a stratified 10-fold scheme. The annotations are distributed to the sets accordingly. The file is generated by the Jupyter notebook <em>setup_feature_training.ipynb</em> (4th code block) in the GitHub experiments repository.</li>\n\t<li><strong>Features:</strong> The path is <strong>data/features/[distribution--bin_size--kernel_width]/[MBID--(hist </strong><em>or </em><strong>pdf)].json</strong>. \"pdf\" stands for probability density function, which is used to obtain the multi-distribution models in the training step, and \"hist\" stands for histogram, which is used to obtain the single-distribution models in the training step. The features are extracted using the Jupyter notebook <em>setup_feature_training.ipynb</em> (5th code block) in the GitHub experiments repository.</li>\n\t<li><strong>Training:</strong> The path is <strong>data/training/[training_type--distribution--bin_size--kernel_width]/fold(0:9).json</strong>. There are 10 folds in each folder, each of which stores the training model (the file paths of the <em>distribution</em>s for the \"multi\" <em>training_type</em>, or the <em>distribution</em>s themselves for the \"single\" <em>training_type</em>) trained for the fold using the parameter set. The training files are generated by the Jupyter notebook <em>setup_feature_training.ipynb</em> (6th code block) in the GitHub experiments repository.</li>\n\t<li><strong>Testing: </strong>The path is <strong>data/testing/[task]/[training_type--distribution--bin_size--kernel_width--distance--num_neighbors--min_peak]</strong>. 
Each path has the folders <strong>fold(0:9)</strong>, which hold the evaluation and results files obtained from each fold. The path also has the <strong>overall_eval.json</strong> file, which stores the overall evaluation of the experiment. The optimal value of <em>min_peak</em> is selected in the 4th code block, testing is carried out in the 6th code block, and the evaluation is done in the 7th code block of the Jupyter notebook <em>testing_evaluation.ipynb</em> in the GitHub experiments repository.<br>\n\tThe <strong>data/testing/</strong> folder also contains a summary of all the experiments in the files <strong>data/testing/evaluation_overall.json</strong> and <strong>data/testing/evaluation_perfold.json</strong>. These files are created in MATLAB while running the statistical significance scripts. <strong>data/testing/evaluation_perfold.mat</strong> holds the same content as the JSON file of the same name and is stored for fast reading.</li>\n</ul>\n\n<p>For additional information, please contact the authors.</p>\n\n<p>This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.</p>", 
  "license": "", 
  "creator": [
    {
      "affiliation": "Universitat Pompeu Fabra", 
      "@type": "Person", 
      "name": "Sertan \u015eent\u00fcrk"
    }
  ], 
  "url": "", 
  "datePublished": "2016-07-14", 
  "keywords": [
    "Ottoman-Turkish makam music", 
    "mode recognition", 
    "tonic identification", 
    "k-nearest neighbors", 
    "pitch class distribution", 
    "open source software"
  ], 
  "@context": "", 
  "distribution": [
    {
      "contentUrl": "", 
      "encodingFormat": "zip", 
      "@type": "DataDownload"
    }
  ], 
  "identifier": "", 
  "@id": "", 
  "@type": "Dataset", 
  "name": "Experiments of the Paper \"MORTY: A Toolbox for Mode Recognition and Tonic Identification\""