Dataset Open Access

Experiments of the Paper "MORTY: A Toolbox for Mode Recognition and Tonic Identification"

Sertan Şentürk

DataCite XML Export

<?xml version='1.0' encoding='utf-8'?>
<resource xmlns:xsi="" xmlns="" xsi:schemaLocation="">
  <identifier identifierType="DOI">10.5281/zenodo.57999</identifier>
      <creatorName>Sertan Şentürk</creatorName>
      <affiliation>Universitat Pompeu Fabra</affiliation>
    <title>Experiments of the Paper "MORTY: A Toolbox for Mode Recognition and Tonic Identification"</title>
    <subject>Ottoman-Turkish makam music</subject>
    <subject>mode recognition</subject>
    <subject>tonic identification</subject>
    <subject>k-nearest neighbors</subject>
    <subject>pitch class distribution</subject>
    <subject>open source software</subject>
    <date dateType="Issued">2016-07-14</date>
  <resourceType resourceTypeGeneral="Dataset"/>
    <alternateIdentifier alternateIdentifierType="url"></alternateIdentifier>
    <relatedIdentifier relatedIdentifierType="URL" relationType="IsPartOf"></relatedIdentifier>
    <relatedIdentifier relatedIdentifierType="URL" relationType="IsPartOf"></relatedIdentifier>
    <rights rightsURI="">Creative Commons Attribution Non Commercial Share Alike 4.0 International</rights>
    <rights rightsURI="info:eu-repo/semantics/openAccess">Open Access</rights>
    <description descriptionType="Abstract">&lt;p&gt;This package contains the complete experimental data explained in:&lt;/p&gt;

&lt;p&gt;Karakurt, A., Şentürk, S., &amp;amp; Serra, X. (In Press). MORTY: A Toolbox for Mode Recognition and Tonic Identification. 3rd International Digital Libraries for Musicology Workshop.&lt;/p&gt;

&lt;p&gt;Please cite the paper above if you use the data in your work.&lt;/p&gt;

&lt;p&gt;The zip file includes the folds, features, training and testing data, results and evaluation files. It is part of the experiments hosted on GitHub, in the folder called "&lt;strong&gt;./data&lt;/strong&gt;". We host the experimental data on Zenodo separately due to the file size limitations on GitHub.&lt;/p&gt;

&lt;p&gt;The files generated from audio recordings are labeled with 36-character MusicBrainz IDs ("MBID"s for short). Please check the MusicBrainz documentation for more information about the unique identifiers. The structure of the data in the zip file is explained below.&lt;/p&gt;

&lt;p&gt;In the paths given below, &lt;em&gt;task&lt;/em&gt; is the computational task ("tonic", "mode" or "joint"), &lt;em&gt;training_type&lt;/em&gt; is either "single" (a single distribution per mode) or "multi" (multiple distributions per mode), &lt;em&gt;distribution&lt;/em&gt; is either "pcd" (pitch class distribution) or "pd" (pitch distribution), &lt;em&gt;bin_size&lt;/em&gt; is the bin size of the distribution in cents, &lt;em&gt;kernel_width&lt;/em&gt; is the standard deviation of the Gaussian kernel used in smoothing the distribution, &lt;em&gt;distance&lt;/em&gt; is the distance or dissimilarity metric, &lt;em&gt;num_neighbors&lt;/em&gt; is the number of neighbors checked in &lt;em&gt;k&lt;/em&gt;-nearest neighbor classification and &lt;em&gt;min_peak&lt;/em&gt; is the minimum peak ratio. A &lt;em&gt;kernel_width&lt;/em&gt; of 0 implies no smoothing. &lt;em&gt;min_peak&lt;/em&gt; always takes the value 0.15. For a thorough explanation please refer to the companion page and the paper itself.&lt;/p&gt;

&lt;ul&gt;
	&lt;li&gt;&lt;strong&gt;folds.json: &lt;/strong&gt;Divides the test dataset into training and testing sets according to a stratified 10-fold scheme. The annotations are distributed to the sets accordingly. The file is generated by the Jupyter notebook &lt;em&gt;setup_feature_training.ipynb&lt;/em&gt; (4th code block) in the GitHub experiments repository.&lt;/li&gt;
	&lt;li&gt;&lt;strong&gt;Features: &lt;/strong&gt;The path is &lt;strong&gt;data/features/[distribution--bin_size--kernel_width]/[MBID--(hist &lt;/strong&gt;&lt;em&gt;or &lt;/em&gt;&lt;strong&gt;pdf)].json&lt;/strong&gt;. "pdf" stands for probability density function, which is used to obtain the multi-distribution models in the training step, and "hist" stands for histogram, which is used to obtain the single-distribution models in the training step. The features are extracted using the Jupyter notebook &lt;em&gt;setup_feature_training.ipynb&lt;/em&gt; (5th code block) in the GitHub experiments repository.&lt;/li&gt;
	&lt;li&gt;&lt;strong&gt;Training: &lt;/strong&gt;The path is &lt;strong&gt;data/training/[training_type--distribution--bin_size--kernel_width]/fold(0:9).json&lt;/strong&gt;. There are 10 folds in each folder, each of which stores the training model (the file paths of the &lt;em&gt;distribution&lt;/em&gt;s for the "multi" &lt;em&gt;training_type&lt;/em&gt;, or the &lt;em&gt;distribution&lt;/em&gt;s themselves for the "single" &lt;em&gt;training_type&lt;/em&gt;) trained for the fold using the parameter set. The training files are generated by the Jupyter notebook &lt;em&gt;setup_feature_training.ipynb&lt;/em&gt; (6th code block) in the GitHub experiments repository.&lt;/li&gt;
	&lt;li&gt;&lt;strong&gt;Testing: &lt;/strong&gt;The path is &lt;strong&gt;data/testing/[task]/[training_type--distribution--bin_size--kernel_width--distance--num_neighbors--min_peak]&lt;/strong&gt;. Each path has the folders &lt;strong&gt;fold(0:9)&lt;/strong&gt;, which hold the evaluation and results files obtained from each fold. The path also has the &lt;strong&gt;overall_eval.json&lt;/strong&gt; file, which stores the overall evaluation of the experiment. In the Jupyter notebook &lt;em&gt;testing_evaluation.ipynb&lt;/em&gt; in the GitHub experiments repository, the optimal value of &lt;em&gt;min_peak&lt;/em&gt; is selected in the 4th code block, testing is carried out in the 6th code block and the evaluation is done in the 7th code block.&lt;br&gt;
	The &lt;strong&gt;data/testing/&lt;/strong&gt; folder also contains a summary of all the experiments in the files &lt;strong&gt;data/testing/evaluation_overall.json&lt;/strong&gt; and &lt;strong&gt;data/testing/evaluation_perfold.json&lt;/strong&gt;. These files are created in MATLAB while running the statistical significance scripts. &lt;strong&gt;data/testing/evaluation_perfold.mat&lt;/strong&gt; contains the same data as the JSON file of the same name and is stored for fast reading.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For additional information please contact the authors.&lt;/p&gt;

&lt;p&gt;This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.&lt;/p&gt;</description>
      <funderName>European Commission</funderName>
      <funderIdentifier funderIdentifierType="Crossref Funder ID">10.13039/501100000780</funderIdentifier>
      <awardNumber awardURI="info:eu-repo/grantAgreement/EC/FP7/267583/">267583</awardNumber>
      <awardTitle>Computational models for the discovery of the world's music</awardTitle>
</resource>