Published May 31, 2021 | Version dlfm2016-fix1
Dataset Open

MTG/otmm_makam_recognition_dataset: Ottoman-Turkish Makam Music Makam Recognition Dataset

  • 1. Kobalt Music, @CompMusic
  • 2. Avid Technology, formerly @MTG

Description

OTMM Makam Recognition Dataset

This repository hosts the dataset designed to test makam recognition methodologies on Ottoman-Turkish makam music. It is composed of 50 recordings from each of the 20 most common makams in the CompMusic Project's Dunya Ottoman-Turkish Makam Music collection (1000 recordings in total). At the time of publication, it is the largest makam recognition dataset.

Please cite the publication below if you use this dataset in your work:

Karakurt, A., Şentürk, S., & Serra, X. (2016). MORTY: A Toolbox for Mode Recognition and Tonic Identification. 3rd International Digital Libraries for Musicology Workshop. New York, USA.

The recordings are carefully selected from commercial recordings so that they cover diverse musical forms, vocal/instrumentation settings, and recording qualities (e.g. historical vs. contemporary). Each recording in the dataset is identified by a 36-character unique identifier called an MBID, hosted in MusicBrainz. The makam and the tonic of each recording are annotated in the file annotations.json.
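As a rough illustration, the composition described above (20 makams, 50 recordings each) could be checked from the annotations with a few lines of Python. This is only a sketch: it assumes that annotations.json holds a JSON list of objects and that each object has a "makam" field. Neither the layout nor the key name is stated in this description, so verify them against the actual file.

```python
import json
from collections import Counter


def load_annotations(path):
    """Load the annotation file (assumed here to be a JSON list of objects)."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def recordings_per_makam(annotations, makam_key="makam"):
    """Count recordings per makam; the "makam" key name is an assumption."""
    return Counter(entry[makam_key] for entry in annotations)


def is_balanced(annotations, expected_makams=20, expected_per_makam=50):
    """Check the composition stated in the description: 20 makams x 50 recordings."""
    counts = recordings_per_makam(annotations)
    return (len(counts) == expected_makams
            and all(n == expected_per_makam for n in counts.values()))
```

A call such as is_balanced(load_annotations("annotations.json")) would then be expected to return True if the assumed layout holds.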

The audio-related data in the test dataset is organized by makam in the folder data. Due to copyright reasons, we are unable to distribute the audio. Instead, we provide the predominant melody of each recording, computed by a state-of-the-art predominant melody extraction algorithm optimized for OTMM culture. These features are saved as text files (with the paths data/[makam]/[mbid].pitch) of a single column that contains the frequency values. The timestamps are removed to reduce the file sizes. The step size of the pitch track is approximately 0.0029 seconds (a hop size of 128 samples for an mp3 with a 44100 Hz sample rate), from which the timestamps of the samples can be recomputed.
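The timestamp recomputation mentioned above amounts to multiplying the sample index by 128/44100 seconds. A minimal sketch, assuming that the first pitch value corresponds to time 0 and that each non-empty line of a .pitch file holds one numeric value (the function names are illustrative and not part of the dataset or of MORTY):

```python
from pathlib import Path

HOP_SIZE = 128        # samples between consecutive pitch values (from the description)
SAMPLE_RATE = 44100   # Hz (from the description)


def parse_pitch_text(text):
    """Parse single-column pitch data into a list of floats, skipping blank lines."""
    return [float(line) for line in text.splitlines() if line.strip()]


def load_pitch_track(path):
    """Read a data/[makam]/[mbid].pitch file into a list of frequency values."""
    return parse_pitch_text(Path(path).read_text())


def recompute_timestamps(num_samples, hop_size=HOP_SIZE, sample_rate=SAMPLE_RATE):
    """Return the time in seconds of each sample, assuming the first one is at 0."""
    return [i * hop_size / sample_rate for i in range(num_samples)]
```

Pairing the two results, e.g. list(zip(recompute_timestamps(len(pitch)), pitch)), gives a two-column time/frequency track. If the extraction algorithm places the first frame at an offset (for instance, half an analysis window), a constant would need to be added to every timestamp.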

Moreover, the metadata of each recording is available in the repository, crawled from MusicBrainz using an open source tool developed by us. The metadata files are saved as data/[makam]/[mbid].json.
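Given the path patterns data/[makam]/[mbid].pitch and data/[makam]/[mbid].json, the files belonging to one recording can be located as sketched below. The helper names are hypothetical. The sketch assumes that the makam folder name is passed in exactly as it appears on disk, and it tolerates an MBID given as a URL ending in the identifier, which is a convenience assumption rather than something stated here.

```python
import uuid
from pathlib import Path


def normalize_mbid(mbid):
    """Return the MBID in canonical 36-character form.

    Accepts a bare MBID or a URL ending with one (an assumption about how
    identifiers might be stored). Raises ValueError if it is not a valid UUID.
    """
    candidate = mbid.rstrip("/").rsplit("/", 1)[-1]
    return str(uuid.UUID(candidate))


def recording_paths(root, makam, mbid):
    """Build the pitch and metadata paths for one recording."""
    base = Path(root) / "data" / makam
    mbid = normalize_mbid(mbid)
    return {
        "pitch": base / f"{mbid}.pitch",
        "metadata": base / f"{mbid}.json",
    }
```

The returned paths are not checked for existence, so callers may want to test them with Path.exists() before reading.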

For reproducibility purposes, we note the versions of all the tools we used to generate this dataset in the file algorithms.json.

A complementary toolbox for this dataset is MORTY, which is a mode recognition and tonic identification toolbox. It can be used and optimized for any modal music culture. Further details are explained in the publication above.

For more information, please contact the authors.

Errata

  • April 2020: We replaced 2 recordings, which do not exist in the CompMusic Dunya makam corpus, with their instrumental versions. We also patched the dunya_uid of a recording, which is a redirection to the MusicBrainz ID. None of the annotations have changed. (PR #1)
  • November 2016: We discovered several discrepancies in the tonic annotations while merging human and machine annotations to create the otmm_tonic_dataset. Please refer to the repo for further explanation. We advise using the tonic annotations in otmm_tonic_dataset.

License

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

Files

MTG/otmm_makam_recognition_dataset-dlfm2016-fix1.zip

Files (101.0 MB)