Saraga: research datasets of Indian Art Music
Music Technology Group, Universitat Pompeu Fabra
Description
Dataset introduction
This repository contains time-aligned melody, rhythm, and structural annotations for two large open corpora of Indian Art Music (Carnatic and Hindustani music).
The Carnatic and Hindustani collections are provided as separate zip files, and each collection is organized into songs grouped by artist concert or live performance. This organization matches the structure produced when downloading the data with the scripts available in the dataset's GitHub repository: https://github.com/MTG/saraga.
In addition, 168 tracks of the Carnatic collection include multitrack audio files alongside the mixed audio. The instruments covered are Ghatam, Mridangam, Violin, Voice, and Secondary Voice.
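To get an overview of an extracted collection, the files can be grouped by the folder that contains them. The sketch below is a minimal illustration, not part of the dataset tooling: the function name `group_files_by_folder` is hypothetical, and the exact folder names and file extensions inside the zip files are not specified here, so the code assumes only that the files sit in nested folders.

```python
from collections import defaultdict
from pathlib import Path


def group_files_by_folder(root):
    """Map each folder under `root` (as a relative POSIX path) to its sorted file names.

    Assumption: the collection has been unzipped under `root`; no particular
    folder naming scheme is required.
    """
    root = Path(root)
    groups = defaultdict(list)
    for path in root.rglob("*"):
        if path.is_file():
            folder = path.parent.relative_to(root).as_posix()
            groups[folder].append(path.name)
    return {folder: sorted(names) for folder, names in sorted(groups.items())}
```

Calling `group_files_by_folder` on the folder where a collection was extracted returns a dictionary that can be inspected to see which audio and annotation files belong to each song.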
Annotations in the dataset
- Section and tempo annotations, stored as start and end timestamps together with the name of the section and the tempo during the section (in a separate file).
- Sama annotations, referring to rhythmic cycle boundaries, stored as timestamps.
- Phrase annotations, stored as timestamps together with a transcription of the phrases using solfège symbols ({S, r, R, g, G, m, M, P, d, D, n, N}).
- Automatically extracted audio features: pitch and tonic.
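As a rough illustration of how timestamp annotations such as sama could be used, the sketch below derives rhythmic cycle durations from consecutive boundaries and checks a phrase transcription against the symbol set listed above. The on-disk file layout is an assumption (one timestamp in seconds at the start of each non-empty line), and all function names are hypothetical.

```python
# Symbols listed in the annotation description: S, r, R, g, G, m, M, P, d, D, n, N.
SVARA_SYMBOLS = set("SrRgGmMPdDnN")


def parse_timestamps(text):
    """Parse the leading timestamp (seconds) of every non-empty line.

    Assumption: values are separated by whitespace or commas, and the
    timestamp is the first value on the line.
    """
    stamps = []
    for line in text.splitlines():
        fields = line.replace(",", " ").split()
        if fields:
            stamps.append(float(fields[0]))
    return stamps


def cycle_durations(sama_times):
    """Return the duration between each pair of consecutive sama boundaries."""
    return [later - earlier for earlier, later in zip(sama_times, sama_times[1:])]


def uses_only_listed_symbols(phrase):
    """Check that a phrase transcription contains only the listed symbols (whitespace ignored)."""
    return all(ch in SVARA_SYMBOLS for ch in phrase if not ch.isspace())
```

For example, `cycle_durations(parse_timestamps(open(path).read()))` would give the length of each rhythmic cycle for a sama file at a hypothetical `path`, provided the file matches the assumed layout.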
For more information about the dataset tracks and annotations, please refer to the Saraga website: https://mtg.github.io/saraga/
Using this dataset
We are interested in knowing if you find our datasets useful! If you use our dataset, please email us at mtg-info@upf.edu and tell us about your research.
Please note that you can also use this dataset through the mirdata library (https://github.com/mir-dataset-loaders/mirdata), which includes it in its list of available datasets.
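A minimal sketch of loading the Carnatic collection through mirdata follows. It assumes mirdata is installed and that the loader is registered under the name `saraga_carnatic`, as it is in recent mirdata releases; the wrapper function `load_saraga_carnatic` is hypothetical, and the download is off by default because the archive is several gigabytes.

```python
def load_saraga_carnatic(data_home=None, download=False):
    """Return a mirdata dataset object for the Carnatic collection.

    Assumptions: mirdata is installed (third-party package) and exposes the
    loader under the name "saraga_carnatic".
    """
    import mirdata  # imported lazily so this module loads without mirdata

    dataset = mirdata.initialize("saraga_carnatic", data_home=data_home)
    if download:
        dataset.download()  # fetches the full archive; this is large
        dataset.validate()  # compares local files against mirdata's index
    return dataset
```

After `dataset = load_saraga_carnatic(download=True)`, a call such as `dataset.choice_track()` should return a random track whose annotations can then be inspected; the mirdata documentation lists the attributes available on each track.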
Files (18.5 GB)
Checksum | Size |
---|---|
md5:e4fcd380b4f6d025964cd16aee00273d | 14.4 GB |
md5:ea9ed2885ea37a1b10e42f60cf299702 | 4.1 GB |
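The md5 values listed with the files can be used to confirm that a download completed without corruption. The following is a minimal sketch using Python's standard `hashlib`; the function names are hypothetical, and the caller has to supply both the path of the downloaded zip file and the checksum that belongs to it.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file against a checksum given as 'md5:<hex>' or as a bare hex string."""
    expected_hex = expected.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected_hex
```

A result of `False` from `matches_checksum` suggests the file should be downloaded again before it is extracted.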