Cross-cultural music corpus: The Expanded Natural History of Song Discography
Creators
- Bertolo, Mila (1)
- Snarskis, Martynas (2)
- Kyritsis, Thanos (3)
- Yurdum, Lidya (4)
- Bainbridge, Constance (5)
- Atwood, S. (6)
- Hilton, Courtney (2)
- Keomurjian, Anya
- Lee, Judy S.
- Mackiel, Alexander (7)
- Mak, Vanessa (8)
- Shin, Mijoo (9)
- Bitran, Alma (10)
- Shilton, Dor (11)
- Delasanta, Lana (12)
- Do, Hang (Heather) (13)
- Lang, Jenna
- Irani, Tenaaz (14)
- Kangatharan, Jayanthiny
- Lafleur, Kevin
- Malko, Nashua
- Atkinson, Quentin (2)
- Singh, Manvir (15)
- Mehr, Samuel (2)
- 1. McGill University
- 2. University of Auckland
- 3. The University of Auckland
- 4. University of Amsterdam
- 5. University of California, Los Angeles
- 6. Princeton University
- 7. University of Chicago
- 8. University of British Columbia
- 9. Harvard University
- 10. Rutgers, The State University of New Jersey
- 11. Tel Aviv University
- 12. University of Connecticut
- 13. City University of New York
- 14. University of Ottawa
- 15. UC Davis
Description
This repository hosts the Expanded Natural History of Song Discography. It contains 1007 audio recordings of vocal music gathered from many human societies, each annotated with a world region, language, and behavioural context.
Each song file contains a 10-second excerpt of the source audio, selected at random from the portions of the recording in which a singer is audible. Because each excerpt is short and the files are intended for research use only, they have been made available under Fair Use.
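The excerpt selection described above can be illustrated with a minimal sketch. This is not the authors' procedure, which is not documented here; it only shows one way to draw a random 10-second window from stretches with an audible singer. The function name `choose_excerpt` and the representation of those stretches as `(start, end)` times in seconds are assumptions made for illustration.

```python
import random


def choose_excerpt(voiced, length=10.0, rng=None):
    """Pick a random window of `length` seconds inside one voiced interval.

    `voiced` is a list of (start, end) times in seconds (hypothetical format).
    Returns the (start, end) of the chosen window.
    """
    rng = rng or random.Random()
    # Only intervals at least as long as the excerpt can contain it.
    candidates = [(s, e) for s, e in voiced if e - s >= length]
    if not candidates:
        raise ValueError("no voiced interval is long enough for the excerpt")
    # Each interval offers (e - s - length) seconds of valid start times;
    # weighting by that amount makes every valid start time equally likely.
    slack = [e - s - length for s, e in candidates]
    if sum(slack) == 0:
        # Every candidate fits the excerpt exactly, so pick one of them.
        s, e = rng.choice(candidates)
        return s, s + length
    s, e = rng.choices(candidates, weights=slack)[0]
    start = s + rng.uniform(0, e - s - length)
    return start, start + length
```

Passing a seeded `random.Random` as `rng` makes the draw reproducible.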
NHS2-songs.zip contains the audio files in MP3 format, volume-matched and with a 1 s fade-in and fade-out added. These can be analysed as-is or used in experiments.
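Before analysis, it may be worth checking that the archive and the metadata agree. The sketch below compares the MP3 names in a zip file with a list of song identifiers. The helper names are hypothetical, and the internal folder layout of the archive is not described in this record, so the comparison uses base file names only.

```python
import zipfile
from pathlib import PurePosixPath


def audio_names(zip_source):
    """Return the base names of all .mp3 members in the archive."""
    with zipfile.ZipFile(zip_source) as zf:
        return {
            PurePosixPath(name).name
            for name in zf.namelist()
            if name.lower().endswith(".mp3")
        }


def compare_with_metadata(zip_source, metadata_songs):
    """Return (listed but not in zip, in zip but not listed), both sorted."""
    in_zip = audio_names(zip_source)
    listed = set(metadata_songs)
    return sorted(listed - in_zip), sorted(in_zip - listed)
```

`zip_source` can be a path such as `"NHS2-songs.zip"` or an open binary file object; two empty lists mean the archive and the metadata name the same set of songs.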
NHS2-metadata.csv contains the annotations, with one row per song and four columns: song, a unique identifier for each song in the format `NHS2-XXXX.mp3`; region, the approximate geographical location where the song was recorded, using Human Relations Area Files categories (see https://ehrafworldcultures.yale.edu); glottocode, the language in which the song is sung (see https://glottolog.org); and type, the behavioural context in which the song was produced, drawn from a set of 10 categories (dance, healing, love, lullaby, play, procession, mourning, work, story, and praise).
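As a minimal sketch of working with the metadata, the code below reads the four columns with Python's standard `csv` module and counts songs per behavioural context. It assumes that the file has a header row using exactly the column names given above and that `type` values are spelled in lowercase as listed; neither detail is confirmed here, and the function names are illustrative.

```python
import csv
from collections import Counter

EXPECTED_COLUMNS = ["song", "region", "glottocode", "type"]
SONG_TYPES = {
    "dance", "healing", "love", "lullaby", "play",
    "procession", "mourning", "work", "story", "praise",
}


def load_metadata(handle):
    """Read metadata rows from an open text file and validate them."""
    reader = csv.DictReader(handle)
    header = reader.fieldnames or []
    missing = [col for col in EXPECTED_COLUMNS if col not in header]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    rows = list(reader)
    # Flag any context label outside the 10 documented categories.
    unknown = sorted({row["type"] for row in rows} - SONG_TYPES)
    if unknown:
        raise ValueError(f"unexpected song types: {unknown}")
    return rows


def songs_per_type(rows):
    """Count how many songs fall into each behavioural context."""
    return Counter(row["type"] for row in rows)


# Example use, assuming the file sits in the working directory:
# with open("NHS2-metadata.csv", newline="", encoding="utf-8") as f:
#     print(songs_per_type(load_metadata(f)))
```

Raising an error on unknown labels, instead of silently skipping them, makes any mismatch between these assumptions and the real file visible straight away.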
For assistance with the corpus, contact Martynas Snarskis (martysnarskis@gmail.com), Mila Bertolo (mila.bertolo@mail.mcgill.ca), and Samuel Mehr (mehr@hey.com).
Further information about the construction of this corpus will be made available in a forthcoming paper; we will update this Zenodo archive when the paper is publicly available.
Files
(235.8 MB in total)
Name | Size | MD5 |
---|---|---|
NHS2-metadata.csv | 41.9 kB | 31f4e51f88b25b05a0bc940bb1a376a8 |
NHS2-songs.zip | 235.8 MB | 82c2750225088bf7903ed520a63896f1 |
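The MD5 checksums listed above can be used to confirm that a download is intact. The following is a minimal sketch using Python's standard `hashlib`; the function names are illustrative, and the check only asks whether a file's digest matches either listed checksum, without assuming which checksum belongs to which file.

```python
import hashlib

# Checksums copied from the file listing above.
LISTED_MD5 = {
    "31f4e51f88b25b05a0bc940bb1a376a8",
    "82c2750225088bf7903ed520a63896f1",
}


def md5_of(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Chunked reads keep memory use flat for the large audio archive.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path):
    """Return True if the file's MD5 equals one of the listed checksums."""
    return md5_of(path) in LISTED_MD5
```

For example, `matches_listing("NHS2-songs.zip")` returning `False` would suggest an incomplete or corrupted download.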