Published August 8, 2023 | Version v5
Video/Audio Open

Cross-cultural music corpus: The Expanded Natural History of Song Discography

  • 1. McGill University
  • 2. University of Auckland
  • 3. The University of Auckland
  • 4. University of Amsterdam
  • 5. University of California, Los Angeles
  • 6. Princeton University
  • 7. University of Chicago
  • 8. University of British Columbia
  • 9. Harvard University
  • 10. Rutgers, The State University of New Jersey
  • 11. Tel Aviv University
  • 12. University of Connecticut
  • 13. City University of New York
  • 14. University of Ottawa
  • 15. UC Davis

Description

This repository hosts the Expanded Natural History of Song Discography. It contains 1007 audio recordings of vocal music gathered from many human societies, each annotated with a world region, language, and behavioural context. 

Each song file contains a 10-second excerpt of the source audio, selected at random from the portions of the recording that contain an audible singer. Given the short length of each excerpt and the intended use of these files for research purposes only, they have been made available under Fair Use.
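The excerpt selection described above could be sketched as follows. This is a minimal illustration and not the procedure actually used to build the corpus, which this record does not describe. It assumes that the portions with an audible singer are available as a list of (start, end) times in seconds, and that an excerpt must lie entirely within one such portion; the function name `pick_excerpt` and the interval values are hypothetical.

```python
import random

EXCERPT_SECONDS = 10.0  # excerpt length stated in the description


def pick_excerpt(vocal_intervals, length=EXCERPT_SECONDS, rng=None):
    """Pick a random (start, end) window of `length` seconds that lies
    entirely inside one of the given vocal intervals.

    `vocal_intervals` is a hypothetical list of (start, end) times in
    seconds marking where a singer is audible.
    """
    rng = rng or random.Random()
    # An interval [s, e] admits window start times in [s, e - length].
    candidates = [(s, e - length) for s, e in vocal_intervals if e - s >= length]
    if not candidates:
        raise ValueError("no vocal interval is long enough for an excerpt")
    # Weight each interval by its range of valid start times, so that every
    # valid start time is equally likely overall.
    weights = [hi - lo for lo, hi in candidates]
    if sum(weights) > 0:
        lo, hi = rng.choices(candidates, weights=weights)[0]
    else:
        # Every candidate interval is exactly `length` seconds long.
        lo, hi = rng.choice(candidates)
    start = rng.uniform(lo, hi)
    return start, start + length


# Hypothetical annotation: singing at 0-4 s, 20-45 s, and 60-70 s.
EXAMPLE_INTERVALS = [(0.0, 4.0), (20.0, 45.0), (60.0, 70.0)]
```

Calling `pick_excerpt(EXAMPLE_INTERVALS, rng=random.Random(0))` returns a reproducible window; the first interval is never chosen because it is shorter than 10 seconds.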

NHS2-songs.zip contains the audio files in MP3 format, volume-matched and with a 1-second fade in/out added. These can be analysed as-is or used in experiments.

NHS2-metadata.csv contains the annotations, with one row per song. It has four columns: song, a unique identifier for each song in the format `NHS2-XXXX.mp3`; region, the approximate geographical location where the song was recorded, using Human Relations Area Files categories (see https://ehrafworldcultures.yale.edu); glottocode, the language in which the song is sung (see https://glottolog.org); and type, the behavioural context in which the song was produced, drawn from a set of 10 categories (dance, healing, love, lullaby, play, procession, mourning, work, story, and praise).
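As a rough illustration of working with the metadata, the sketch below reads the four columns with Python's standard `csv` module and counts songs per behavioural context. It assumes the header names match the column names given above exactly and that the type values are written in lowercase as listed; neither has been checked against the file. The rows in `SAMPLE_CSV` are invented placeholders, not real corpus entries.

```python
import csv
import io
from collections import Counter

# Column names and categories as given in the description (assumed to match
# the CSV header and cell values exactly).
EXPECTED_COLUMNS = ["song", "region", "glottocode", "type"]
SONG_TYPES = {
    "dance", "healing", "love", "lullaby", "play",
    "procession", "mourning", "work", "story", "praise",
}


def load_metadata(handle):
    """Read metadata rows from an open text file and validate them."""
    reader = csv.DictReader(handle)
    missing = [c for c in EXPECTED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    rows = []
    for row in reader:
        if row["type"] not in SONG_TYPES:
            raise ValueError(f"unexpected type {row['type']!r} for {row['song']}")
        rows.append(row)
    return rows


def count_by_type(rows):
    """Count how many songs fall into each behavioural context."""
    return Counter(row["type"] for row in rows)


# Invented placeholder rows for demonstration only.
SAMPLE_CSV = (
    "song,region,glottocode,type\n"
    "NHS2-AAAA.mp3,Region A,abcd1234,lullaby\n"
    "NHS2-BBBB.mp3,Region B,efgh5678,dance\n"
    "NHS2-CCCC.mp3,Region A,abcd1234,lullaby\n"
)

sample_rows = load_metadata(io.StringIO(SAMPLE_CSV))
sample_counts = count_by_type(sample_rows)
```

For the real file, the same function can be called as `load_metadata(open("NHS2-metadata.csv", newline="", encoding="utf-8"))`, assuming the file is UTF-8 encoded.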

For assistance with the corpus, contact Martynas Snarskis (martysnarskis@gmail.com), Mila Bertolo (mila.bertolo@mail.mcgill.ca), and Samuel Mehr (mehr@hey.com).

Further information about the construction of this corpus will be made available in a forthcoming paper; we will update this Zenodo archive when the paper is publicly available.

Notes

Version 5 updates the audio selection for one of the songs (NHS2-E2SX), which previously did not include vocals.

Files

Total size: 235.8 MB

  • NHS2-metadata.csv (41.9 kB), md5:31f4e51f88b25b05a0bc940bb1a376a8
  • NHS2-songs.zip (235.8 MB), md5:82c2750225088bf7903ed520a63896f1
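
The MD5 checksums listed above can be used to confirm that a download is intact. The following is a minimal sketch using Python's standard `hashlib` module. The pairing of each checksum with a file name is an assumption based on the listed file sizes, and the helper names `md5_of_file` and `verify_download` are illustrative.

```python
import hashlib

# Checksums from the file listing; the file-name pairing is inferred from
# the listed sizes (41.9 kB and 235.8 MB) and is an assumption.
EXPECTED_MD5 = {
    "NHS2-metadata.csv": "31f4e51f88b25b05a0bc940bb1a376a8",
    "NHS2-songs.zip": "82c2750225088bf7903ed520a63896f1",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks so that
    large archives do not need to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_md5):
    """Return True if the file at `path` has the expected MD5 digest."""
    return md5_of_file(path) == expected_md5.lower()
```

For example, `verify_download("NHS2-songs.zip", EXPECTED_MD5["NHS2-songs.zip"])` should return `True` for an uncorrupted download, provided the pairing assumed above is correct.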