Published April 11, 2020 | Version v1
Video/Audio Open

Human vocalization corpus: recordings of infant-directed and adult-directed speech and song in 21 societies

  • 1. University of Auckland
  • 2. University of California at Merced
  • 3. University of Auckland and Yale Child Study Center

Description

This repository contains a corpus of 1615 audio recordings of speech and song collected in 21 societies, first reported in Moser et al. (2020; bioRxiv) and later published in Hilton & Moser et al. (2022; Nature Human Behaviour). For assistance with these materials, contact Cody Moser (cmoser2@ucmerced.edu), Courtney Hilton (courtney.hilton@auckland.ac.nz), or Samuel Mehr (mehr@hey.com).

Two versions of the audio are included: raw audio (`IDS-corpus-raw.zip`) and audio that was edited to prepare the recordings for automatic acoustic feature extraction (`IDS-corpus-edited.zip`). `IDS-textGrids.zip` contains annotation files from Praat's silence detection method, which were manually reviewed for accuracy. These files are used with the audio extraction scripts associated with the project (see code linked in paper) to build the edited audio files.

`IDS-fieldsites.csv` contains some fieldsite-level metadata; additional metadata is in the Supplementary Information of the paper.

In the two .zip archives, filenames have the format XXXYYZ.wav, where "XXX" is a fieldsite code, "YY" is a participant number, and "Z" is a vocalization type.

Fieldsite codes are:

MBE: Mbendjele BaYaka
HAD: Hadza
NYA: Nyangatom
TOP: Toposa
BEJ: Beijing
JEN: Jenu Kurubas
MEN: Mentawai Islanders
KRA: Krakow
LIM: Rural Poland
TUR: Turku
USD: San Diego
TOR: Toronto
VAN: Tannese Vanuatuans
PNG: Enga
WEL: Wellington
ARA: Arawak
TSI: Tsimane
SPA: Sápara & Achuar
QUE: Quechua
ACO: Afrocolombians
MES: Colombian Mestizos

Participant numbers are padded integers, starting with 01, and are unique within fieldsites.

Vocalization types are:

A: infant-directed song
B: infant-directed speech
C: adult-directed song
D: adult-directed speech 
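
The filename convention above can be decoded programmatically. The following is a minimal sketch, not part of the project's own code: the function name `parse_filename`, the `Recording` type, and the example filename `WEL01A.wav` are hypothetical, and the pattern assumes exactly two digits for the participant number and uppercase codes, as the "XXXYYZ.wav" format suggests.

```python
import posixpath
import re
from typing import NamedTuple

# Fieldsite codes and vocalization types, copied from the lists above.
FIELDSITES = {
    "MBE": "Mbendjele BaYaka", "HAD": "Hadza", "NYA": "Nyangatom",
    "TOP": "Toposa", "BEJ": "Beijing", "JEN": "Jenu Kurubas",
    "MEN": "Mentawai Islanders", "KRA": "Krakow", "LIM": "Rural Poland",
    "TUR": "Turku", "USD": "San Diego", "TOR": "Toronto",
    "VAN": "Tannese Vanuatuans", "PNG": "Enga", "WEL": "Wellington",
    "ARA": "Arawak", "TSI": "Tsimane", "SPA": "Sápara & Achuar",
    "QUE": "Quechua", "ACO": "Afrocolombians", "MES": "Colombian Mestizos",
}
VOCALIZATION_TYPES = {
    "A": "infant-directed song",
    "B": "infant-directed speech",
    "C": "adult-directed song",
    "D": "adult-directed speech",
}

# Assumption: XXX = 3 uppercase letters, YY = 2 digits, Z = one of A-D.
FILENAME_RE = re.compile(r"^([A-Z]{3})(\d{2})([A-D])\.wav$")


class Recording(NamedTuple):
    fieldsite_code: str
    fieldsite: str
    participant: int
    vocalization: str


def parse_filename(path: str) -> Recording:
    """Decode an XXXYYZ.wav filename; any leading directories are ignored."""
    name = posixpath.basename(path)
    match = FILENAME_RE.match(name)
    if match is None:
        raise ValueError(f"not in XXXYYZ.wav format: {name!r}")
    code, participant, voc = match.groups()
    if code not in FIELDSITES:
        raise ValueError(f"unknown fieldsite code: {code!r}")
    return Recording(code, FIELDSITES[code], int(participant),
                     VOCALIZATION_TYPES[voc])


# Hypothetical filename, used only to illustrate the format.
rec = parse_filename("WEL01A.wav")
print(rec.fieldsite, rec.participant, rec.vocalization)
```

Because participant numbers are unique only within fieldsites, the pair of fieldsite code and participant number, rather than the participant number alone, identifies a participant across the corpus.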

In a few cases, participants vocalized in a different language than was expected, given the primary language of their fieldsite (e.g., when the participant was multilingual, or if they sang a song that contains multiple languages, as in The Beatles' "Michelle"). The file `IDS-unexpectedLanguages.csv` at https://github.com/themusiclab/infant-speech-song/blob/main/data/IDS-unexpectedLanguages.csv contains an inventory of these examples from the English-speaking fieldsites. This issue affects only a small minority of the recordings, because the researchers collecting them typically avoided it.

Files

Files (11.3 GB)

Checksum Size
md5:143cfcde407b123635b1fbe45b1fabe6 5.8 GB
md5:b565689eaf3c5fb03a6db2f63cd9c0eb 5.6 GB
md5:6b6eba7e881e4ee1fa942f490f2b899a 1.1 kB
md5:3eb004a3d5ce1dd897f37e9b55f329ce 871.2 kB
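
The md5 values listed above can be used to check that a download completed without corruption. A minimal sketch using Python's standard `hashlib` follows; the function names `md5_of_file` and `matches_listed_checksum` are hypothetical, and the file path and checksum to compare are left to the caller, since the listing here does not pair checksums with file names.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks so that
    multi-gigabyte archives do not need to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path: str, listed: str) -> bool:
    """Compare a file against a checksum written as 'md5:<hex>' or '<hex>'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

For example, `matches_listed_checksum(local_path, "md5:...")` returns `True` only when the downloaded file at `local_path` has the listed digest; a `False` result suggests the download should be repeated.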