# Vanuatu Voices

## How to cite

If you use these data, please cite:
- the original source
  > Lana Takau, Tom Fitzpatrick, Mary Walworth, Aviva Shimelman, Sandrine Bessis, Tom Ennever, Iveth Rodriguez, Hans-Jörg Bibiko, Daria Dërmaku, Murray Garde, Marie-France Duhamel, Giovanni Abete, Laura Wägerle, Kaitip W. Kami, Tihomir Rangelov, & Russell Gray. (2025). Vanuatu Voices (v1.4.1) [Data set]. Zenodo. https://doi.org/10.5281/zenodo.4309140
- the derived dataset, using the DOI of the [particular released version](../../releases/) you used

## Description


Vanuatu Voices presents phonetically transcribed primary recordings from numerous villages across different islands, to document and exhibit the extensive variation and unparalleled diversity of the Vanuatu languages.

This dataset is licensed under CC-BY-NC-4.0.

Available online at https://vanuatuvoices.clld.org

## Statistics


[![CLDF validation](https://github.com/lexibank/vanuatuvoices/workflows/CLDF-validation/badge.svg)](https://github.com/lexibank/vanuatuvoices/actions?query=workflow%3ACLDF-validation)
![Glottolog: 82%](https://img.shields.io/badge/Glottolog-82%25-yellowgreen.svg "Glottolog: 82%")
![Concepticon: 97%](https://img.shields.io/badge/Concepticon-97%25-green.svg "Concepticon: 97%")
![Source: 100%](https://img.shields.io/badge/Source-100%25-brightgreen.svg "Source: 100%")
![BIPA: 97%](https://img.shields.io/badge/BIPA-97%25-green.svg "BIPA: 97%")
![CLTS SoundClass: 97%](https://img.shields.io/badge/CLTS%20SoundClass-97%25-green.svg "CLTS SoundClass: 97%")

- **Varieties:** 236 (linked to 69 different Glottocodes)
- **Concepts:** 432 (linked to 319 different Concepticon concept sets)
- **Lexemes:** 45,255
- **Sources:** 7
- **Synonymy:** 1.02
- **Invalid lexemes:** 0
- **Tokens:** 283,273
- **Segments:** 264 (9 BIPA errors, 9 CLTS sound class errors, 254 CLTS modified)
- **Inventory size (avg):** 41.68
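
As a rough illustration of what a figure like **Synonymy** can mean, the sketch below computes, for each variety, the number of forms divided by the number of distinct concepts, and then averages over varieties. This definition is an assumption made for illustration, not a confirmed description of how the value above was produced, and the function name `synonymy_index` is hypothetical.

```python
from collections import Counter, defaultdict
from statistics import mean


def synonymy_index(rows):
    """Average number of forms per concept, taken per variety and then averaged.

    `rows` is an iterable of (language_id, concept_id) pairs, one per form.
    ASSUMPTION: this definition is illustrative only.
    """
    per_variety = defaultdict(Counter)
    for language_id, concept_id in rows:
        per_variety[language_id][concept_id] += 1
    # For each variety: total forms / distinct concepts.
    return mean(sum(c.values()) / len(c) for c in per_variety.values())


# Toy data: variety "a" has two forms for "hand" and one for "eye".
toy = [("a", "hand"), ("a", "hand"), ("a", "eye"), ("b", "hand")]
print(synonymy_index(toy))
```

Under this reading, a value close to 1 means that most concepts have a single form per variety.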

## Contributors

Name               | GitHub user     | Description                                                                                                                                                                                                                                                                                         | Role
---                | ---             |-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| ---
Lana Takau |  | Translated the Vanuatu study into Bislama, edited and transcribed all legacy recordings for Pentecost Island, recorded and transcribed languages from Pentecost and Malekula, assisted with editing and transcription of all languages                                                              | Author
Tom Fitzpatrick | |                                                                                                                                                                                                                                                                                                     | Author
Mary Walworth |  | Coordinator (since 2017) of the Vanuatu study and its expansion from Malekula island to other islands of Vanuatu. Recorded and transcribed data from Emae, Epi, and Efate. Directed fieldworkers to Malekula, Maewo, Ambae, Epi, and Pentecost and obtained legacy recordings for Pentecost Island. | Author
Aviva Shimelman |  | Recorded and transcribed Malekula languages from 2015 to 2017                                                                                                                                                                                                                                       | Author
Sandrine Bessis |  | Recorded and transcribed languages of Epi Island in 2019                                                                                                                                                                                                                                            | Author
Tom Ennever |  | Together with Iveth Rodriguez, recorded and transcribed languages from Ambae and Maewo Islands                                                                                                                                                                                                      | Author
Iveth Rodriguez |  | Together with Tom Ennever, recorded and transcribed languages from Ambae and Maewo Islands                                                                                                                                                                                                          | Author
Hans-Jörg Bibiko | @Bibiko | Programmer and maintainer of the data repository, assisted with data curation and data conversion | Author
Daria Dërmaku |  | Provided audio post-processing and mark-up from 2016 to 2020 | Author
Murray Garde |  | Provided the recordings of 6 dialects of the Sa language on Pentecost Island | Author
Marie-France Duhamel |  | Provided the recordings of the Raga and Lolkasai languages on Pentecost Island | Author
Giovanni Abete |  | Provided detailed phonetic transcription for many Malekula languages                                                                                                                                                                                                                                | Author
Laura Wägerle |  | Provided audio post-processing and mark-up from 2016 to 2018 | Author
Kaitip W. Kami |  | Coordinated and assisted with recordings of many Malekula languages                                                                                                                                                                                                                                 | Author
Tihomir Rangelov |  | Provided recordings and transcriptions for all Santo languages                                                                                                                                                                                                                                      | Author
Russell Gray |  | Co-Director of the Vanuatu Languages and Lifeways project (2016-2018) and Director of the Department of Linguistic and Cultural Evolution, which fully supported data collection and processing for the Vanuatu study.                                                                              | Author
Robert Forkel | @xrotwang | CLDF data conversion                                                                                                                                                                                                                                                                                | DataCurator
Johann-Mattis List |  | CLDF data conversion and orthography profile creation                                                                                                                                                                                                                                               | Other




## CLDF Datasets

The following CLDF datasets are available in [cldf](cldf):

- CLDF [Wordlist](https://github.com/cldf/cldf/tree/master/modules/Wordlist) at [cldf/cldf-metadata.json](cldf/cldf-metadata.json)
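
A minimal sketch of reading the wordlist with the Python standard library is given below. It assumes the forms table is stored as `cldf/forms.csv` and uses the CLDF Wordlist default column `Language_ID`; the actual file names and columns are declared in `cldf/cldf-metadata.json`, so check there first. The helper name `forms_per_variety` is hypothetical.

```python
import csv
from collections import Counter
from pathlib import Path


def forms_per_variety(forms_csv):
    """Count the forms recorded for each variety in a CLDF forms table."""
    counts = Counter()
    with open(forms_csv, newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            # ASSUMPTION: the table uses the default `Language_ID` column.
            counts[row["Language_ID"]] += 1
    return counts


if __name__ == "__main__":
    path = Path("cldf") / "forms.csv"  # ASSUMPTION: default file location
    if path.exists():
        # Show the five varieties with the most forms.
        for variety, n in forms_per_variety(path).most_common(5):
            print(variety, n)
```

For anything beyond simple counts, a CLDF-aware library that reads the metadata file directly is likely a better choice than parsing the CSV by hand.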