VanBib: A database of references to resources on Vanuatu languages
Contributors
Annotator:
Data collectors:
- Rangelov, Tihomir (1, 2)
- Ridge, Eleanor (3)
- François, Alexandre (5)
- Levisen, Carsten (6)
- Gooskens, Charlotte (7)
- Schneider, Cindy (8)
- Krauße, Daniel (9, 10, 11)
- Pearce, Elizabeth
- Willans, Fiona (12)
- Hopperdietzel, Jens
- Duhamel, Marie-France
- Franjieh, Michael (13)
- Meyerhoff, Miriam (14, 15)
- Séverin, Noémie
- Clark, Ross (16)
- Bryant, Mike (17)
- 1. Max Planck Institute for Evolutionary Anthropology
- 2. University of Waikato
- 3. Massey University
- 4. University of Erfurt
- 5. Centre National de la Recherche Scientifique
- 6. Roskilde University
- 7. University of Groningen
- 8. University of New England
- 9. University of Oxford
- 10. Mango Languages
- 11. Germany Indonesia Professionals GmbH
- 12. University of the South Pacific
- 13. University of Surrey
- 14. University of Oxford All Souls College
- 15. Victoria University of Wellington
- 16. University of Auckland
- 17. SIL Vanuatu
Description
Tihomir Rangelov & Eleanor Ridge (editing, curating, data collection, entry, annotation, cleanup)
References were also contributed by Alex François, Carsten Levisen, Charlotte Gooskens, Cindy Schneider, Daniel Krauße, Elizabeth Pearce, Fiona Willans, Jens Hopperdietzel, Marie-France Duhamel, Michael Franjieh, Mike Bryant, Miriam Meyerhoff, Noemie Severin and Ross Clark.
Hazel Ho helped with data entry, annotation and cleanup.
version 1.2, 25 September 2025
===Intro===
The document VanBib-Vanuatu-references-1_2.tsv is a database of references to publications relevant to the languages of Vanuatu: all Indigenous Oceanic languages, Bislama, as well as English and French (as relevant to the Vanuatu context).
The file VanBib-Vanuatu-references-1_2.xlsx contains identical information in a format that can be explored in Microsoft Excel with less chance of conversion errors. Empty cells in the .xlsx file are left empty for readability; in the .tsv file they are tagged with 'NA'.
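As a rough illustration of the 'NA' convention, the sketch below reads tab-separated rows with Python's standard csv module and turns 'NA' cells into None. The function name read_vanbib and the sample rows are invented for illustration; the UTF-8 encoding and default quoting behaviour mentioned in the comments are assumptions about the file, not documented properties.

```python
import csv
import io


def read_vanbib(handle):
    """Read tab-separated rows and turn the 'NA' placeholder into None."""
    reader = csv.DictReader(handle, delimiter="\t")
    rows = []
    for row in reader:
        rows.append({key: (None if value == "NA" else value)
                     for key, value in row.items()})
    return rows


# Invented sample rows (placeholders, not real VanBib entries).
sample = (
    "author\tyear\ttitle\tdoi\n"
    "Author A\t2001\tTitle one\tNA\n"
    "Author B\tNA\tTitle two\tNA\n"
)
rows = read_vanbib(io.StringIO(sample))

# With the real file one might write (encoding is an assumption):
# with open("VanBib-Vanuatu-references-1_2.tsv", encoding="utf-8", newline="") as f:
#     rows = read_vanbib(f)
```

If titles in the real file contain unbalanced double quotes, the default quoting rules of csv may need adjusting; this has not been checked against the file.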
VanBib’s aim is to help speakers, researchers, activists, and other interested parties to find information about the languages of Vanuatu. This database constitutes work in progress and is certainly not exhaustive, but we hope it can act as a first step in such endeavours. Currently, the database contains 3562 entries. These are not all unique references. Some references may be listed more than once because they may have been tagged for different languages on different rows (see below). Around a thousand entries refer to short or long wordlists for different languages/doculects, which are part of larger wordlist collections; these individual wordlists may or may not be considered as separate works/publications. Given the latter two points, the database contains at least 1500 unique references.
We have tried to include as much information as possible for each reference. Besides standard fields, such as author, year, title, editors, publisher, URL, etc., VanBib also has fields for relevant language(s), as well as type of reference (see below).
VanBib’s utility is that users can filter/sort/search the references in order to obtain a list of references for specific languages, topics, authors, years, etc.
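The kind of filtering and sorting described above can also be done outside a spreadsheet. The following is a minimal sketch, assuming the rows have already been loaded as dictionaries keyed by column name; the helper find_references and the sample rows are hypothetical and only the column names (author, year, title, hhtype) come from the database.

```python
def find_references(rows, column, needle):
    """Return rows whose `column` contains `needle` (case-insensitive), sorted by year."""
    needle = needle.lower()
    hits = [row for row in rows if needle in (row.get(column) or "").lower()]
    # Years are compared as strings; missing years sort first.
    return sorted(hits, key=lambda row: row.get("year") or "")


# Invented sample rows (placeholders, not real VanBib entries).
sample_rows = [
    {"author": "Author A", "year": "2010", "title": "Title one", "hhtype": "grammar"},
    {"author": "Author B", "year": "1995", "title": "Title two", "hhtype": "wordlist"},
    {"author": "Author A", "year": "2001", "title": "Title three", "hhtype": "dictionary"},
]

by_author = find_references(sample_rows, "author", "author a")
wordlists = find_references(sample_rows, "hhtype", "wordlist")
```

Substring matching is a deliberate simplification here; exact matching may be preferable for columns such as glottocode.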
It is clear that this is not a complete list of all relevant works, as many works may not be publicly available, or may not have been catalogued and/or digitised. There are still many relevant works that are hiding in libraries, archives, hard drives, memory sticks, notebooks and loose pieces of paper. Furthermore, some works may have restricted access for various reasons, including the considerations and interests of speaker communities.
In other words, we reiterate that we do not make any claims for completeness. This should be viewed as work in progress and we expect this database to be updated regularly, as more submissions of both old and new references are made. See below how you can contribute.
Version 1.2 of this database was used in the analysis of linguistic work in Vanuatu by Rangelov, Ridge & Takau (in press); see also §2 of the main text of that chapter for more details.
The database has been curated and edited by Tihomir Rangelov and Eleanor Ridge. Hazel Ho helped with data cleanup and entering semi-automatically processed entries from Lynch & Crowley (2001) and data from the Global Bible Catalogue.
Work on each entry is acknowledged in the added_by column of the database (TR = Tihomir Rangelov, ER = Eleanor Ridge, HH = Hazel Ho). In late 2024 we issued a call for colleagues to add missing entries to the database (https://groups.google.com/g/vanuatu-languages/c/95fvq6bkIgQ). The colleagues who kindly responded with new entries are acknowledged by full name in the added_by column. These are: Alex François, Carsten Levisen, Charlotte Gooskens, Cindy Schneider, Daniel Krauße, Elizabeth Pearce, Fiona Willans, Jens Hopperdietzel, Marie-France Duhamel, Michael Franjieh, Miriam Meyerhoff, Noemie Severin and Ross Clark. We also thank Mike Bryant for providing many references to Bible translations.
===Where the references come from===
The references in VanBib come from various sources (indicated in the column reference_source):
* Glottolog's (Hammarström et al. 2024) list of references tagged with a glottocode for a Vanuatu Oceanic language or Bislama. Some errors in those references have been corrected manually, e.g. some references that were erroneously tagged for a Vanuatu language were removed; these were mostly tagged with the languages Mavea, Tolomako and Bierebo.
* The 24-page reference list in Lynch & Crowley (2001). We digitised and formatted these data semi-automatically. We manually checked all of them to ensure that all information is in the relevant columns. Some errors inevitably remain, but the most important data points (author, year and title) should be reliably in their right places.
* The lists of works written in Vanuatu languages from Lynch & Crowley (2001). In their description of individual languages, Lynch and Crowley provide a list of works written in Vanuatu's languages. These are mostly scripture translations, literacy materials and similar. These were OCRed, entered mostly manually and tagged for language.
* References contributed by the authors of the database and various other colleagues (see above).
===Important notes===
VanBib’s users should keep in mind the following:
1. Some references are listed more than once. This is because a work may relate to more than one language, and we have aimed to have a separate listing for each relevant language. This is an approach used in the Glottolog database and we have adopted it for the other references too. In some cases, the same work may be listed more than once and tagged for different languages on each row. In other cases, a work that is relevant to more than one language may list the relevant glottocodes separated by commas on the same row.
2. The languages. Most references are tagged with the glottocode (Hammarström et al. 2024) of a language or languages associated with it. The glottocode may be for a Glottolog "language" or "dialect".
For better readability, a language name is listed in the language_name column. The linguonyms in this column are based on Rangelov & Ridge (2025), i.e. a name in this column may differ from the default Glottolog language name where there may be a different preferred name, e.g. due to community preferences.
In the language_name_source column, we have aimed to include the linguonym that appears in the source, as much as possible. If we were unable to verify this, the value in this column may default to the Glottolog language name for sources imported from Glottolog, or be left empty.
Alternative linguonyms and other metadata from Rangelov & Ridge (2025) are listed in the rightmost columns to facilitate filtering, sorting and searching the database.
Some works that deal with all or most Vanuatu/Oceanic/Austronesian/Pacific/etc. languages may not have been tagged with a Glottocode at all, or have a Glottocode for an internal node on the language tree (i.e. a sub-family of languages).
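Notes 1 and 2 have practical consequences for counting and grouping: the same work can occupy several rows, and one glottocode cell can hold several comma-separated codes or be empty. The sketch below shows one possible way to handle both. The function names, the choice of (author, year, title) as a duplicate key, and the placeholder glottocodes are all assumptions made for illustration, not part of the database.

```python
from collections import defaultdict


def unique_references(rows):
    """Collapse rows sharing author, year and title into one reference each."""
    seen = {}
    for row in rows:
        key = (row.get("author"), row.get("year"), row.get("title"))
        seen.setdefault(key, row)  # keep the first row for each key
    return list(seen.values())


def split_glottocodes(cell):
    """Split a glottocode cell on commas; empty or missing cells give []."""
    if not cell:
        return []
    return [code.strip() for code in cell.split(",") if code.strip()]


def index_by_glottocode(rows):
    """Map each glottocode to the rows tagged with it."""
    index = defaultdict(list)
    for row in rows:
        for code in split_glottocodes(row.get("glottocode")):
            index[code].append(row)
    return dict(index)


# Invented sample rows with placeholder glottocodes (not real entries).
sample_rows = [
    {"author": "Author A", "year": "2001", "title": "Title one", "glottocode": "aaaa0001"},
    {"author": "Author A", "year": "2001", "title": "Title one", "glottocode": "bbbb0002"},
    {"author": "Author B", "year": "2010", "title": "Title two", "glottocode": "aaaa0001, cccc0003"},
    {"author": "Author C", "year": "1999", "title": "Title three", "glottocode": None},
]

unique = unique_references(sample_rows)
index = index_by_glottocode(sample_rows)
```

Whether individual wordlists from a larger collection should count as separate works is left open in the text above, so any duplicate key of this kind gives only an approximate count.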
3. Hhtypes and hhsubtypes. We adopted Glottolog's hhtype to indicate the topic of a work. We also introduced 'hhsubtypes' which we deemed necessary for our work on Rangelov, Ridge & Takau (in press). The following are Glottolog hhtypes, which also appear in VanBib: grammar, grammar_sketch, phonology, dictionary, wordlist, specific_feature, socling, text, comparative, ethnographic, dataset, corpus, bibliographical, dialectology, minimal, overview. We have also added the hhtype 'mm_corpus', which stands for 'multimedia corpus', i.e. a language corpus that consists of audio and/or video recordings, and the hhtype 'bible' which stands for Bible translations. For some references from Glottolog, we have changed the inherited hhtype tags. In some cases, we did this to fix errors. In other cases, we changed the tags to fit our criteria, which may have differed from Glottolog's (see also Rangelov, Ridge & Takau in press, §2). This is relevant to the following cases:
- grammars: our definition of 'grammar' is a reference work that covers at least the phonology, morphology, and syntax of a language and has a length of at least 250 pages.
- grammar_sketches: hhsubtype 'long' for grammatical descriptions of 100-250 pages, and hhsubtype 'short' for such works of less than 100 pages.
- dictionaries: those works that provide at least some examples and grammatical information, usually longer than a few hundred entries.
- wordlists: other descriptions of the lexicon that do not provide substantial detail, other than words/phrases and glosses. These are tagged with the hhsubtype 'long' when they have more than 500 entries, and 'short' when they have fewer than 500 entries.
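The length criteria in this list can be written out as simple threshold checks. The sketch below is an illustration only: the function names are invented, the grammatical-description function assumes the coverage condition (phonology, morphology and syntax) has already been judged separately, and a work of exactly 250 pages is treated as a grammar because of the "at least 250 pages" wording. A wordlist of exactly 500 entries is not covered by the stated criteria, so the function returns None in that case.

```python
def tag_by_length(pages):
    """Map a page count to the (hhtype, hhsubtype) suggested by length alone.

    Assumes the work is a grammatical description that already meets the
    coverage condition; length is only one part of the stated criteria.
    """
    if pages >= 250:
        return ("grammar", None)
    if pages >= 100:
        return ("grammar_sketch", "long")
    return ("grammar_sketch", "short")


def wordlist_subtype(entries):
    """Return 'long' or 'short' for a wordlist; None for exactly 500 entries."""
    if entries > 500:
        return "long"
    if entries < 500:
        return "short"
    return None  # boundary case not specified in the criteria
```

For example, tag_by_length(180) gives ("grammar_sketch", "long") and wordlist_subtype(120) gives "short".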
Additionally, for some hhtypes that usually constitute other types of linguistic research (e.g. the hhtypes specific_feature, socling, phonology, dialectology, comparative), we have included an hhsubtype that suggests the size of the work/depth of analysis. Such hhsubtypes are: monograph, thesis, article, chapter, presentation. These largely overlap with the entry_type value.
For references tagged with the hhtype mm_corpus, we use the following hhsubtypes to reflect annotation levels: 'none' (no time-aligned annotations), 'transcription' (time-aligned transcription only), 'translation' (time-aligned translation, usually on top of transcription), 'interlinearisation' (interlinearised glossing).
For references tagged with the hhtype bible, the following hhsubtypes have been used: full_bible, bible_nt (New Testament only), bible_nt_audio (New Testament and audio), bible_in_progress (Bible translation in progress), bible_part (only part of a testament).
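The hhtype/hhsubtype combinations described in note 3 can be collected into a lookup table, which may be useful for spotting unexpected tags. This is a sketch, not a validation rule used by the authors: the names LISTED_SUBTYPES and subtype_is_listed are hypothetical, and because the list of hhtypes that take size-related subtypes is introduced with "e.g.", the table may be incomplete.

```python
SIZE_SUBTYPES = {"monograph", "thesis", "article", "chapter", "presentation"}

# hhtype -> hhsubtypes named in the description above (possibly incomplete).
LISTED_SUBTYPES = {
    "grammar_sketch": {"long", "short"},
    "wordlist": {"long", "short"},
    "specific_feature": SIZE_SUBTYPES,
    "socling": SIZE_SUBTYPES,
    "phonology": SIZE_SUBTYPES,
    "dialectology": SIZE_SUBTYPES,
    "comparative": SIZE_SUBTYPES,
    "mm_corpus": {"none", "transcription", "translation", "interlinearisation"},
    "bible": {"full_bible", "bible_nt", "bible_nt_audio",
              "bible_in_progress", "bible_part"},
}


def subtype_is_listed(hhtype, hhsubtype):
    """True/False if the hhtype has listed subtypes; None if none are listed."""
    allowed = LISTED_SUBTYPES.get(hhtype)
    if allowed is None:
        return None
    return hhsubtype in allowed
```

Note that the mm_corpus subtype 'none' is a literal tag meaning no time-aligned annotations, and should not be confused with an empty cell.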
4. Inevitably, there are remaining errors in the database, either inherited from the original sources, introduced during semi-automatic data entry or introduced by mistake. For some columns more efforts are needed to enter relevant tags, e.g. url, doi, pages, inlg (the language in which the work was written). We have prioritised correcting inherited or introduced errors in the following columns: glottocode, hhtype, hhsubtype, author, year, title.
It is our hope that the authors of works, or experts on specific languages and topics, will engage with this database by adding references and correcting errors.
Please suggest edits and additions by leaving a comment in the relevant cell of the collaborative online spreadsheet, where this database is continuously updated between version releases. Alternatively, you can email the authors.
The two files in this record were generated from the underlying collaborative online spreadsheet using this R code.
===References===
Hammarström, Harald & Forkel, Robert & Haspelmath, Martin & Bank, Sebastian. 2024. Glottolog 5.1. Leipzig: Max Planck Institute for Evolutionary Anthropology. https://doi.org/10.5281/zenodo.14006617
(Available online at http://glottolog.org, Accessed on 2025-05-09.)
Lynch, John & Crowley, Terry. 2001. Languages of Vanuatu: A new survey and bibliography. Canberra: Pacific Linguistics.
Rangelov, Tihomir & Ridge, Eleanor & Takau, Lana. in press. Linguistics in Vanuatu 45 years after Independence. To appear in the Special issue on Vanuatu languages of Te Reo: The Journal of the Linguistic Society of New Zealand.
Rangelov, Tihomir & Ridge, Eleanor. 2025. A database of Vanuatu language names, version 1.1. Zenodo. https://doi.org/10.5281/zenodo.17198232
Files
Total size: 2.6 MB
MD5 checksum | Size |
---|---|
md5:04ab93ee2585e899065642529bf6aedf | 1.9 MB |
md5:ac3b3022a50d8d8831f7820c08252344 | 632.7 kB |
Additional details
Related works
- Cites: Dataset: 10.5281/zenodo.17198232 (DOI)
- Has metadata: Dataset: 10.5281/zenodo.17198232 (DOI)
- Is compiled by: Computational notebook: 10.5281/zenodo.17204293 (DOI)