Dataset Open Access

Polynesian Segmented Data

Walworth, Mary

Presented here are segmented data for 210 basic vocabulary concepts in 31 Polynesian languages. The data were extracted from the Austronesian Basic Vocabulary Database (Greenhill et al. 2008) and then verified and edited for accuracy, relying on both my knowledge of Polynesian languages and existing source material from published dictionaries and grammars (see references below). Once the data were verified, the numerous transcription conventions used in the original data were converted into consistent IPA forms using existing phonetic and phonological descriptions of these languages. With the data in a consistent and reliable format, I went through each of the 210 concepts and segmented the lexical items into morphemes using EDICTOR. The purpose of the segmentation task was to isolate base forms by marking affixes, reduplication, and compounds, in order to optimise cognate detection. This process also resulted in the addition of partial cognates, yielding a richer, more comprehensive data set.
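As an illustration of what a segmented form can look like, the sketch below splits a space-separated string of sound segments into morphemes. It assumes that morpheme boundaries are marked with a "+" symbol between segments; that convention, the function name split_morphemes, and the example string are assumptions made for illustration and are not taken from the dataset files.

```python
def split_morphemes(tokens, boundary="+"):
    """Split a space-separated segment string into a list of morphemes.

    `boundary` is the assumed morpheme-boundary marker; check the actual
    data files before relying on it.
    """
    morphemes = [[]]
    for segment in tokens.split():
        if segment == boundary:
            # A boundary marker starts a new morpheme.
            morphemes.append([])
        else:
            morphemes[-1].append(segment)
    # Drop empty morphemes caused by leading, trailing, or doubled markers.
    return [m for m in morphemes if m]


# Hypothetical placeholder form (prefix + base), not a row from the dataset.
example = "f a k a + m a t e"
parts = split_morphemes(example)
print(["".join(m) for m in parts])
```

Keeping each morpheme as its own list of segments makes it straightforward to compare base forms across languages while ignoring affixes or reduplicated material.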

Language codes in the dataset refer to the following languages:

Anuta_253               Anuta
Austral_1213            Reo Ra’ivavae
Austral_128             Rurutuan
EastFutuna_210          East_Futunan
Emae_1030               Emae
FutunaAniwa_156         Futuna-Aniwa
Hawaiian_52             Hawaiian
Kapingamarangi_217      Kapingamarangi
Luangiua_238            Luangiua
Mangareva_239           Reo Mangareva
Maori_85                Maori
Mele-Fila_1163          Mele_Fila
Niuean_247              Niuean
NorthMarquesan_38       Marquesan
Nukuria_1212            Nukeria
Penrhyn_235             Penrhyn
Polynesian_658          Proto-Polynesian
Pukapuka_152            Pukapuka
RakahangaManihiki_589   Rakahanga-Manihiki
Rapanui_264             Rapanui
Rarotongan_58           Rarotongan
RennellBellona_206      Rennell-Bellona
Samoan_118              Samoan
Sikaiana_243            Sikaiana
Tahitian_173            Tahitian
Tikopia_155             Tikopia
TongaTongaIslands_136   Tongan
Tuamotuan_246           Tuamotuan
Tuvalu_753              Tuvalu
VaeakauTaumako_375      Vaeakau Taumako
Wallisian_258           Wallisian
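When working with these codes programmatically, it can help to separate the label from its numeric suffix. The sketch below does this with a regular expression. It assumes every code has the shape label, underscore, digits, as in the list above; the function name parse_language_code is hypothetical, and reading the number as an identifier from the source database is an assumption that the description does not state.

```python
import re


def parse_language_code(code):
    """Split a code such as 'Anuta_253' into its label and numeric suffix."""
    match = re.fullmatch(r"(.+)_(\d+)", code)
    if match is None:
        raise ValueError(f"unexpected language code format: {code!r}")
    # The label may itself contain hyphens (e.g. 'Mele-Fila').
    return match.group(1), int(match.group(2))


for code in ["Anuta_253", "Mele-Fila_1163", "TongaTongaIslands_136"]:
    label, number = parse_language_code(code)
    print(label, number)
```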



(2017). Dictionary of Cook Island Languages.

(2017). Fare Vāna’a Dictionary.

(2017). nā puke wehewehe ʻōlelo Hawai‘i.

Buse, J. (1996). Cook Islands Maori dictionary with English-Cook Islands Maori finderlist. Canberra, Pacific Linguistics.

Capell, A. (1962). The Polynesian Language of Mae (Emwae), New Hebrides. Auckland, Linguistic Society of New Zealand.

Churchill, W. (1912). Easter Island: the Rapanui speech and the peopling of southeast Polynesia. Washington, DC, The Carnegie Institution of Washington.

Clark, R. (1998). A dictionary of the Mele language (Atara Imere), Vanuatu. Canberra, Pacific Linguistics.

Donner, W. (1987). Sikaiana Vocabulary: Na male ma na talatala o Sikaiana. Honiara, Solomon Islands.

Dordillon, M. R. I. (1931-32). Grammaire et Dictionnaire de la Langue des Iles Marquises. Paris, Institut d'Ethnologie.

Dougherty, J. W. D. (1983). West Futuna-Aniwa: An Introduction to a Polynesian Outlier Language. Berkeley, University of California Press.

Elbert, S. H. (1975-81). Dictionary of the Language of Rennell and Bellona. Copenhagen, The National Museum of Denmark.

Feinberg, R. (1977). The Anutan language reconsidered: Lexicon and grammar of a Polynesian Outlier. New Haven, Human Relations Area Files Press.

Firth, R. (1985). Tikopia-English Dictionary/Taranga Fakatikopia ma Taranga Fakainglisi. Auckland, Auckland University Press.

Greenhill, S.J. & R. Clark. (2011). POLLEX-Online: The Polynesian Lexicon Project Online. Oceanic Linguistics, 50(2), 551-559.

Greenhill, S.J., Blust, R., & Gray, R.D. (2008). The Austronesian Basic Vocabulary Database: From Bioinformatics to Lexomics. Evolutionary Bioinformatics, 4, 271-283.

Hollyman, K. J. (1987). De Muna Fagauvea I: Dictionnaire fagauvea-français. Auckland, Linguistic Society of New Zealand.

Jackson, G. (2001). Tuvaluan Dictionary. Suva, Oceania Printers.

Janeau, V.F. (1908). Essai de grammaire de la langue des îles Gambier ou Mangaréva. Paris, Chadenat.

Kieviet, P. (2017). A Grammar of Rapa Nui. Berlin, Language Science Press.

Lieber, M. D. and K. H. Dikepa (1974). Kapingamarangi Lexicon. Honolulu, University Press of Hawaii.

Milner, G. B. (1966). Samoan Dictionary. London, Oxford University Press.

Moorfield, J.C. (2017). Te Aka Online Māori Dictionary.

Moyse-Faurie, C. (1993). Dictionnaire futunien-français. Paris, Peeters.

Næss, A. & E. Hovdhaugen. (2011). A Grammar of Vaeakau-Taumako. Berlin, de Gruyter Mouton.

Pukui, M. K. and S. H. Elbert (1986). Hawaiian Dictionary. Honolulu, University of Hawaii Press.

Salisbury, K. (2017). Personal communication on transcription.

Sperlich, W. (1997). Tohi Vagahai Niue. Honolulu, University of Hawai‘i Press.

Stimson, J. F. (1964). A Dictionary of Some Tuamotuan Dialects of the Polynesian Language. The Hague, Martinus Nijhoff.

Tregear, E. (1899). A Dictionary of Mangareva (or Gambier Islands). Wellington, Government Printer.

Walworth, M. Unpublished fieldnotes on Reo Mangareva, 2015-2017.

Walworth, M. Unpublished fieldnotes on Rurutuan, 2010.

Walworth, M. Unpublished fieldnotes on Reo Ra'ivavae, 2010-2012.

Weber, R.L. & N. Weber. (1995). Rapanui, in Darrell T. Tryon (ed.), Comparative Austronesian Dictionary. Berlin, Mouton de Gruyter.


