Published April 29, 2023 | Version v1
Dataset Open

Wordlist files of lexical data from Papua New Guinea and western Solomons Oceanic languages collated for Ross's 1986 PhD thesis and 1988 publication thereof

Authors/Creators

  • 1. ANU

Description

It occurs to me that the files containing Western Oceanic lexical data that I collected in the late 1970s and early 1980s for my PhD (Ross 1988) might be useful to someone. They are also used in the volumes of The lexicon of Proto Oceanic (Ross, Pawley & Osmond 1998, 2003, 2008, 2011, 2016, 2023). In any case, it is right that they be made publicly available, something that wasn't so easy back then. Most of the material is from wordlists that I collected during fieldwork in Papua New Guinea from around 1978 to 1982. The file cor06 is omitted because it contains SE Solomonic data (outside Western Oceanic) drawn from Tryon & Hackman 1983.

I keyed the data into text files in a format such that each line was the entry for a single word, and each field within an entry was marked by a backslash code (I adapted this format from SIL's conventions at the time), then arranged them in cognate sets, each set separated from the next by an empty line. This work was done between 1983 and 1985, when text files were the best way to store data. They were entered on a terminal connected to a mainframe computer at the ANU. I have converted the ASCII symbols used in the original files into UTF-8 here in the interests of readability. The conversion was largely automatic, and I have not done a full check of each file, so there may be glitches.
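For anyone who wants to read the files programmatically, the layout just described (one entry per line, cognate sets separated by an empty line) can be split up along the following lines. This is a minimal sketch only: the function name split_cognate_sets is hypothetical, and it assumes that a blank or whitespace-only line is the sole separator between sets, which has not been checked against every file.

```python
def split_cognate_sets(text):
    """Group the non-empty lines of a file into cognate sets.

    Assumption: sets are separated by one or more empty
    (or whitespace-only) lines, as the description states.
    """
    sets, current = [], []
    for line in text.splitlines():
        if line.strip():
            current.append(line)
        elif current:
            # A blank line closes the set collected so far.
            sets.append(current)
            current = []
    if current:
        sets.append(current)
    return sets
```

Files should be opened with encoding="utf-8", since the converted files are UTF-8.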

Each file contains languages from a region, as listed below (the regions sometimes cut across subgroups determined by the comparative method). Three-letter abbreviations are used for language names, and two key files are also provided, one (COR-abbrevs) ordered by region (determined by the numerals that start each line), the other by alphabetical order of language name (COR-abbrevs-alph). Some three-letter codes are followed by a hyphen and an extra letter, which marks a dialect. For example, MUM stands for Mumeng and MUM-P for the Patep dialect of Mumeng.
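The abbreviation convention can be unpacked as in the sketch below. The function name split_language_code and the regular expression are hypothetical; the pattern assumes exactly three letters, optionally followed by a hyphen and one letter, as in the MUM and MUM-P examples. Codes in the files that depart from this pattern would raise an error.

```python
import re

# Assumed shape: three letters, then optionally "-" plus one dialect letter.
CODE_RE = re.compile(r"^([A-Za-z]{3})(?:-([A-Za-z]))?$")

def split_language_code(code):
    """Return (language, dialect); dialect is None when absent."""
    match = CODE_RE.match(code.strip())
    if match is None:
        raise ValueError(f"unexpected language code: {code!r}")
    return match.group(1), match.group(2)
```

With this, split_language_code("MUM-P") gives ("MUM", "P") and split_language_code("MUM") gives ("MUM", None).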

Data files are labelled with COR (for 'correspondence sets') plus a numeral. The numerals are: 1-3 New Ireland; 4 Willaumez Peninsula (New Britain) area; 5 NW Solomonic; 7+8 Papuan Tip; 9 Vitiaz Strait area and NG north coast; 10 Huon Gulf and Markham Valley; 11 South and west New Britain. 7+8 are partial only. When I keyed the files, I had to rely on a mainframe's nightly back-up onto tape spools. One night the system failed, and so did the restore, and I lost some data.
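The numeral-to-region key above can be held in a small lookup table, sketched here. The names REGIONS, PARTIAL and describe_file are hypothetical; the region labels are copied from the list above, and numeral 6 is deliberately absent because that file is not part of this deposit.

```python
# Region key for the COR data files, as listed in the description.
REGIONS = {
    1: "New Ireland",
    2: "New Ireland",
    3: "New Ireland",
    4: "Willaumez Peninsula (New Britain) area",
    5: "NW Solomonic",
    7: "Papuan Tip",
    8: "Papuan Tip",
    9: "Vitiaz Strait area and NG north coast",
    10: "Huon Gulf and Markham Valley",
    11: "South and west New Britain",
}

# Files described as partial because of the lost data.
PARTIAL = {7, 8}

def describe_file(numeral):
    """Return the region label for a COR file numeral."""
    if numeral not in REGIONS:
        raise KeyError(f"no COR file with numeral {numeral} in this deposit")
    label = REGIONS[numeral]
    return f"{label} (partial)" if numeral in PARTIAL else label
```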

The backslash codes in the data files are: \l language; \p protolanguage; \w word; \g gloss; \n note; \s source. The formatting of these files is a little odd, since they served as input to routines I wrote to pull out sound correspondences. Anything after '%' is the elicited form: what immediately precedes '%' has had something 'undone', e.g. metathesis.
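A rough sketch of how one entry line might be broken into its fields follows. It is not the routine I originally used, and several things are assumptions: that each code is followed by its value on the same line, that no value contains a backslash followed by one of the code letters, and that a \w field holds a single form with at most one '%'. The names FIELD_NAMES, parse_entry and split_form are hypothetical, and the sample forms in the usage note are invented, not taken from the data.

```python
import re

# The six backslash codes listed in the description.
FIELD_NAMES = {
    "l": "language",
    "p": "protolanguage",
    "w": "word",
    "g": "gloss",
    "n": "note",
    "s": "source",
}

# A backslash, one known code letter, then optional whitespace.
FIELD_RE = re.compile(r"\\([lpwgns])\b\s*")

def parse_entry(line):
    """Split one entry line into a dict keyed by field name."""
    parts = FIELD_RE.split(line.strip())
    # parts[0] is any text before the first code; after that the list
    # alternates between a code letter and that field's value.
    entry = {}
    for code, value in zip(parts[1::2], parts[2::2]):
        entry[FIELD_NAMES[code]] = value.strip()
    return entry

def split_form(word):
    """Separate the analysed form from the elicited form at '%'."""
    before, sep, after = word.partition("%")
    if sep:
        return {"analysed": before.strip(), "elicited": after.strip()}
    # No '%': the form was entered as elicited.
    form = word.strip()
    return {"analysed": form, "elicited": form}
```

For an invented line such as r"\l MUM-P \w abc%acb \g made-up gloss", parse_entry returns the language, word and gloss fields, and split_form applied to the word field separates "abc" (with the change undone) from "acb" (as elicited).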

The orthography of the files is phonemic and largely obvious. The conventions are set out in the introductions to the volumes of The lexicon of Proto Oceanic.

Finally, the files also contain reconstructions at various interstages at the top of a cognate set. These were inserted for heuristic reasons during my research. Many of them did not survive into my PhD thesis, and they should preferably be ignored. The reader who is interested in current Oceanic reconstructions should turn to the volumes of The lexicon of Proto Oceanic.

Files (3.3 MB)

Checksum  Size
md5:c4258504066473408a030122ae08105b  7.8 kB
md5:44956e41504f69071575bdf489057cb6  7.8 kB
md5:598932c2cc6c35c1196e9bb5e5386a1b  123.5 kB
md5:bc4ba497c5f091d3786bf5c428690f8d  177.5 kB
md5:97871ca2f55157f703b5df89d45f2d08  156.8 kB
md5:c7ff1d9c7555a340a6460dd8e60258af  169.4 kB
md5:6bb67cc8ca2e79e1e3770d71374b5869  502.8 kB
md5:6d2d57f9d749983b280935da44acdeef  394.0 kB
md5:ae696bd7d9e44982184e9ab4946472dc  771.0 kB
md5:38b02ab6dda51e71e9bda19737335eb5  690.6 kB
md5:271f799d623010430bc3460ccd586a30  323.9 kB
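
A downloaded file can be checked against the MD5 checksums listed above with a few lines of Python. This is a sketch using the standard hashlib module; the function names md5_of and matches_checksum are hypothetical, and the path passed in is whatever name the file has on your machine.

```python
import hashlib

def md5_of(path, chunk_size=65536):
    """Return the hexadecimal MD5 digest of the file at path."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so large files need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def matches_checksum(path, listed):
    """Compare a file against a listed value such as 'md5:c425...'."""
    expected = listed.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of(path) == expected
```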

Additional details

References

  • Ross, Malcolm (1988). Proto Oceanic and the Austronesian languages of western Melanesia. Canberra: Pacific Linguistics. (Pacific Linguistics C-98)
  • Tryon, Darrell & B.D. Hackman (1983). Solomon Islands languages: an internal classification. Canberra: Pacific Linguistics. (Pacific Linguistics C-72)
  • Ross, Malcolm, Andrew Pawley & Meredith Osmond, eds (1998). The lexicon of Proto Oceanic: The culture and environment of ancestral Oceanic society, 1: Material culture. Canberra: Pacific Linguistics. (Pacific Linguistics C-152) https://openresearch-repository.anu.edu.au/handle/1885/106908
  • Ross, Malcolm, Andrew Pawley & Meredith Osmond, eds (2003). The lexicon of Proto Oceanic: The culture and environment of ancestral Oceanic society, 2: The physical environment. Canberra: Pacific Linguistics. (Pacific Linguistics 545) https://openresearch-repository.anu.edu.au/handle/1885/106908
  • Ross, Malcolm, Andrew Pawley & Meredith Osmond, eds (2008). The lexicon of Proto Oceanic: The culture and environment of ancestral Oceanic society, 3: Plants. Canberra: Pacific Linguistics. (Pacific Linguistics 599) https://openresearch-repository.anu.edu.au/handle/1885/106908
  • Ross, Malcolm, Andrew Pawley & Meredith Osmond, eds (2011). The lexicon of Proto Oceanic: The culture and environment of ancestral Oceanic society, 4: Animals. Canberra: Pacific Linguistics. (Pacific Linguistics 621) https://openresearch-repository.anu.edu.au/handle/1885/106908
  • Ross, Malcolm, Andrew Pawley & Meredith Osmond, eds (2016). The lexicon of Proto Oceanic: The culture and environment of ancestral Oceanic society, 5: People: Body and mind. Canberra: Pacific Linguistics. (Asia-Pacific Linguistics 28)
  • Ross, Malcolm, Andrew Pawley & Meredith Osmond, eds (2023). The lexicon of Proto Oceanic: The culture and environment of ancestral Oceanic society, 6: People: Society. Canberra: Dept of Linguistics, ANU College of Asia and the Pacific, The Australian National University.