Dataset Open Access

Vietic 116 item phylogenetic lexicon

Sidwell, Paul; Alves, Mark


MARC21 XML Export

<?xml version='1.0' encoding='UTF-8'?>
<record xmlns="http://www.loc.gov/MARC21/slim">
  <leader>00000nmm##2200000uu#4500</leader>
  <datafield tag="041" ind1=" " ind2=" ">
    <subfield code="a">eng</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">Vietic, Austroasiatic, Swadesh list, phylogenetics, lexicostatistics, Nexus file</subfield>
  </datafield>
  <controlfield tag="005">20220221084300.0</controlfield>
  <datafield tag="500" ind1=" " ind2=" ">
    <subfield code="a">The dataset was created for a paper provisionally entitled "The Vietic Languages: A Phylogenetic Analysis". The paper has been submitted for journal publication, and a version has been submitted for presentation at ICAAL 9, November 2021. We encourage sharing for the purpose of testing or reproducing results, and for augmented or derived studies, under a Creative Commons Attribution licence.</subfield>
  </datafield>
  <controlfield tag="001">5263195</controlfield>
  <datafield tag="711" ind1=" " ind2=" ">
    <subfield code="d">18-19 November 2021</subfield>
    <subfield code="g">ICAAL 9</subfield>
    <subfield code="a">9th International Conference on Austroasiatic Linguistics</subfield>
    <subfield code="c">Lund, Sweden</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">Montgomery College</subfield>
    <subfield code="0">(orcid)0000-0001-7055-9182</subfield>
    <subfield code="a">Alves, Mark</subfield>
  </datafield>
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="s">199783</subfield>
    <subfield code="z">md5:913cb924c3c879c9047c9ba67bc374e0</subfield>
    <subfield code="u">https://zenodo.org/record/5263195/files/Vietic116lexicon.sources.nexus-file.xlsx</subfield>
  </datafield>
  <datafield tag="542" ind1=" " ind2=" ">
    <subfield code="l">open</subfield>
  </datafield>
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="y">Conference website</subfield>
    <subfield code="u">https://sites.google.com/site/icaalprojects/icaal-meetings/icaal-9-2021</subfield>
  </datafield>
  <datafield tag="260" ind1=" " ind2=" ">
    <subfield code="c">2021-08-26</subfield>
  </datafield>
  <datafield tag="909" ind1="C" ind2="O">
    <subfield code="p">openaire_data</subfield>
    <subfield code="p">user-dighl</subfield>
    <subfield code="o">oai:zenodo.org:5263195</subfield>
  </datafield>
  <datafield tag="100" ind1=" " ind2=" ">
    <subfield code="u">University of Sydney</subfield>
    <subfield code="0">(orcid)0000-0002-9162-5668</subfield>
    <subfield code="a">Sidwell, Paul</subfield>
  </datafield>
  <datafield tag="245" ind1=" " ind2=" ">
    <subfield code="a">Vietic 116 item phylogenetic lexicon</subfield>
  </datafield>
  <datafield tag="980" ind1=" " ind2=" ">
    <subfield code="a">user-digling</subfield>
  </datafield>
  <datafield tag="540" ind1=" " ind2=" ">
    <subfield code="u">https://creativecommons.org/licenses/by/4.0/legalcode</subfield>
    <subfield code="a">Creative Commons Attribution 4.0 International</subfield>
  </datafield>
  <datafield tag="650" ind1="1" ind2="7">
    <subfield code="a">cc-by</subfield>
    <subfield code="2">opendefinition.org</subfield>
  </datafield>
  <datafield tag="520" ind1=" " ind2=" ">
    <subfield code="a">&lt;p&gt;The file is a 116-item lexicostatistical dataset for classification of the Vietic languages. The set includes 30 Vietic doculects, Proto-Vietic, plus Khmu and Jahai as out-groups. Included are a listing of the sources and the NEXUS file with our cognate value assignments, which we created to run in SplitsTree to generate phylograms and NeighborNets. The 116-item list was the outcome of beginning with the Swadesh 100 and 200 lists and reconciling these with the available data, with the aim of achieving at least 80% coverage for each lect in the analysis. Procedurally, sources were selected and lexicons aggregated in a spreadsheet, with rows identified with Swadesh 100 and 200 items, subject to semantic and phonological adjustments as we judged necessary. For most of the languages, full coverage of the Swadesh 100 categories was not possible, with 20 or more gaps being common. Some 40 additional categories were added from the Swadesh 200 list, based on the 40 best-represented items in the aggregated data, seeking to achieve a 120-item list with coverage of at least 100 items for all lects, ultimately settling on 116 items.&lt;/p&gt;</subfield>
  </datafield>
  <datafield tag="773" ind1=" " ind2=" ">
    <subfield code="n">doi</subfield>
    <subfield code="i">isVersionOf</subfield>
    <subfield code="a">10.5281/zenodo.5263194</subfield>
  </datafield>
  <datafield tag="024" ind1=" " ind2=" ">
    <subfield code="a">10.5281/zenodo.5263195</subfield>
    <subfield code="2">doi</subfield>
  </datafield>
  <datafield tag="980" ind1=" " ind2=" ">
    <subfield code="a">dataset</subfield>
  </datafield>
</record>