Dataset Open Access

A part-of-speech (POS) lexicon of Classical Tibetan for NLP

Hill, Nathan W.; Garrett, Edward

JSON-LD Export

{
  "description": "<p>This part-of-speech (POS) lexicon of Classical Tibetan was prepared in the course of the research project &#39;Tibetan in Digital Communication&#39; (2012-2015) hosted at SOAS, University of London and funded by the UK&#39;s Arts and Humanities Research Council (grant code: AH/J00152X/1). The data for verbs comes from a digitized version of <em>A Lexicon of Tibetan Verb Stems as Reported by the Grammatical Tradition</em> (Munich: Bayerische Akademie der Wissenschaften, 2010) by Nathan W. Hill. Otherwise data comes from the manually part-of-speech tagged training data produced by the corpus and a few lexical items specifically added by hand to improve rule based tagging.</p>", 
  "license": "", 
  "creator": [
    {
      "affiliation": "SOAS, University of London", 
      "@id": "", 
      "@type": "Person", 
      "name": "Hill, Nathan W."
    }, 
    {
      "affiliation": "SOAS, University of London", 
      "@id": "", 
      "@type": "Person", 
      "name": "Garrett, Edward"
    }
  ], 
  "url": "", 
  "datePublished": "2017-05-11", 
  "keywords": [
    "Tibetan language", 
    "Natural language processing", 
    "part-of-speech tagging"
  ], 
  "@context": "", 
  "distribution": [
    {
      "contentUrl": "", 
      "encodingFormat": "zip", 
      "@type": "DataDownload"
    }
  ], 
  "identifier": "", 
  "@id": "", 
  "@type": "Dataset", 
  "name": "A part-of-speech (POS) lexicon of Classical Tibetan for NLP"
}

                   All versions   This version
Views              278            278
Downloads          93             93
Data volume        8.2 MB         8.2 MB
Unique views       258            258
Unique downloads   92             92


Cite as