Visual Dictionary and Thesaurus of Buddhist Sanskrit
https://mangalamresearch.shinyapps.io/VisualDictionaryOfBuddhistSanskrit/

This dataset is based on Lugli's Buddhist Sanskrit Corpus (DOI: 10.5281/zenodo.3457821): https://zenodo.org/record/3903262

Annotated Lexical Dataset structure by column:

lemma = word used to retrieve concordances
sID = sentence number. a sentence is defined as either a danda-delimited string or a stanza
page = the page number in the edition, as found in the digitized version of the text. if the digitized edition has no page numbers, page = 0
ref = alphanumeric string with either page info or title+sID. we are switching to title+sID.
kwic = concordance line
transl = translation from published sources
domain = conceptual domain 
case_or_voice = grammatical case (noun) or voice/verb form (verbs)
number = grammatical number
sem.field = semantic or conceptual field. see the documentation of the Visual Dictionary and Thesaurus of Buddhist Sanskrit for details
sem.cat = semantic category. see the documentation of the Visual Dictionary and Thesaurus of Buddhist Sanskrit for details
sense = a sense label chosen by us
subsense = a fine-grained sense label chosen by us
sem.pros = semantic prosody. see the documentation of the Visual Dictionary and Thesaurus of Buddhist Sanskrit for details
uncertainty = flags the semantic annotation as uncertain and specifies the reason for the uncertainty. see the documentation of the Visual Dictionary and Thesaurus of Buddhist Sanskrit for details
cotext = list of word stems co-occurring with the lemma in the concordance (automatically stemmed, may contain errors)
cols 17-31 = syntactic dependencies and other relations between the lemma and cotext items
PoS = part of speech
Status = whether the portion of the dataset devoted to a lemma needs revision and when it was last revised
sem.pros.head = the cotext item that drives the sem.pros annotation. used after summer 2020.
UID = unique sentence identifier
cols 35-44 = conceptual relations between the lemma and cotext items
Portrait = value 'yes' indicates that we have published a 'lexical portrait' for the lemma. the URL of the lexical portrait is https://mangalamresearch.shinyapps.io/LexicalPortrait_<Lemma>/ , e.g. https://mangalamresearch.shinyapps.io/LexicalPortrait_Vitarka/
ExScore = a number indicating how good a dictionary example a sentence makes.
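
As an illustration of two of the conventions above (page = 0 meaning "no page number", and the lexical portrait URL pattern), here is a minimal Python sketch. The sample rows, the comma delimiter, and the helper names portrait_url and page_or_none are assumptions made for the example and are not part of the dataset. The capitalisation of the first letter of the lemma in the URL is inferred from the Vitarka example only.

```python
import csv
import io

PORTRAIT_BASE = "https://mangalamresearch.shinyapps.io/LexicalPortrait_"


def portrait_url(lemma):
    """Build a lexical portrait URL following the pattern LexicalPortrait_<Lemma>/.

    Assumption: the first letter of the lemma is capitalised in the URL,
    as in the Vitarka example; this is not confirmed for every lemma.
    """
    return f"{PORTRAIT_BASE}{lemma[:1].upper()}{lemma[1:]}/"


def page_or_none(page):
    """Convert the page column to int; page = 0 means the edition has no page number."""
    number = int(page)
    return number if number != 0 else None


# Hypothetical two-row sample with a subset of columns (not real dataset rows).
SAMPLE = """lemma,sID,page,Portrait
vitarka,12,0,yes
vitarka,40,57,
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
for row in rows:
    row["page"] = page_or_none(row["page"])
    # Only build a URL when the Portrait column holds the value 'yes'.
    row["portrait_url"] = portrait_url(row["lemma"]) if row["Portrait"] == "yes" else None
```

Treating page = 0 as a missing value rather than a real page keeps it from being mistaken for a citation target when building references.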

Metadata file structure by column:

title = text title, sometimes accompanied by author or chapter info for disambiguation purposes
date_range = likely period of composition of a text
DateRangeStart & DateRangeEnd = the boundaries of the likely period of composition specified in the date_range column
text.type = the text type or "genre"
period = rough diachronic categorisation: foundational = ca. I-III CE; classical = ca. IV-V CE; commentarial = VI CE onward. We will likely refine the periodization in future versions.
selection = the chapters or portions of the texts we included in the dictionary sources. if blank, we used the full text.
tradition = tradition, scholastic affiliation
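
The period boundaries above can be sketched as a small Python lookup. This is only an illustration: the function names are hypothetical, the input is assumed to be a century number (CE), and the year-to-century helper assumes that dates are expressed as CE years, which this file does not state. How texts whose date range straddles two periods are assigned is not specified here, so the sketch does not attempt it.

```python
from typing import Optional


def period_for_century(century: int) -> Optional[str]:
    """Map a century (CE) to the rough period labels described above.

    foundational = ca. I-III, classical = ca. IV-V, commentarial = VI onward.
    Returns None for values outside these ranges.
    """
    if 1 <= century <= 3:
        return "foundational"
    if 4 <= century <= 5:
        return "classical"
    if century >= 6:
        return "commentarial"
    return None


def century_of_year(year: int) -> int:
    """Convert a CE year to its century (assumption: dates are given as CE years).

    Years 1-100 fall in century 1, 101-200 in century 2, and so on.
    """
    return (year - 1) // 100 + 1
```

For example, period_for_century(century_of_year(350)) yields "classical", since the year 350 falls in the fourth century.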