T-X corpus
Description
NOTE: THIS VERSION HAS BEEN DEPRECATED (BUT ZENODO DOES NOT ALLOW USERS TO "UNPUBLISH" DATA SETS). PLEASE ENSURE YOU ARE USING THE LATEST VERSION: SEE "See all X version" IN THE MENU BAR ON THE RIGHT.
This corpus includes the Taishō and Xuzangjing/Zokuzōkyō collections of Chinese Buddhist texts, as digitised by CBETA, processed so that they are ready for use with the text-analysis tool TACL or the TACL GUI.
NOTE: This corpus was modified in March 2023 to fix some problems in the way the TACL code was processing the CBETA XML. Those problems, and the corresponding fixes, are described in this document.
Files
Name | Size |
---|---|
T-X corpus.zip (md5:b0804ccd1fd52edd2b80c3a2d537a5b8) | 739.9 MB |