DING, Chenchen; UTIYAMA, Masao; SUMITA, Eiichiro
This is Khmer ALT, the Khmer part of the Asian Language Treebank (ALT) Corpus. The English texts, sampled from English Wikinews, are available under a Creative Commons Attribution 2.5 License.
Please refer to
for an introduction to the ALT project.
Khmer ALT has been developed by NICT and NIPTICT. The license of Khmer ALT is the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) License.
- data_km.km-[tok|tag].nova : tokenized/POS-tagged Khmer sentences by the nova annotation system
# based on the following two guidelines
 NICT bears no responsibility for the contents of the corpus and the lexicon and assumes no liability for any direct or indirect damage or loss whatsoever that may be incurred as a result of using the corpus or the lexicon.
 If any copyright infringement or other problems are found in the corpus or the lexicon, please contact us at alt-info[at]khn[dot]nict[dot]go[dot]jp. We will review the issue and undertake appropriate measures when needed.