Published July 30, 2023 | Version 1.0
Dataset
Open
Han-solo: Thai syllable segmenter
Creators
- School of Information Science and Technology, VISTEC, Thailand
Description
This dataset is a Thai syllable corpus for the Thai social media domain, drawn from the Wisesight Sentiment Corpus.
- Train: 794 lines
- Test: 199 lines
- Total: 993 lines
This dataset is part of the PyThaiNLP project.
Files
Total size: 226.1 kB. The files include han_solo_test.txt.
MD5 | Size |
---|---|
0015f04654bf1017b9f830650e6ff390 | 41.0 kB |
7aee890bc99565b8a7e043c64038f24e | 185.1 kB |
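The MD5 checksums in the file listing can be used to confirm that a download is intact. The following is a minimal sketch, not part of the dataset itself: the helper names `md5_of_file`, `count_lines`, and `matches_listing` are hypothetical, and the listing does not say which checksum belongs to which file, so the check only tests membership in the listed set. The line counter assumes one corpus entry per line and UTF-8 encoding, neither of which is stated on this page.

```python
import hashlib


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of the file at `path`, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def count_lines(path, encoding="utf-8"):
    """Count the lines in a text file (assumes one corpus entry per line)."""
    with open(path, encoding=encoding) as handle:
        return sum(1 for _ in handle)


# Checksums copied from the file listing; the mapping of checksum to
# file name is not given there, so they are kept as an unordered set.
LISTED_MD5 = {
    "0015f04654bf1017b9f830650e6ff390",
    "7aee890bc99565b8a7e043c64038f24e",
}


def matches_listing(path):
    """Return True if the file's MD5 digest is one of the listed checksums."""
    return md5_of_file(path) in LISTED_MD5
```

For a downloaded copy, `matches_listing("han_solo_test.txt")` should return `True` if the file is unmodified, and `count_lines` on the test file can be compared against the 199 lines reported in the description, provided the one-entry-per-line assumption holds.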