Published July 30, 2023 | Version 1.0
Dataset Open

Han-solo: Thai syllable segmenter

  • 1. School of Information Science and Technology, VISTEC, Thailand

Description

This dataset is a Thai syllable corpus for the social media domain, built from text in the Wisesight Sentiment Corpus.

  • Train: 794 lines
  • Test: 199 lines
  • Total: 993 lines

This dataset is part of the PyThaiNLP project.

Files

han_solo_test.txt

Files (226.1 kB)

Size      MD5
41.0 kB   0015f04654bf1017b9f830650e6ff390
185.1 kB  7aee890bc99565b8a7e043c64038f24e
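
The MD5 checksums listed for the two files can be used to confirm that a download arrived intact. The following is a minimal sketch, not part of the dataset itself. The helper names `md5_of_file` and `matches_listed_checksum` are hypothetical, the local path is an assumed download location, and because the listing does not say which checksum belongs to which file, the check accepts either value.

```python
import hashlib
from pathlib import Path

# Checksums copied from the file listing on this record.
# The listing does not name the file for each value, so both are accepted.
LISTED_MD5 = {
    "0015f04654bf1017b9f830650e6ff390",  # the 41.0 kB file
    "7aee890bc99565b8a7e043c64038f24e",  # the 185.1 kB file
}


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path):
    """True if the file's MD5 equals one of the checksums listed above."""
    return md5_of_file(path) in LISTED_MD5


if __name__ == "__main__":
    # Assumed location of a downloaded copy; adjust to wherever the file was saved.
    local_copy = Path("han_solo_test.txt")
    if local_copy.exists():
        status = "OK" if matches_listed_checksum(local_copy) else "MISMATCH"
        print(local_copy, status)
    else:
        print(local_copy, "not found")
```

A mismatch usually means the download was truncated or the file was modified after it was fetched, in which case downloading it again is the simplest fix.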