There is a newer version of the record available.

Published January 24, 2024 | Version v1.0
Dataset Open

Han Instruct Dataset

Description

Dataset Summary

🪿 Han (ห่าน, meaning "goose") Instruct Dataset is a Thai instruction dataset by PyThaiNLP. It collects Thai instruction-following data from many sources.

Many of the questions were collected from the Reference desk at Thai Wikipedia.

Data sources:

Supported Tasks and Leaderboards

  • ChatBot
  • Instruction Following

Languages

Thai

Dataset Structure

Data Fields

  • inputs: Question
  • targets: Answer
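
As a minimal sketch of how the two fields can be read, the following uses Python's standard `csv` module. It assumes the CSV has a header row naming the `inputs` and `targets` columns and is UTF-8 encoded. The function name `read_pairs` and the sample rows are hypothetical and are not taken from the dataset.

```python
import csv
import io


def read_pairs(handle):
    """Yield (question, answer) tuples from an open CSV text stream.

    Assumes a header row naming the `inputs` and `targets` columns.
    """
    reader = csv.DictReader(handle)
    for row in reader:
        yield row["inputs"], row["targets"]


# Hypothetical two-row sample standing in for the real file.
sample = io.StringIO(
    "inputs,targets\n"
    "คำถามตัวอย่าง 1,คำตอบตัวอย่าง 1\n"
    "คำถามตัวอย่าง 2,คำตอบตัวอย่าง 2\n"
)
pairs = list(read_pairs(sample))
```

For the real file, the same function could be given a handle opened with `open("han-instruct-dataset-v1.0.csv", encoding="utf-8", newline="")`, assuming the file has been downloaded to the working directory.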

Considerations for Using the Data

The dataset may reflect the biases of its human annotators. Review it and select or remove instructions as needed before training a model. Use it at your own risk.
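
One possible way to support such a review is a small filtering pass over the question/answer pairs. The sketch below is an assumption about how this might be done, not a procedure recommended by the dataset authors. The helper `filter_pairs`, the `blocked_terms` parameter, and the sample rows are all hypothetical, and an automated pass like this does not replace reading the data.

```python
def filter_pairs(pairs, blocked_terms=()):
    """Keep pairs that are non-empty, unique, and free of blocked terms."""
    seen = set()
    kept = []
    for question, answer in pairs:
        question, answer = question.strip(), answer.strip()
        if not question or not answer:
            continue  # drop incomplete rows
        if (question, answer) in seen:
            continue  # drop exact duplicates
        if any(term in question or term in answer for term in blocked_terms):
            continue  # drop rows containing terms flagged during manual review
        seen.add((question, answer))
        kept.append((question, answer))
    return kept


# Hypothetical rows for illustration only.
rows = [("Q1", "A1"), ("Q1", "A1"), ("Q2", ""), ("Q3", "bad A3")]
cleaned = filter_pairs(rows, blocked_terms=["bad"])
```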

Licensing Information

CC-BY-SA 4.0

Files

  • han-instruct-dataset-v1.0.csv (1.5 MB), md5:4d27c03f5b114692c9a01c2b522b4c53
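
The md5 checksum listed for the file can be used to confirm that a download is intact. The following is a minimal sketch using Python's standard `hashlib`. The helper names `md5_of_file` and `is_intact` are hypothetical, and the local path passed to them is an assumption about where the file was saved.

```python
import hashlib

# Checksum copied from the file listing of this record.
EXPECTED_MD5 = "4d27c03f5b114692c9a01c2b522b4c53"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex md5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_intact(path, expected=EXPECTED_MD5):
    """Check a downloaded file against the expected md5 digest."""
    return md5_of_file(path) == expected
```

For example, `is_intact("han-instruct-dataset-v1.0.csv")` would return `True` only if the local copy matches the published checksum.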