Published August 15, 2025 | Version v4
Dataset Open

KPoEM dataset

Description

KPoEM (Korean Poetry Emotion Mapping) Dataset

We collected 483 poems from five prominent modern Korean poets: Kim So-wol (165 poems), Yun Dong-ju (113 poems), Im Hwa (44 poems), Yi Sang (47 poems), and Han Yong-un (118 poems). Their major works were scraped from public domain sources, including "Azaleas" (Korean: 진달래꽃, Jindallaekkot), "Sky, Wind, Stars, and Poetry" (Korean: 하늘과 바람과 별과 시, Haneulgwa baramgwa byeolgwa si), "Hyunhaetan" (Korean: 현해탄, Hyeonhaetan; Hanja: 玄海灘), and "Silence of My Beloved" (Korean: 님의 침묵, Nimui chimmuk). For Yi Sang, the dataset focuses on his Korean-language poem series, such as "Crow's Eye View" (Korean: 오감도, Ogamdo; Hanja: 烏瞰圖), "Reverse" (Korean: 역단, Yeokdan; Hanja: 易斷), and "Critical Condition" (Korean: 위독, Widok; Hanja: 危篤).

As a result, the dataset contains a total of 7,007 emotion-annotated line-level texts.

Revision: In v4, the names of some columns that were originally in Korean were changed to English.
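Because the column names changed in v4, it can help to inspect the header of a downloaded file before relying on any particular column. The following is a minimal sketch, assuming the data files are tab-separated with a header row. The function name `summarize_tsv` and the sample columns (`poet`, `line`, `label`) are hypothetical placeholders, not the dataset's actual schema.

```python
import csv
import io


def summarize_tsv(tsv_file):
    """Return (column_names, row_count) for a tab-separated file object."""
    reader = csv.DictReader(tsv_file, delimiter="\t")
    row_count = sum(1 for _ in reader)  # counts data rows, excluding the header
    return reader.fieldnames, row_count


# Hypothetical two-row sample; the real column names and values may differ.
sample = io.StringIO(
    "poet\tline\tlabel\n"
    "POET_A\tfirst example line\tLABEL_X\n"
    "POET_B\tsecond example line\tLABEL_Y\n"
)
columns, n_rows = summarize_tsv(sample)
print(columns, n_rows)
```

For a real file, pass an object opened with `open(path, encoding="utf-8", newline="")` in place of the in-memory sample.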

  • Please refer to the following presentation: Lim, I., Ji, H., & Kim, B. (2025). 한국 근현대시 감정 라벨링 데이터셋 구축: 문학 텍스트의 컴퓨터 기반 감정 분류와 생성형 AI 활용을 위한 기초 연구 [Building an emotion-labeling dataset of modern and contemporary Korean poetry: A foundational study for computer-based emotion classification of literary texts and the use of generative AI]. 제2회 한국현대문학자대회 (KorLitConf), Seoul, Korea. Zenodo. https://doi.org/10.5281/zenodo.15055795
  • All source code used in this study is publicly available in the following repository: https://github.com/AKS-DHLAB/KPoEM

Files

Files (3.0 MB)

Checksum Size
md5:93f3cf7f8ff8e42393aad3250af00430 2.4 MB
md5:d4ddfafbf7fa5e3c309f820376776a6e 638.4 kB
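The MD5 checksums listed for the two files can be used to confirm that a download is intact. The following is a minimal sketch using Python's standard `hashlib` module. The helper names `md5_of_file` and `matches_checksum` are illustrative, and the local file path is whatever name the file was saved under (the record text here does not show the file names).

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file's MD5 with a value such as 'md5:93f3...' or a bare hex string."""
    expected_hex = expected.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected_hex
```

For example, `matches_checksum(local_path, "md5:93f3cf7f8ff8e42393aad3250af00430")` should return `True` for an uncorrupted copy of the 2.4 MB file, where `local_path` is the hypothetical location of the saved download.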

Additional details

Related works

Continues
Presentation: 10.5281/zenodo.15055795 (DOI)

Funding

Academy of Korean Studies
Development of Advanced Natural Language Processing and Large Language Model-Based Digital Korean Studies and Education Methodology AKSR2025-RE04

Dates

Issued
2025-04-09
The date of v1 release

Software

Repository URL
https://github.com/AKS-DHLAB/KPoEM
Programming language
TSV, Python
Development Status
Active