There is a newer version of the record available.

Published April 9, 2025 | Version v1
Dataset Open

KPoEM dataset for "Decoding the Poetic Language of Emotion in Korean Modern Poetry: Insights from a Human-Labeled Dataset and AI Modeling"

Description

KPoEM (Korean Poetry Emotion Mapping) Dataset

 

We collected 487 poems from five prominent modern Korean poets: Kim So-wol (165 poems), Yun Dong-ju (113 poems), Im Hwa (44 poems), Yi Sang (47 poems), and Han Yong-un (118 poems). We scraped their major works from public-domain sources, including "Azaleas" (Korean: 진달래꽃, Jindallaekkot), "Sky, Wind, Stars, and Poetry" (Korean: 하늘과 바람과 별과 시, Haneulgwa baramgwa byeolgwa si), "Hyunhaetan" (Korean: 현해탄, Hyeonhaetan; Hanja: 玄海灘), and "Silence of My Beloved" (Korean: 님의 침묵, Nimui chimmuk). In the case of Yi Sang, the dataset focuses on his Korean-language poem series, such as "Crow's Eye View" (Korean: 오감도, Ogamdo; Hanja: 烏瞰圖), "Reverse" (Korean: 역단, Yeokdan; Hanja: 易斷), and "Critical Condition" (Korean: 위독, Widok; Hanja: 危篤).

In total, the dataset contains 7,008 line-level texts annotated with emotion labels.
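As a rough sketch of how the line-level records might be read: this record lists TSV as the format, so the example below assumes a UTF-8, tab-separated file with a header row. The file name `kpoem.tsv` and any column names are hypothetical, since neither is stated on this page.

```python
import csv
from pathlib import Path


def load_tsv_rows(path):
    """Read a tab-separated file with a header row into a list of dicts.

    QUOTE_NONE keeps quotation marks inside poem lines as literal text
    instead of treating them as CSV quoting.
    """
    with Path(path).open(encoding="utf-8", newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        return list(reader)


# Example usage (the file name is an assumption, not taken from the record):
# rows = load_tsv_rows("kpoem.tsv")
# print(len(rows))      # number of annotated lines found in the file
# print(rows[0].keys()) # inspect the actual column names
```

Inspecting the header first is the safer route here, because the annotation schema is documented in the linked repository rather than on this page.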

 

  • Please refer to the following presentation: LIM, I., Ji, H., & Kim, B. (2025). 한국 근현대시 감정 라벨링 데이터셋 구축: 문학 텍스트의 컴퓨터 기반 감정 분류와 생성형 AI 활용을 위한 기초 연구 [Building an Emotion-Labeling Dataset for Modern and Contemporary Korean Poetry: A Foundational Study for Computer-Based Emotion Classification of Literary Texts and the Use of Generative AI]. 제2회 한국현대문학자대회 (KorLitConf), Seoul, Korea. Zenodo. https://doi.org/10.5281/zenodo.15055795
  • All source code used in this study is publicly available in the following repository: https://github.com/AKS-DHLAB/KPoEM

Files

Files (1.7 MB)

md5:9be9fef58aa34e00d72e9aac6028e4d1
Size: 1.7 MB
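The MD5 checksum listed above can be used to confirm that a downloaded copy is intact. The following is a minimal sketch using Python's standard `hashlib`; the local file path is an assumption, since the file name is not shown on this page, and the helper names are illustrative.

```python
import hashlib

# Checksum as listed on this record (without the "md5:" prefix).
EXPECTED_MD5 = "9be9fef58aa34e00d72e9aac6028e4d1"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_MD5):
    """Check a local file against the expected checksum."""
    return md5_of_file(path) == expected.lower()


# Example usage (replace the path with wherever the download was saved):
# print(matches_expected("path/to/downloaded_file"))
```

MD5 is adequate here for detecting a corrupted or truncated download; it is not a safeguard against deliberate tampering.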

Additional details

Related works

Continues
Presentation: 10.5281/zenodo.15055795 (DOI)

Funding

Academy of Korean Studies
Development of Advanced Natural Language Processing and Large Language Model-Based Digital Korean Studies and Education Methodology (AKSR2025-RE04)

Dates

Issued
2025-04-09
The date of v1 release

Software

Programming language
TSV
Development Status
Active