childTale-A: A corpus of eighty fairy tales from the 7th edition by the Brothers Grimm, manually annotated for textually encoded emotions
Creators
- 1. Freie Universität Berlin
- 2. University of Bielefeld
Description
The childTale-A corpus is a collection of eighty fairy tales, a core set of the Grimms’ Children’s and Household Tales as introduced in Herrmann & Lüdtke (2023). Within the CHYLSA project, textually encoded emotions were annotated for every sentence of the eighty fairy tales. Annotations were collected for the dimensions valence and arousal, as well as for the six basic emotions anger, disgust, fear, joy, sadness, and surprise. Each fairy tale was annotated by two persons (for details see Herrmann & Lüdtke, 2023).
In detail, this dataset contains:
- instructions and texts used for annotation (zip-files):
- all N=80 fairy tales as txt-files (with normalised orthography)
- instructions (in German) for the valence and arousal annotation as well as for the annotation of the six basic emotions
- scripts for preparing and analysing annotations for textually encoded emotions:
- Python scripts to prepare Excel files for annotation (including a script to separate texts into individual sentences)
- R scripts for data preparation, calculation of inter-rater reliability, and smoothing of the valence annotations (discrete cosine transform (DCT) with length normalisation)
- sentence-level data (for each sentence in each of the eighty fairy tales):
- annotations for the dimensions valence and arousal (continuous values)
- categorisation of each sentence (as negative, neutral or positive) based on the continuous valence annotation
- annotations on the occurrence of the six basic emotions anger, disgust, fear, joy, sadness, and surprise
- transformed and length-normalised valence annotations as the basis for the Emotional Arcs (DCT and length normalisation result in one hundred data points for each fairy tale, saved in a separate data file)
- text-level data (for each of the eighty fairy tales):
- general information, for example title in German and English, number of sentences, Kinder- und Hausmärchen-ID (KHM-ID), corpus ID
- results of the analysis of the annotated data with values for:
- Average Valence and Average Arousal
- inter-rater reliability index (Krippendorff's alpha coefficient) for the valence and arousal annotations
- proportions of positive, negative and neutral sentences
- Emotion Potential (combined percentage of positive and negative sentences)
- Valence Span, Arousal Span and range of the Emotional Arc
- Emotion Profile (relative frequency for each of the six basic emotions anger, disgust, fear, joy, sadness, and surprise)
- inter-rater reliability indices (Krippendorff's alpha coefficient and the percentage of agreement) for each of the six basic emotions
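As an illustration of how the text-level values listed above relate to the sentence-level data, the following minimal Python sketch derives the sentence proportions, the Emotion Potential, and the Average Valence for one tale. The function name `summarise_tale`, the label strings, and the definition of the Valence Span as maximum minus minimum are assumptions made for this example; the R scripts in the dataset remain the authoritative implementation.

```python
from collections import Counter


def summarise_tale(labels, valences):
    """Summarise one tale from its sentence-level annotations.

    labels   -- one category per sentence: 'negative', 'neutral' or 'positive'
    valences -- one continuous valence value per sentence
    """
    if not labels or len(labels) != len(valences):
        raise ValueError("need one label and one valence value per sentence")

    n = len(labels)
    counts = Counter(labels)
    # Percentage of sentences in each category.
    proportions = {
        category: 100.0 * counts[category] / n
        for category in ("negative", "neutral", "positive")
    }
    return {
        "proportions": proportions,
        # Emotion Potential: percentage of positive plus negative sentences.
        "emotion_potential": proportions["negative"] + proportions["positive"],
        "average_valence": sum(valences) / n,
        # Assumption: span taken as the distance between the extremes.
        "valence_span": max(valences) - min(valences),
    }


if __name__ == "__main__":
    # Toy input, not taken from the corpus.
    print(summarise_tale(
        ["positive", "negative", "neutral", "neutral"],
        [1.0, -2.0, 0.0, 0.0],
    ))
```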
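The DCT smoothing with length normalisation mentioned in the list above can be sketched in plain Python as follows. This is one common way to build such an arc, not necessarily the procedure used in the dataset's R scripts: the function name `dct_arc` and the number of retained coefficients (`keep=5`) are assumptions chosen for illustration. Only the target length of one hundred data points per tale comes from the description.

```python
import math


def dct_arc(values, keep=5, n_points=100):
    """Low-pass filter a valence series with a DCT and resample it.

    values   -- sentence-level valence values of one tale
    keep     -- number of low-frequency DCT coefficients retained (assumed)
    n_points -- length of the normalised arc (100 in the dataset)
    """
    n = len(values)
    if n == 0:
        raise ValueError("values must not be empty")
    keep = min(keep, n)

    # Unnormalised DCT-II, restricted to the first `keep` coefficients.
    coeffs = [
        sum(x * math.cos(math.pi / n * (i + 0.5) * k) for i, x in enumerate(values))
        for k in range(keep)
    ]

    # Evaluate the truncated cosine series on an evenly spaced grid of
    # n_points positions; this performs smoothing and length
    # normalisation in a single step.
    arc = []
    for m in range(n_points):
        t = (m + 0.5) / n_points
        y = coeffs[0] / n + (2.0 / n) * sum(
            coeffs[k] * math.cos(math.pi * k * t) for k in range(1, keep)
        )
        arc.append(y)
    return arc


if __name__ == "__main__":
    # Toy valence series, not taken from the corpus.
    arc = dct_arc([0.2, -0.5, -1.0, -0.3, 0.4, 1.2, 0.9])
    print(len(arc), max(arc) - min(arc))  # arc length and arc range
```

Retaining only a few coefficients keeps the slow rise and fall of valence across a tale and discards sentence-to-sentence fluctuation, while the fixed grid makes tales of different lengths directly comparable.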
Reference:
Herrmann, Berenike & Lüdtke, Jana (2023). A Fairy Tale Gold Standard. Annotation and Analysis of Emotions in the Children's and Household Tales by the Brothers Grimm. Zeitschrift für digitale Geisteswissenschaften (ZfdG). DOI: 10.17175/2023_005.
DFG Priority Programme (Schwerpunktprogramm) SPP 2207 “Computational Literary Studies”
Online:
Subproject: “CHYLSA - Children’s and Youth Literature Sentiment Analysis”
Online:
Files
ChildTale-A_FairyTales_txt_files.zip
(2.1 MB)
Checksum | Size |
---|---|
md5:ee54d04b5c2782381ccc98e233ba92d9 | 364.3 kB |
md5:a43776c9641f4cf4eac990d3fbd375ec | 1.1 MB |
md5:aaed19c30955a5ab5e88683ea9e71cf5 | 45.2 kB |
md5:3b743d9a13b5ba5bc16e6c83c5d11d70 | 371.1 kB |
md5:e7b314735f952ff2671095a974537448 | 147.7 kB |
md5:1ebf5bc08f472da1780b8b319fed16c1 | 3.4 kB |
md5:f86c735a7d94c90d8f20987c75b89f36 | 7.8 kB |
md5:00b131249c456132eba0dc3d59e23bf1 | 22.6 kB |
md5:4f00a108c95ce31b437ee6c4db0be25c | 16.6 kB |
md5:b77a7b514e83c14f845ebf9b2946a600 | 3.5 kB |
md5:b77a7b514e83c14f845ebf9b2946a600 | 3.5 kB |
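The md5 values in the file listing can be used to check that a download is intact. A minimal sketch using Python's standard `hashlib` module is given below; the helper names `md5_of_file` and `matches_listing` are hypothetical, and the local path and the checksum to compare against have to be supplied by the reader, since the listing does not pair file names with checksums.

```python
import hashlib


def md5_of_file(path, chunk_size=65536):
    """Return the hexadecimal MD5 digest of the file at `path`."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so that large archives need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, listed):
    """Compare a file against a listed value such as 'md5:ee54...'."""
    expected = listed.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected


# Example call (path and checksum are placeholders to be filled in):
# matches_listing("downloads/some_file.zip", "md5:<checksum from the listing>")
```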