Published May 25, 2023 | Version 1.0
Dataset Open

childTale-A: A corpus of eighty fairy tales from the 7th edition by the Brothers Grimm, manually annotated for textually encoded emotions

  • 1. Freie Universität Berlin
  • 2. University of Bielefeld

Description

The childTale-A corpus is a collection of eighty fairy tales, a core set of the Grimms’ Children's and Household Tales, as introduced in Herrmann & Lüdtke (2023). Within the CHYLSA project, textually encoded emotions were annotated for every sentence of each of the eighty fairy tales. Annotations were collected for the dimensions valence and arousal, as well as for the six basic emotions anger, disgust, fear, joy, sadness, and surprise. Each fairy tale was annotated by two persons (for details, see Herrmann & Lüdtke, 2023).
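
Since each tale was annotated by two persons, the agreement between the two annotation series can be quantified. The following is a minimal Python sketch, not the project's own R scripts: it computes the percentage of agreement and Krippendorff's alpha for nominal labels from two raters without missing values. The function names and the toy labels are assumptions made for illustration. Continuous valence and arousal ratings would need the interval-metric variant of alpha, which is not shown here.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Percentage of sentences on which both raters chose the same label."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters need the same, non-zero number of labels")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return 100.0 * matches / len(rater_a)


def krippendorff_alpha_nominal(rater_a, rater_b):
    """Krippendorff's alpha for two raters, nominal labels, no missing values."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters need the same, non-zero number of labels")
    n = 2 * len(rater_a)  # total number of labels assigned by both raters
    totals = Counter(rater_a) + Counter(rater_b)  # how often each label was used
    # Each disagreeing sentence adds two off-diagonal entries to the
    # coincidence matrix (one per ordering of the two labels).
    observed = 2 * sum(a != b for a, b in zip(rater_a, rater_b))
    # Off-diagonal products of the label totals (expected disagreement, unscaled).
    expected = sum(totals[c] * totals[k] for c in totals for k in totals if c != k)
    if expected == 0:
        return float("nan")  # only one label was ever used: alpha is undefined
    return 1.0 - (n - 1) * observed / expected


# Toy labels (1 = emotion present, 0 = absent); not taken from the corpus.
rater_a = [0, 1, 0, 0, 0, 0, 0, 0, 1, 0]
rater_b = [1, 1, 1, 0, 0, 1, 0, 0, 0, 0]
agreement = percent_agreement(rater_a, rater_b)
alpha = krippendorff_alpha_nominal(rater_a, rater_b)
print(f"agreement: {agreement:.1f} %, alpha: {alpha:.3f}")
```

The toy data show why both indices are worth reporting: the raw agreement looks moderate, while alpha, which corrects for agreement expected by chance, is close to zero.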

In detail, this dataset contains:

  • instructions and texts used for annotation (zip-files):
    • all N=80 fairy tales as txt-files (with normalised orthography)
    • instructions (in German) for the valence and arousal annotation as well as for the annotation of the six basic emotions
  • scripts for preparing and analysing annotations for textually encoded emotions:
    • Python scripts to prepare Excel files for annotation (including a script to separate texts into individual sentences)
    • R scripts for data preparation, calculation of inter-rater reliability, and smoothing of the valence annotations (discrete cosine transform (DCT) with length normalisation)
  • sentence-level data (for each sentence in each of the eighty fairy tales):
    • annotations for the dimensions valence and arousal (continuous values)
    • categorisation of each sentence (as negative, neutral or positive) based on the continuous valence annotation
    • annotations on the occurrence of the six basic emotions anger, disgust, fear, joy, sadness, and surprise
  • transformed and length-normalised valence annotations as basis for the Emotional Arcs (DCT and length normalisation results in one hundred data points for each fairy tale, saved in a separate data file)
  • text-level data (for each of the eighty fairy tales):
    • general information, for example title in German and English, number of sentences, Kinder- und Hausmärchen-ID (KHM-ID), corpus ID
    • results of the analysis of the annotated data with values for:
      • Average Valence and Average Arousal
      • inter-rater reliability index (Krippendorff's alpha coefficient) for the valence and arousal annotations
      • proportions of positive, negative and neutral sentences
      • Emotion Potential (combined percentage of positive and negative sentences)
      • Valence Span, Arousal Span and range of the Emotional Arc
      • Emotion Profile (relative frequency for each of the six basic emotions anger, disgust, fear, joy, sadness, and surprise)
      • inter-rater reliability indices (Krippendorff's alpha coefficient and the percentage of agreement) for each of the six basic emotions
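
The DCT smoothing with length normalisation listed above can be sketched as follows. This is a hedged Python illustration of the general technique, not a port of the R scripts in this dataset: the per-sentence valence series is transformed with a DCT-II, only the lowest-frequency coefficients are kept, and the inverse transform is evaluated at one hundred evenly spaced relative positions. The function name `dct_smooth`, the default of five retained coefficients, the valence values, and the max-minus-min definition of the arc range are all assumptions.

```python
import math


def dct_smooth(values, keep=5, n_points=100):
    """Low-pass filter a series with a DCT-II and resample it to n_points values."""
    n = len(values)
    if n == 0:
        raise ValueError("values must not be empty")
    keep = min(keep, n)  # a series cannot have more coefficients than values
    # Forward DCT-II, computed only for the `keep` lowest frequencies.
    coeffs = [
        sum(v * math.cos(math.pi * (i + 0.5) * k / n) for i, v in enumerate(values))
        for k in range(keep)
    ]
    # Inverse transform, evaluated at relative positions t in (0, 1), so that
    # texts of any length are mapped onto the same number of data points.
    smoothed = []
    for m in range(n_points):
        t = (m + 0.5) / n_points
        y = coeffs[0] + 2 * sum(
            coeffs[k] * math.cos(math.pi * k * t) for k in range(1, keep)
        )
        smoothed.append(y / n)
    return smoothed


# Made-up valence annotations for a very short text (one value per sentence).
valence = [0.5, 1.0, -0.5, -2.0, -1.0, 0.0, 1.5, 2.0]
arc = dct_smooth(valence)
arc_range = max(arc) - min(arc)  # assumed definition of the range of the arc
print(len(arc), round(arc_range, 3))
```

Because the inverse transform is divided by the original series length, the smoothed arc stays on the same scale as the input annotations, and keeping all coefficients while resampling to the original length reproduces the input.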

Reference: 
Herrmann, Berenike & Lüdtke, Jana (2023). A Fairy Tale Gold Standard. Annotation and Analysis of Emotions in the Children's and Household Tales by the Brothers Grimm. Zeitschrift für digitale Geisteswissenschaften (ZfdG). DOI: 10.17175/2023_005.

 

DFG Priority Programme SPP 2207 “Computational Literary Studies”
Online:

  1. https://gepris.dfg.de/gepris/projekt/402743989
  2. https://dfg-spp-cls.github.io/

Subproject: “CHYLSA - Children’s and Youth Literature Sentiment Analysis”

Online:

  1. https://gepris.dfg.de/gepris/projekt/424250469
  2. https://dfg-spp-cls.github.io/projects_en/2020/01/24/TP-CHYLSA/

Files

ChildTale-A_FairyTales_txt_files.zip

Files (2.1 MB)

Checksum Size
md5:ee54d04b5c2782381ccc98e233ba92d9 364.3 kB
md5:a43776c9641f4cf4eac990d3fbd375ec 1.1 MB
md5:aaed19c30955a5ab5e88683ea9e71cf5 45.2 kB
md5:3b743d9a13b5ba5bc16e6c83c5d11d70 371.1 kB
md5:e7b314735f952ff2671095a974537448 147.7 kB
md5:1ebf5bc08f472da1780b8b319fed16c1 3.4 kB
md5:f86c735a7d94c90d8f20987c75b89f36 7.8 kB
md5:00b131249c456132eba0dc3d59e23bf1 22.6 kB
md5:4f00a108c95ce31b437ee6c4db0be25c 16.6 kB
md5:b77a7b514e83c14f845ebf9b2946a600 3.5 kB
md5:b77a7b514e83c14f845ebf9b2946a600 3.5 kB
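
The md5 values listed for the files can be used to check that a download is intact. A minimal Python sketch using only the standard library follows. The function names are assumptions, and because the listing does not pair checksums with file names, the local path and the expected value have to be supplied by the user.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal md5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def md5_matches(path, expected):
    """Compare a file's md5 digest with a listed value such as 'md5:ee54...'."""
    # Accept the value with or without the 'md5:' prefix used in the listing.
    expected_hex = expected.strip().lower().split(":", 1)[-1]
    return md5_of_file(path) == expected_hex


# Hypothetical usage, with a path and checksum chosen by the user:
# md5_matches("downloads/some_file.zip", "md5:<checksum from the listing>")
```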