README - Poem-level Metadata of the ChildPoeDE Corpus (Lehmann, Heumann, Kuijpers, Lauer & Lüdtke, 2023)

Poem_Id*
The poem's unique identifier in childPoeDE. All poem IDs start with "p_" followed by a consecutive number, e.g. p_00001.

Title_Txt_File*
The name of the poem's txt file, excluding the ID. It is identical to the poem's title. If a poem has no title, the file name consists of the first few words of the poem. For poems with identical titles, the first letter of the author's surname is appended, e.g. Abendlied_B.txt and Abendlied_C.txt. File names do not contain ä, ö or ü.

Title_Poem*
The title of the poem. "Kein Titel" for poems without a title.

Author***
First and last name of the poem's author.

GND_Id**
The author's unique identifier in the Integrated Authority File (GND). More information on the GND: https://www.dnb.de/EN/Professionell/Standardisierung/GND/gnd_node.html. If the author has no entry in the GND, or if an unambiguous ID could not be assigned, the value is "-".

Gender**
The author's gender as specified in the GND.
m: male
f: female
u: unknown

Birth_Year**
The author's year of birth as specified in the GND.

Death_Year**
The author's year of death as specified in the GND.

Anthology***
The name of the anthology from which the poem was taken.

Anthology_Count***
Number of anthologies in which the poem appears (out of the seven anthologies we used for childPoeDE).

Publisher_Anthology***
The anthology's publisher.

Publication_Year_Anthology***
The year the anthology was published.

ISBN_Anthology***
The ISBN of the anthology.

Has_Title*
Data on whether the poem has a title or not.
0: no title
1: with title

Special_Layout*
Data on whether the poem includes at least one tab character. Tab characters can be used as a proxy for measuring deviations from the standard poem layout (all lines left-aligned).
0: no tab characters
1: at least one tab character

Has_Punct*
Data on whether one or more of the following punctuation marks appear in the poem: \.,;:!\?–\-\*\(\)\[\]\{\}·…„“
0: no punctuation marks
1: at least one punctuation mark from the list

Has_Uppercase*
Data on whether the poem contains uppercase letters.
0: no uppercase letters
1: at least one uppercase letter

Has_Lowercase*
Data on whether the poem contains lowercase letters.
0: no lowercase letters
1: at least one lowercase letter

Has_Titlecase*
Data on whether the poem contains words in title case.
0: no words in title case
1: at least one word in title case

Has_Sentence_Like_Structure*
Data on whether the poem is (most likely) structured in sentences. Based on the assumption that punctuation marks combined with lowercase letters, uppercase letters and words in title case indicate that the poem could be structured in sentences.
0: one or more of the variables Has_Punct, Has_Uppercase, Has_Lowercase and Has_Titlecase is 0.
1: all of the variables Has_Punct, Has_Uppercase, Has_Lowercase and Has_Titlecase are 1.

Poem_Length_Stanzas*
The length of the poem measured in the number of stanzas.

Poem_Length_Lines*
The length of the poem measured in the number of lines.

Poem_Length_Words*
The length of the poem measured in the number of words.

Onomatopoeia***
Data on whether the poem includes onomatopoeia.
0: no onomatopoeic word
1: at least one onomatopoeic word

Average_Poem_Sonority*
Average of the sonority scores for each word in the poem (cf. Jacobs, 2017; Stenneken et al., 2005).
Maximum: 7, Minimum: 1

Nr_Rhyming_Lines*
The number of rhyming lines in the poem, determined with rhymetagger (https://github.com/versotym/rhymetagger).

Rhyming_Degree*
The number of rhyming lines in relation to the total number of lines in the poem. Determined from Rhyme_Structure.

Rhyme*
Data on whether the poem contains rhyming lines.
0: no rhyming lines
1: at least one rhyming line

Rhyme_Structure*
Data on which lines of the poem rhyme, determined with rhymetagger. A list of numbers or NONE in which each element represents one line of the poem. NONE indicates that the line rhymes with none of the other lines in the poem. Lines with the same number rhyme with each other.

Lex_Den*
The poem's lexical density.

* Data created with "poemtool.py" or other Python scripts
** Data from the Integrated Authority File (GND)
*** Manually added data

Format: csv, delimiter: |
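
The following is a minimal sketch of how the pipe-delimited file could be read and how two of the derived variables relate to their inputs. It is not taken from "poemtool.py". The sample rows are made up, the function names are hypothetical, and two points are assumptions: that Rhyme_Structure can be represented as a Python list with None for NONE, and that Rhyming_Degree is the share of lines that carry a rhyme number.

```python
import csv
import io

# The four variables that Has_Sentence_Like_Structure is derived from.
FLAG_COLUMNS = ("Has_Punct", "Has_Uppercase", "Has_Lowercase", "Has_Titlecase")


def read_metadata(handle):
    """Read pipe-delimited metadata rows into a list of dicts."""
    return list(csv.DictReader(handle, delimiter="|"))


def sentence_like(row):
    """Return 1 only if all four flag columns are 1, otherwise 0."""
    return int(all(row[col] == "1" for col in FLAG_COLUMNS))


def rhyming_degree(rhyme_structure):
    """Share of lines that carry a rhyme number (assumed reading of Rhyming_Degree)."""
    if not rhyme_structure:
        return 0.0
    rhyming = sum(1 for label in rhyme_structure if label is not None)
    return rhyming / len(rhyme_structure)


# Two made-up rows for illustration only; they are not taken from the corpus.
SAMPLE = (
    "Poem_Id|Has_Punct|Has_Uppercase|Has_Lowercase|Has_Titlecase\n"
    "p_00001|1|1|1|1\n"
    "p_00002|0|1|1|1\n"
)

rows = read_metadata(io.StringIO(SAMPLE))
flags = {row["Poem_Id"]: sentence_like(row) for row in rows}
print(flags)  # {'p_00001': 1, 'p_00002': 0}

# A four-line poem in which lines 1 and 2 rhyme and lines 3 and 4 rhyme with nothing.
print(rhyming_degree([1, 1, None, None]))  # 0.5
```

To read the actual file, pass an open file handle to read_metadata instead of the in-memory sample. Opening it with newline="" and an explicit encoding is advisable; which encoding the file uses is not stated here.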