README - TTR Data of the ChildPoeDE Corpus (Lehmann, Heumann, Kuijpers, Lauer & Lüdtke, 2023)

All TTR and MATTR values were calculated in R using the quanteda package (http://quanteda.io/). Poem titles were included if present.

doc_id
    ID combining poem ID and file name.

types_per_doc
    Number of types in the document/poem.

token_per_doc
    Number of tokens in the document/poem.

TTR
    Type-Token Ratio, calculated by dividing the number of types by the total number of tokens (https://quanteda.io/reference/textstat_lexdiv.html).

MATTR_w8
    Moving-Average Type-Token Ratio with a window of 8. A window of 8 is the smallest possible unit for which a MATTR can be computed for all poems, since the shortest poem is 9 tokens long. The Moving-Average Type-Token Ratio (Covington & McFall, 2010) moves a window of fixed size from the first to the last token and computes a TTR for each window. The MATTR is the mean of the TTRs of all windows (https://quanteda.io/reference/textstat_lexdiv.html).

MATTR_w15
    Moving-Average Type-Token Ratio with a window of 15 (25% quartile of stanza length). Cannot be calculated for all poems. Value is NA if not applicable.

MATTR_w21
    Moving-Average Type-Token Ratio with a window of 21 (median of stanza length). Cannot be calculated for all poems. Value is NA if not applicable.

MATTR_w25
    Moving-Average Type-Token Ratio with a window of 25 (arithmetic mean of stanza length). Cannot be calculated for all poems. Value is NA if not applicable.

MATTR_w29
    Moving-Average Type-Token Ratio with a window of 29 (75% quartile of stanza length). Cannot be calculated for all poems. Value is NA if not applicable.

Note: The correlation between TTR and MATTR increases with increasing window size (from r = 0.52 to r = 0.68).
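To make the TTR and MATTR definitions above concrete, the following is a minimal Python sketch of both measures. It is not the code used to produce this dataset (the values were computed in R with quanteda) and it does not reproduce quanteda's tokenization. The function names ttr and mattr and the toy token list are hypothetical and chosen only for illustration. Returning None when a text is shorter than the window is an assumption that mirrors the NA values described above.

```python
from collections import Counter


def ttr(tokens):
    """Type-Token Ratio: number of distinct tokens divided by number of tokens."""
    if not tokens:
        raise ValueError("tokens must not be empty")
    return len(set(tokens)) / len(tokens)


def mattr(tokens, window):
    """Moving-Average TTR: mean of the TTRs of all windows of a fixed size.

    Returns None if the text is shorter than the window (assumption:
    this stands in for the NA values in the data files).
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    n = len(tokens)
    if n < window:
        return None

    # Type counts for the first window.
    counts = Counter(tokens[:window])
    type_sum = len(counts)

    # Slide the window one token at a time, updating the counts incrementally.
    for i in range(window, n):
        leaving = tokens[i - window]
        counts[leaving] -= 1
        if counts[leaving] == 0:
            del counts[leaving]
        counts[tokens[i]] += 1
        type_sum += len(counts)

    n_windows = n - window + 1
    # Each window's TTR is types / window; average over all windows.
    return type_sum / (window * n_windows)


# Toy example (hypothetical tokens, not taken from the corpus).
example = ["a", "b", "a", "c", "b", "a"]
print(ttr(example))       # 3 types / 6 tokens
print(mattr(example, 3))  # mean TTR over the 4 windows of size 3
print(mattr(example, 7))  # None: text shorter than the window
```

When the window equals the text length there is exactly one window, so the MATTR equals the plain TTR. This is consistent with the note above that the two measures correlate more strongly as the window grows.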