Relationship Between Poetic Meter and Meaning in Accentual-Syllabic Verse (data and replication code)

doi:10.5281/zenodo.4926549

Published June 11, 2021 | Version v1.0

Dataset Open

1. Institute of Polish Language, Krakow
2. Institute of Czech Literature, Prague
3. Leiden University

main.py: script that trains both the LDA and the word2vec models
main.ipynb: Jupyter Notebook containing all the analyses reported in the paper
pos.ipynb: clustering based on part-of-speech frequencies
corpora: contains the original data for Czech, English, and Dutch poetry in JSON (the proprietary German and Russian corpora are not included)

{   <= Each item in the following lists corresponds to a particular poem and holds:
    'words':    []      <= list of lemmata found in the poem
    'pos_tags': []      <= their POS tags (Positional Morphological Tags for Czech,
                           MyStem for Russian, TreeTagger tagsets for the other corpora)
    'meters':   [[]]    <= list of meters found in the poem
    'years':    []      <= year the poem was published (for English, the author's year of birth)
    'n_words':  []      <= number of words
    'n_lines':  []      <= number of lines
    'authors':  []      <= author of the poem
    'titles':   []      <= title of the poem
    'schemes':  []      <= line-ending schemes
}
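Because the fields are stored as parallel lists, a per-poem view requires zipping the lists back together. The following is a minimal sketch, assuming a corpus file decodes to a dict with the keys listed above and that all lists have the same length. The helper name `iter_poems` and the toy values (including the meter and scheme labels) are illustrative and not taken from the archive.

```python
# Field names follow the schema above; everything else is hypothetical.
FIELDS = ("words", "pos_tags", "meters", "years", "n_words",
          "n_lines", "authors", "titles", "schemes")


def iter_poems(corpus):
    """Yield one dict per poem from a column-oriented corpus dict."""
    present = [f for f in FIELDS if f in corpus]
    lengths = {len(corpus[f]) for f in present}
    if len(lengths) > 1:
        raise ValueError("field lists differ in length")
    n = lengths.pop() if lengths else 0
    for i in range(n):
        # Keys keep their original (plural) names, e.g. poem["titles"].
        yield {f: corpus[f][i] for f in present}


# Invented toy data standing in for a decoded corpus file.
toy = {
    "words": [["rose", "red"], ["sea", "wave", "sea"]],
    "pos_tags": [["NN", "JJ"], ["NN", "NN", "NN"]],
    "meters": [["iamb"], ["trochee"]],
    "years": [1850, 1872],
    "n_words": [2, 3],
    "n_lines": [1, 1],
    "authors": ["Author A", "Author B"],
    "titles": ["First", "Second"],
    "schemes": ["m", "f"],
}

for poem in iter_poems(toy):
    print(poem["titles"], poem["years"], poem["n_words"])
```

For a real file, the dict would come from `json.load` on one of the files in the corpora folder; the exact file names are not listed in this description.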

dicts: contains Gensim dictionary files for all 5 corpora

fig: contains all resulting figures

json > metadata: contains all metadata on the poems in each corpus

{   <= Each item in the following lists corresponds to a particular poem and holds:
    'meters':  [[]]     <= list of meters found in the poem
    'years':   []       <= year the poem was published (for English, the author's year of birth)
    'n_words': []       <= number of words
    'n_lines': []       <= number of lines
    'authors': []       <= author of the poem
    'titles':  []       <= title of the poem
}
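Since 'meters' holds one list per poem, a quick overview of a corpus can be obtained by counting how many poems each meter occurs in. This is a rough sketch, assuming the meter entries are hashable labels such as strings; the labels in the toy data and the function name `meter_counts` are invented for illustration.

```python
from collections import Counter


def meter_counts(metadata):
    """Count the number of poems in which each meter label occurs."""
    counts = Counter()
    for poem_meters in metadata["meters"]:
        # set() ensures a meter is counted once per poem,
        # even if it is listed several times for that poem.
        counts.update(set(poem_meters))
    return counts


# Invented toy metadata; real labels may look different.
toy_meta = {
    "meters": [["iamb"], ["iamb", "trochee"], ["trochee"], ["iamb", "iamb"]],
    "titles": ["A", "B", "C", "D"],
}

for meter, n_poems in meter_counts(toy_meta).most_common():
    print(meter, n_poems)
```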

json > topics: contains the topic probabilities for each poem

[   <= each item corresponds to a particular poem and comprises a 100-dimensional dict
    {
        'topic title': its probability in poem
    }
]
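For most analyses the list of per-poem dicts is easier to handle as a dense poems-by-topics matrix. The sketch below assumes the structure described above (one dict per poem, mapping topic titles to probabilities). The function name `topic_matrix`, the toy topic titles, and the default of 0.0 for a missing topic are assumptions made for illustration.

```python
def topic_matrix(poems):
    """Return (topic_titles, rows) with one row of probabilities per poem."""
    # A fixed, sorted column order keeps rows comparable across poems.
    topics = sorted({title for poem in poems for title in poem})
    rows = [[poem.get(title, 0.0) for title in topics] for poem in poems]
    return topics, rows


# Invented toy data with two topics instead of one hundred.
toy_topics = [
    {"love": 0.7, "war": 0.3},
    {"war": 1.0},
]

titles, rows = topic_matrix(toy_topics)
print(titles)
for row in rows:
    print(row)
```

The resulting rows can be passed to `numpy.array` if a NumPy matrix is preferred.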

json > pos: contains the relative POS frequencies for each poem

[   <= each item corresponds to a particular poem
    {
        'POS': its frequency
    }
]
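Two poems can be compared by treating their POS frequency dicts as sparse vectors. The sketch below uses cosine similarity purely as an example measure; it is not claimed to be the measure used in pos.ipynb, and the tag names in the toy data are invented.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two sparse frequency dicts (missing keys count as 0)."""
    keys = set(a) | set(b)
    dot = sum(a.get(k, 0.0) * b.get(k, 0.0) for k in keys)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for empty profiles
    return dot / (norm_a * norm_b)


# Invented toy POS profiles.
poem_a = {"NOUN": 0.5, "VERB": 0.3, "ADJ": 0.2}
poem_b = {"NOUN": 0.4, "VERB": 0.4, "ADJ": 0.2}

print(round(cosine_similarity(poem_a, poem_b), 3))
```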

json > w2v: contains a mapping from lemmata to their neighbours in the word2vec models

models: contains the pretrained LDA and word2vec models (Gensim)

Files

Name	Size
semanticHalo.zip (md5:61d271198501726944a298b9c12f510f)	10.0 GB
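Given the size of the archive, it may be worth checking the download against the MD5 checksum listed above before unpacking. A minimal sketch using only the standard library; the function name `md5_of_file` and the chunk size are arbitrary choices, and the local file path is an assumption.

```python
import hashlib

# Checksum as listed for semanticHalo.zip in this record.
EXPECTED_MD5 = "61d271198501726944a298b9c12f510f"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # Chunked reading avoids loading the whole 10 GB file into memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

With the archive saved in the working directory, `md5_of_file("semanticHalo.zip") == EXPECTED_MD5` should evaluate to `True` for an intact download.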
