Distributional criteria for identifying formulas in Finnic oral poetry
Authors/Creators
Description
The digitised collections of Finnic oral poetry present exceptional material for studying oral tradition. According to oral-formulaic theory, singers in an oral tradition recreate poems during performance, relying on a store of composition units of various lengths. The formula was defined by Lord as "a group of words, employed repeatedly under the same metric conditions to express a certain idea". Within the framework of oral-formulaic theory, the concept of formula has since acquired a broader meaning that covers recurrent units of various lengths.
Recently, we have been applying clustering based on the cosine similarity of character bigrams to identify equivalent poetic lines across dialectal and orthographic variation. Building on that, in this paper we explore quantitative criteria for identifying formulas, starting at the level of a single line.
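As a rough illustration of the similarity measure named above, the following sketch compares two lines by the cosine similarity of their character-bigram counts. The function names and the example lines are illustrative assumptions, not taken from the paper, and the clustering step itself is not shown.

```python
from collections import Counter
from math import sqrt


def char_bigrams(line: str) -> Counter:
    """Count overlapping character bigrams in a lowercased line."""
    text = line.lower()
    return Counter(text[i:i + 2] for i in range(len(text) - 1))


def cosine_similarity(a: str, b: str) -> float:
    """Cosine similarity between the bigram count vectors of two lines."""
    va, vb = char_bigrams(a), char_bigrams(b)
    dot = sum(count * vb[bigram] for bigram, count in va.items())
    norm = sqrt(sum(c * c for c in va.values())) * sqrt(sum(c * c for c in vb.values()))
    return dot / norm if norm else 0.0


# Hypothetical example: two spellings of one line and an unrelated line.
variant_a = "vaka vanha väinämöinen"
variant_b = "vaka vanha vainamoinen"
other = "tietäjä iän ikuinen"

print(cosine_similarity(variant_a, variant_b))
print(cosine_similarity(variant_a, other))
```

Orthographic variants share most of their bigrams and therefore score high, while unrelated lines score low; a clustering algorithm can then group lines whose similarity exceeds some threshold, the choice of which is left open here.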
The first obvious criterion is frequency. Furthermore, we introduce a measure of “stereotypy”, defined as the entropy of the distribution of a line over high-level contexts, which are approximated by automatic text clustering. Finally, we apply a statistical co-occurrence metric (log-likelihood ratio) to identify lines that frequently occur together. We examine the capability of the method to identify parallel lines and multi-line formulas.
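The two measures can be sketched as follows. This is a minimal illustration under stated assumptions: contexts are represented as hypothetical cluster IDs of the poems in which a line occurs, and the log-likelihood ratio is computed in the common G² form over a 2×2 contingency table. How co-occurrence is counted (for example, adjacent lines or lines in the same poem) is not specified here and is left to the caller.

```python
from collections import Counter
from math import log, log2


def context_entropy(cluster_ids) -> float:
    """Shannon entropy (bits) of a line's distribution over context clusters.

    cluster_ids: one cluster ID per occurrence of the line (assumed input).
    Low entropy means the line is tied to few contexts.
    """
    counts = Counter(cluster_ids)
    total = sum(counts.values())
    return sum((n / total) * log2(total / n) for n in counts.values())


def log_likelihood_ratio(k11: int, k12: int, k21: int, k22: int) -> float:
    """G^2 statistic for a 2x2 table of line co-occurrence counts.

    k11: both lines occur, k12: only the first, k21: only the second,
    k22: neither. Zero cells contribute nothing to the sum.
    """
    total = k11 + k12 + k21 + k22
    rows = (k11 + k12, k21 + k22)
    cols = (k11 + k21, k12 + k22)
    observed = ((k11, k12), (k21, k22))
    g2 = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            if o > 0:
                expected = rows[i] * cols[j] / total
                g2 += o * log(o / expected)
    return 2.0 * g2


# Hypothetical counts, for illustration only.
print(context_entropy([1, 1, 2, 2]))
print(log_likelihood_ratio(10, 0, 0, 10))
```

A line that appears in only one cluster has entropy 0, and a pair of lines whose occurrences are statistically independent has a log-likelihood ratio of 0; larger values indicate a stronger association between the two lines.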
Files
| Name | Size |
|---|---|
| paper.pdf (md5:020b02042407d8f6cb1bcd0eac91314d) | 1.1 MB |
Additional details
Funding
- Research Council of Finland
- Formulaic intertextuality, thematic networks and poetic variation across regional cultures of Finnic oral poetry / Consortium: FILTER 333139