Published January 10, 2024 | Version v1
Conference paper Open

Distributional criteria for identifying formulas in Finnic oral poetry

  • 1. University of Helsinki
  • 2. Finnish Literature Society
  • 3. Estonian Literary Museum

Description

The digitised collections of Finnic oral poetry present exceptional material for studying oral tradition. According to the oral-formulaic theory, singers in an oral tradition recreate poems during performance, relying on a store of compositional units of various lengths. The formula was defined by Lord as "a group of words, employed repeatedly under the same metric conditions to express a certain idea". Within the framework of oral-formulaic theory, the concept of the formula has acquired a broader meaning, covering recurrent units of various lengths.

Recently, we have applied clustering based on the cosine similarity of character bigrams to identify equivalent poetic lines across dialectal and orthographic variation. Building on that work, in this paper we explore quantitative criteria for identifying formulas, starting at the level of a single line.
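
To make the line-similarity idea concrete, here is a minimal sketch of cosine similarity over character-bigram counts. It is an illustration only, not the authors' pipeline: the function names, the lowercasing step, and the example spellings are assumptions, and the clustering step itself is not shown.

```python
from collections import Counter
from math import sqrt


def char_bigrams(line: str) -> Counter:
    """Count overlapping character bigrams in a lowercased line (assumed preprocessing)."""
    text = line.lower()
    return Counter(text[i:i + 2] for i in range(len(text) - 1))


def cosine_similarity(a: str, b: str) -> float:
    """Cosine similarity between the bigram count vectors of two lines."""
    va, vb = char_bigrams(a), char_bigrams(b)
    dot = sum(va[g] * vb[g] for g in va.keys() & vb.keys())
    norm = sqrt(sum(c * c for c in va.values())) * sqrt(sum(c * c for c in vb.values()))
    return dot / norm if norm else 0.0


# Hypothetical spelling variants of one line, plus an unrelated line.
variant_score = cosine_similarity("vaka vanha väinämöinen", "vaka vanha väinämöini")
unrelated_score = cosine_similarity("vaka vanha väinämöinen", "lauloi päivät pääksytysten")
```

Under these assumptions, spelling variants of the same line should score close to 1 and unrelated lines much lower, which is what allows a threshold or clustering step to group them.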

The first, obvious criterion is frequency. Furthermore, we introduce a measure of “stereotypy”, defined as the entropy of the distribution of a line over high-level contexts, which are approximated by automatic text clustering. Finally, we apply a statistical co-occurrence metric (log-likelihood ratio) to identify lines that frequently occur together. We examine the capability of the method to identify parallel lines and multi-line formulas.
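
The two measures named above can be sketched as follows. This is a hedged illustration, not the paper's implementation: it assumes each occurrence of a line is labelled with a cluster identifier standing in for its high-level context, uses base-2 Shannon entropy, and uses the common 2x2 contingency-table form of the log-likelihood ratio (G²). All function and parameter names are hypothetical.

```python
from collections import Counter
from math import log, log2


def context_entropy(contexts) -> float:
    """Shannon entropy (bits) of a line's distribution over context labels.

    `contexts` holds one label per occurrence of the line, e.g. cluster ids.
    """
    counts = Counter(contexts)
    total = sum(counts.values())
    return -sum((c / total) * log2(c / total) for c in counts.values())


def log_likelihood_ratio(k11: int, k12: int, k21: int, k22: int) -> float:
    """G^2 statistic for a 2x2 co-occurrence table.

    k11: units containing both lines, k12: only the first line,
    k21: only the second line,       k22: neither line.
    """
    n = k11 + k12 + k21 + k22
    rows = (k11 + k12, k21 + k22)
    cols = (k11 + k21, k12 + k22)
    observed = ((k11, k12), (k21, k22))
    g2 = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            if o > 0:  # empty cells contribute nothing to the sum
                expected = rows[i] * cols[j] / n
                g2 += o * log(o / expected)
    return 2.0 * g2
```

A line that always appears in the same context has entropy 0, while a line spread evenly over four contexts has entropy 2 bits. For the co-occurrence score, a table in which the two lines occur independently gives a value near 0, and the value grows as the lines co-occur more often than chance would predict.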

Files

paper.pdf (1.1 MB)
md5:020b02042407d8f6cb1bcd0eac91314d

Additional details

Funding

Research Council of Finland
Formulaic intertextuality, thematic networks and poetic variation across regional cultures of Finnic oral poetry / Consortium: FILTER 333139