Published July 8, 2014 | Version v1
Conference paper Open

Progress through Regression. Modeling Style across Genre in French Classical Theater

  • 1. University of Trier, Germany
  • 2. Indiana University Bloomington

Description

Considerable scholarship in stylometry has focused on authorship attribution. Such work rests on the assumption that the rates of high-frequency "function" words (in contrast to "content" words) are reliable clues to authorship and are largely independent of factors like theme or genre. More recently, the focus seems to have moved beyond the most frequent words to include all vocabulary appearing in a corpus. As many of these words vary strongly by context, factors like theme, genre, literary period or literary form have received greater attention.
This paper makes two contributions. First, we test the hypothesis that authorial style depends on genre and find that this is indeed the case, even when only the most frequent words are considered. Second, in light of this result, we argue that adding features such as genre to a familiar model of authorship attribution offers a useful and novel way to investigate how authors' writing varies depending on context. We demonstrate how stylistic analysis that makes use of more articulate probabilistic models might move beyond established but limited methods such as principal component analysis and distance-based clustering and achieve a better fit between model and hypothesis.

Files

Schoech-Riddell_2014_Progress-Through-Regression-DH2014.pdf
