Published September 19, 2019 | Version v1
Presentation Open

How we tripled our encoding speed in the Digital Victorian Periodical Poetry project

  • 1. University of Victoria HCMC
  • 2. University of Victoria

Description

The Digital Victorian Periodical Poetry (DVPP) project is a SSHRC-funded digital humanities
project based at the University of Victoria. With the guidance of principal investigator Dr. Alison
Chapman, the DVPP team is creating a digital index of British periodical poetry from the long
nineteenth century. In addition to uncovering periodical poems, writing descriptive metadata, and
compiling prosopographical research, we are currently using TEI and CSS to encode a statistically-
representative sample of indexed poems, looking for quantitative evidence of literary change over
time. Such an endeavour requires a large, robust dataset covering a range of periodicals throughout
the period.
At the time of writing, there are more than 13,000 poems in the database, and we expect that total
to reach 20,000. Of these, around 2,000 will be encoded, focusing on the decade years (1820, 1830,
1840, and so on).
Journal of the Text Encoding Initiative,
In this presentation, we will showcase the various strategies and tools we have used to speed up
our encoding process. We combine simple tricks like keyboard shortcuts with more sophisticated
processes to minimize drudgery and increase accuracy. Among the more interesting techniques
are:
• Auto-tagging of a complete poem in lines and linegroups using a Schematron QuickFix;
• Use of advanced CSS selectors in the rendition/@selector attribute to reduce encoding
clutter in the poem itself;
• A keyboard shortcut to tag rhymes which detects whether the tagged text is a masculine
or feminine rhyme and provides the appropriate attribute value;
• Auto-detection of cases where a new line-end rhymes with a previously-encoded rhyme,
and should, therefore, be labelled to match it, leveraging our growing dataset of nearly
30,000 rhymes;
• Instant access to a rendering of the poem which provides a visualization of the rhyme
structure, auto-detection of anaphora, epistrophe and other refrain-like forms, and other
diagnostic feedback.
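
To make two of the techniques above concrete, here is a minimal Python sketch of the kind of transformation involved. It is an illustration only, not the project's code: the project performs auto-tagging with a Schematron QuickFix, and the function names `tag_poem` and `find_anaphora` are hypothetical. The sketch assumes that stanzas in the pasted text are separated by blank lines, and it treats anaphora narrowly as consecutive lines that open with the same word.

```python
from __future__ import annotations

from xml.sax.saxutils import escape


def tag_poem(text: str) -> str:
    """Wrap each stanza in <lg> and each line in <l> (TEI element names).

    Assumption: stanzas are separated by one or more blank lines.
    """
    stanzas: list[list[str]] = []
    current: list[str] = []
    for raw in text.splitlines():
        line = raw.strip()
        if line:
            current.append(line)
        elif current:
            # A blank line closes the stanza collected so far.
            stanzas.append(current)
            current = []
    if current:
        stanzas.append(current)

    out: list[str] = []
    for stanza in stanzas:
        out.append("<lg>")
        # escape() protects &, < and > so the output stays well-formed XML.
        out.extend(f"  <l>{escape(line)}</l>" for line in stanza)
        out.append("</lg>")
    return "\n".join(out)


def find_anaphora(lines: list[str]) -> list[tuple[int, int]]:
    """Return index pairs of consecutive lines that open with the same word."""

    def first_word(line: str) -> str:
        words = line.split()
        # Drop surrounding punctuation and case before comparing.
        return words[0].strip(".,;:!?\"'").lower() if words else ""

    firsts = [first_word(line) for line in lines]
    return [
        (i, i + 1)
        for i in range(len(lines) - 1)
        if firsts[i] and firsts[i] == firsts[i + 1]
    ]
```

Calling `tag_poem` on a pasted transcription yields markup an encoder can then refine by hand, and `find_anaphora` returns the positions of candidate repetitions for review. Rhyme detection and rhyme matching would need pronunciation data and are not attempted here.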

Files

encoding_speed.pdf (2.2 MB)
md5:444e60f0b72bf7ffcdb58bf0d1d7adc5