Presentation Open Access

How we tripled our encoding speed in the Digital Victorian Periodical Poetry project

Holmes, Martin; Fralick, Kaitlyn; Fukushima, Kailey; Karlson, Sarah

The Digital Victorian Periodical Poetry (DVPP) project is a SSHRC-funded digital humanities
project based at the University of Victoria. With the guidance of principal investigator Dr. Alison
Chapman, the DVPP team is creating a digital index of British periodical poetry from the long
nineteenth century. In addition to uncovering periodical poems, writing descriptive metadata, and
compiling prosopographical research, we are currently using TEI and CSS to encode a statistically
representative sample of indexed poems, looking for quantitative evidence of literary change over
time. Such an endeavour requires a large, robust dataset covering a range of periodicals throughout
the period.
At the time of writing, there are more than 13,000 poems in the database, and we expect that total
to reach 20,000. Of these, around 2,000 will be encoded, focusing on the decade years (1820, 1830,
1840, and so on).
Journal of the Text Encoding Initiative,
In this presentation, we will showcase the various strategies and tools we have used to speed up
our encoding process. We combine simple tricks like keyboard shortcuts with more sophisticated
processes to minimize drudgery and increase accuracy. Among the more interesting techniques
are:
• Auto-tagging of a complete poem in lines and linegroups using a Schematron QuickFix;
• Use of advanced CSS selectors in the rendition/@selector attribute to reduce encoding
clutter in the poem itself;
• A keyboard shortcut to tag rhymes which detects whether the tagged text is a masculine
or feminine rhyme and provides the appropriate attribute value;
• Auto-detection of cases where a new line-end rhymes with a previously-encoded rhyme,
and should, therefore, be labelled to match it, leveraging our growing dataset of nearly
30,000 rhymes;
• Instant access to a rendering of the poem which provides a visualization of the rhyme
structure, auto-detection of anaphora, epistrophe and other refrain-like forms, and other
diagnostic feedback.
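
To make the first technique concrete, the following is a minimal Python sketch of the general idea behind auto-tagging: wrapping each stanza of a plain-text poem in a TEI lg element and each line in an l element. It is an illustration only and not the project's Schematron QuickFix. The function name tag_poem, the use of blank lines as stanza breaks, and the dropping of leading whitespace are all assumptions made for the example.

```python
from xml.sax.saxutils import escape


def tag_poem(text: str) -> str:
    """Wrap stanzas in <lg> and lines in <l> (hypothetical helper).

    Assumes stanzas are separated by one or more blank lines and that
    leading/trailing whitespace on each line can be discarded.
    """
    stanzas = []   # list of stanzas, each a list of line strings
    current = []   # lines of the stanza being collected
    for raw in text.splitlines():
        line = raw.strip()
        if line:
            current.append(line)
        elif current:
            # A blank line closes the stanza collected so far.
            stanzas.append(current)
            current = []
    if current:
        stanzas.append(current)

    out = []
    for stanza in stanzas:
        out.append("<lg>")
        for line in stanza:
            # escape() protects &, < and > so the result stays well-formed.
            out.append(f"  <l>{escape(line)}</l>")
        out.append("</lg>")
    return "\n".join(out)


if __name__ == "__main__":
    sample = "First line\nSecond line\n\nThird line"
    print(tag_poem(sample))
```

A sketch like this only produces the bare line and linegroup structure; rhyme labels, indentation and other features described above would still need to be encoded separately.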
