Presentation Open Access

How we tripled our encoding speed in the Digital Victorian Periodical Poetry project

Holmes, Martin; Fralick, Kaitlyn; Fukushima, Kailey; Karlson, Sarah


DataCite XML Export

<?xml version='1.0' encoding='utf-8'?>
<resource xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://datacite.org/schema/kernel-4" xsi:schemaLocation="http://datacite.org/schema/kernel-4 http://schema.datacite.org/meta/kernel-4.1/metadata.xsd">
  <identifier identifierType="DOI">10.5281/zenodo.3449241</identifier>
  <creators>
    <creator>
      <creatorName>Holmes, Martin</creatorName>
      <givenName>Martin</givenName>
      <familyName>Holmes</familyName>
      <nameIdentifier nameIdentifierScheme="ORCID" schemeURI="http://orcid.org/">0000-0002-3944-1116</nameIdentifier>
      <affiliation>University of Victoria HCMC</affiliation>
    </creator>
    <creator>
      <creatorName>Fralick, Kaitlyn</creatorName>
      <givenName>Kaitlyn</givenName>
      <familyName>Fralick</familyName>
      <affiliation>University of Victoria</affiliation>
    </creator>
    <creator>
      <creatorName>Fukushima, Kailey</creatorName>
      <givenName>Kailey</givenName>
      <familyName>Fukushima</familyName>
      <affiliation>University of Victoria</affiliation>
    </creator>
    <creator>
      <creatorName>Karlson, Sarah</creatorName>
      <givenName>Sarah</givenName>
      <familyName>Karlson</familyName>
      <affiliation>University of Victoria</affiliation>
    </creator>
  </creators>
  <titles>
    <title>How we tripled our encoding speed in the Digital Victorian Periodical Poetry project</title>
  </titles>
  <publisher>Zenodo</publisher>
  <publicationYear>2019</publicationYear>
  <dates>
    <date dateType="Issued">2019-09-19</date>
  </dates>
  <language>en</language>
  <resourceType resourceTypeGeneral="Text">Presentation</resourceType>
  <alternateIdentifiers>
    <alternateIdentifier alternateIdentifierType="url">https://zenodo.org/record/3449241</alternateIdentifier>
  </alternateIdentifiers>
  <relatedIdentifiers>
    <relatedIdentifier relatedIdentifierType="DOI" relationType="IsVersionOf">10.5281/zenodo.3449240</relatedIdentifier>
    <relatedIdentifier relatedIdentifierType="URL" relationType="IsPartOf">https://zenodo.org/communities/tei2019</relatedIdentifier>
  </relatedIdentifiers>
  <rightsList>
    <rights rightsURI="https://creativecommons.org/licenses/by/4.0/legalcode">Creative Commons Attribution 4.0 International</rights>
    <rights rightsURI="info:eu-repo/semantics/openAccess">Open Access</rights>
  </rightsList>
  <descriptions>
    <description descriptionType="Abstract">&lt;p&gt;The Digital Victorian Periodical Poetry (DVPP) project is a SSHRC-funded digital humanities&lt;br&gt;
project based at the University of Victoria. With the guidance of principal investigator Dr. Alison&lt;br&gt;
Chapman, the DVPP team is creating a digital index of British periodical poetry from the long&lt;br&gt;
nineteenth century. In addition to uncovering periodical poems, writing descriptive metadata, and&lt;br&gt;
compiling prosopographical research, we are currently using TEI and CSS to encode a statistically-&lt;br&gt;
representative sample of indexed poems, looking for quantitative evidence of literary change over&lt;br&gt;
time. Such an endeavour requires a large, robust dataset covering a range of periodicals throughout&lt;br&gt;
the period.&lt;br&gt;
At the time of writing, there are more than 13,000 poems in the database, and we expect that total&lt;br&gt;
to reach 20,000. Of these, around 2,000 will be encoded, focusing on the decade years (1820, 1830,&lt;br&gt;
1840, and so on).&lt;br&gt;
In this presentation, we will showcase the various strategies and tools we have used to speed up&lt;br&gt;
our encoding process. We combine simple tricks like keyboard shortcuts with more sophisticated&lt;br&gt;
processes to minimize drudgery and increase accuracy. Among the more interesting techniques&lt;br&gt;
are:&lt;br&gt;
&amp;bull; Auto-tagging of a complete poem in lines and linegroups using a Schematron QuickFix;&lt;br&gt;
&amp;bull; Use of advanced CSS selectors in the rendition/@selector attribute to reduce encoding&lt;br&gt;
clutter in the poem itself;&lt;br&gt;
&amp;bull; A keyboard shortcut to tag rhymes which detects whether the tagged text is a masculine&lt;br&gt;
or feminine rhyme and provides the appropriate attribute value;&lt;br&gt;
&amp;bull; Auto-detection of cases where a new line-end rhymes with a previously-encoded rhyme,&lt;br&gt;
and should, therefore, be labelled to match it, leveraging our growing dataset of nearly&lt;br&gt;
30,000 rhymes;&lt;br&gt;
&amp;bull; Instant access to a rendering of the poem which provides a visualization of the rhyme&lt;br&gt;
structure, auto-detection of anaphora, epistrophe and other refrain-like forms, and other&lt;br&gt;
diagnostic feedback.&lt;/p&gt;</description>
  </descriptions>
</resource>