Getting Along with Relational Databases

Holmes, Martin

doi:10.5281/zenodo.3451277

Published September 19, 2019 | Version v2

Presentation Open

Holmes, Martin¹

1. University of Victoria HCMC

Both relational databases (RDBs) and XML have strengths and weaknesses as data storage and modelling systems. Most researchers working with historical and literary data in the Humanities would argue for the superiority of XML, since it allows unlimited nesting, linking, and complexity. RDB proponents claim superior querying and processing speed, although recent advances in XML languages and tools have eroded that advantage.

Nevertheless, RDBs remain popular, and many researchers seem instinctively to prefer them. Most DH programmers have encountered researchers who know little about databases or data modelling, but are nevertheless convinced that what they need and must have for their project is a database. Databases are somehow compelling and attractive in a way that XML is not. Perhaps the familiarity of tabular data representations is comforting; maybe forcing data into constrained representations seems to constitute mastering it somehow.

So, sometimes against our better judgement or advice, a project may end up with both an RDB and an XML document collection, and programmers must then integrate these distinct forms of data when building project outputs. This presentation discusses the Digital Victorian Periodical Poetry (DVPP) project, where metadata about 15,000 poems from nineteenth-century periodicals is captured in a MySQL database, and periodically exported to create a TEI file for each poem. Many of the poems are then transcribed and encoded. The canonical source of metadata is the RDB, while the canonical source of textual data is the TEI file. Metadata in the TEI files must be periodically updated from the RDB, without disturbing the textual encoding. Changes to the RDB data may result in changes to the id and filename of the related TEI file, so any existing TEI data is migrated to a new file, and the SVN repository must be appropriately updated. All of this is done with XSLT and Ant.
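The metadata refresh described above can be illustrated with a minimal sketch. The project itself does this with XSLT and Ant; the Python below is not the project's code and only illustrates the general idea of overwriting header metadata while leaving the transcription alone. The function name `refresh_metadata`, the choice of title and author as the fields to update, and the heavily simplified sample document are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
# Serialize with TEI as the default namespace instead of an ns0: prefix.
ET.register_namespace("", TEI_NS)


def set_child_text(parent, tag, value):
    """Set the text of a TEI child element, creating the element if absent."""
    child = parent.find(f"{{{TEI_NS}}}{tag}")
    if child is None:
        child = ET.SubElement(parent, f"{{{TEI_NS}}}{tag}")
    child.text = value


def refresh_metadata(tei_xml, title, author):
    """Hypothetical helper: overwrite title and author in the TEI header.

    Only the titleStmt is modified; the <text> element, which holds the
    transcription and encoding, is never touched.
    """
    root = ET.fromstring(tei_xml)
    title_stmt = root.find(
        "tei:teiHeader/tei:fileDesc/tei:titleStmt", {"tei": TEI_NS}
    )
    if title_stmt is None:
        raise ValueError("no titleStmt found in TEI header")
    set_child_text(title_stmt, "title", title)
    set_child_text(title_stmt, "author", author)
    return ET.tostring(root, encoding="unicode")


# Simplified sample; a real TEI file has a much fuller header.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader><fileDesc><titleStmt>
    <title>Old title</title><author>Unknown</author>
  </titleStmt></fileDesc></teiHeader>
  <text><body><lg><l>A transcribed line</l></lg></body></text>
</TEI>"""

# The values passed here stand in for a row exported from the database.
updated = refresh_metadata(SAMPLE, "Corrected title", "A. Poet")
```

In this sketch the database values are passed in as plain strings; in a real workflow they would come from the periodic export. Changes to the id or filename, and the corresponding repository updates, are outside the scope of the example.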

Files

rdb_xml_integration_pres.pdf (1.1 MB), md5:bd9a677937a46539b50e751784941df7

