From facsimile to online representation. The Centre for Digital Editions in Darmstadt. An Introduction
Creators
- 1. University and State Library Darmstadt
Description
Poster presented at the TEI Conference and Members' Meeting 2022 at Newcastle.
The Centre for Digital Editions in Darmstadt (CEiD) covers all aspects of preparing texts for digital scholarly editions from planning to publication. It not only processes the library's own holdings, but also partners with external institutions.
Workflow
After text recognition by both automatic and manual methods (OCR/HTR), the output serves as the starting point for realising the digital edition as an online publication. In addition, a variety of transformation tools convert texts from formats such as XML, JSON, Word DOCX or PDF into TEI-based formats (TEI Consortium 2022), which goes a long way towards uniformity across projects. These texts can be annotated and enriched with metadata. Furthermore, entities can be marked up and are managed in a central index file. This workflow is not static; it can be adapted to the needs of each project. Scholars and developers alike can benefit from this workflow, which centres on translating various data formats into TEI.
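As an illustration of the conversion step, the following minimal sketch turns a small JSON record into a skeletal TEI document using only the Python standard library. It is not CEiD's actual tooling: the input layout (`title`, `paragraphs`, `source`) and the function name `json_to_tei` are assumptions made for this example, and a real project would add richer header metadata and entity markup.

```python
import json
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
# Serialise TEI elements with a default namespace instead of an "ns0:" prefix.
ET.register_namespace("", TEI_NS)


def tei(tag: str) -> str:
    """Return the tag name qualified with the TEI namespace."""
    return f"{{{TEI_NS}}}{tag}"


def json_to_tei(record: dict) -> ET.Element:
    """Build a minimal TEI tree from a hypothetical record layout.

    Assumed keys: 'title' (str), 'paragraphs' (list of str), optional 'source' (str).
    """
    root = ET.Element(tei("TEI"))

    # teiHeader with the three mandatory children of fileDesc.
    header = ET.SubElement(root, tei("teiHeader"))
    file_desc = ET.SubElement(header, tei("fileDesc"))
    title_stmt = ET.SubElement(file_desc, tei("titleStmt"))
    ET.SubElement(title_stmt, tei("title")).text = record["title"]
    pub_stmt = ET.SubElement(file_desc, tei("publicationStmt"))
    ET.SubElement(pub_stmt, tei("p")).text = "Unpublished conversion sketch"
    source_desc = ET.SubElement(file_desc, tei("sourceDesc"))
    ET.SubElement(source_desc, tei("p")).text = record.get("source", "Unknown source")

    # One <p> per paragraph of the input record.
    text = ET.SubElement(root, tei("text"))
    body = ET.SubElement(text, tei("body"))
    for paragraph in record["paragraphs"]:
        ET.SubElement(body, tei("p")).text = paragraph
    return root


sample = json.loads(
    '{"title": "Sample letter", "paragraphs": ["First paragraph.", "Second paragraph."]}'
)
tei_root = json_to_tei(sample)
tei_xml = ET.tostring(tei_root, encoding="unicode")
print(tei_xml)
```

Routing every input format through one target structure like this is what makes the downstream annotation and presentation steps reusable across projects.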
Framework
The XML files are stored in eXist-db (eXist Solutions 2022) and presented in various user-friendly ways with the help of the framework wdbplus (Kampkaspar 2018), which is designed for the needs of large institutions with diverse corpora, such as a university library. By default, the transcribed text and the corresponding scan are presented side by side. Additionally, different forms of presentation are available so that the special needs of individual projects can be considered. A further advantage of wdbplus is its set of REST APIs, which allow the retrieval not only of individual texts but also of metadata and further information. Full-text search is available at project level as well as across projects.
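The wdbplus endpoints themselves are not described here, so the sketch below uses eXist-db's generic REST interface as a stand-in to show what retrieving data over HTTP could look like. It only builds the request URL and sends nothing. The host, the collection path `/db/apps/edition/data` and the function name `build_exist_query_url` are assumptions for illustration; `_query` and `_howmany` are parameters of eXist-db's own REST interface.

```python
from urllib.parse import parse_qs, urlencode, urlparse


def build_exist_query_url(base_url: str, collection: str, xquery: str, how_many: int = 10) -> str:
    """Build a GET URL that runs an XQuery against a collection via eXist-db's REST interface."""
    params = urlencode({"_query": xquery, "_howmany": how_many})
    return f"{base_url.rstrip('/')}/exist/rest{collection}?{params}"


# Hypothetical query: fetch all TEI titles below an assumed data collection.
xquery = 'declare namespace tei="http://www.tei-c.org/ns/1.0"; //tei:title'
url = build_exist_query_url("http://localhost:8080", "/db/apps/edition/data", xquery, how_many=5)

# Decode the URL again to check that the query survived percent-encoding.
parsed = urlparse(url)
query_params = parse_qs(parsed.query)
print(parsed.path)
print(query_params["_query"][0])
```

The resulting URL could be passed to any HTTP client. A deployment built on wdbplus would more likely use the framework's own REST routes, which should be taken from its documentation.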
CEiD's portfolio includes several projects in which a multitude of texts are processed. The source material ranges from early modern prints and manuscripts to more recent texts and includes early constitutional texts, religious peace agreements, newspapers and handwritten love letters.
Bibliography
- eXist Solutions (2022): eXist-db [Online]. Available at: https://exist-db.org (Accessed: 20 June 2022)
- Kampkaspar, Dario (2018): W. Digitale Bibliothek (wdbplus). Available at: https://github.com/dariok/wdbplus (Accessed: 20 June 2022)
- TEI Consortium (2022): TEI P5: Guidelines for Electronic Text Encoding and Interchange. [Version 4.4.0]. Available at: https://tei-c.org/guidelines/p5/ (Accessed: 20 June 2022)
Files
PosterCEiD_digital.pdf
289.1 kB, md5:cdf1296b2e4288a559979f0369081f7e