Published September 24, 2019 | Version v1
Presentation Open

Case Study TEI Customisation: a Restricted TEI Format for Edition Open Access

  • 1. Max Planck Institute for the History of Science

Description

Edition Open Access (EOA) (http://edition-open-access.de) is a publishing platform for scholarly monographs and edited volumes. It gives authors and publishers the means to distribute their publications in a variety of formats, e.g. online as HTML, as downloadable PDF and EPUB files, or as a printed book. The new version of the platform, currently under development, uses TEI-XML as its central file format. With the ODD format we can specify exactly which parts of TEI-XML we want to use, and we can then rely on the existing TEI infrastructure to create documentation and schemata for our customized TEI format.
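As a rough illustration of what "specifying which parts of TEI-XML we want to use" can look like, the following sketch builds a minimal ODD `schemaSpec` with Python's standard library. The identifier `eoa_sketch`, the module selection, and the deleted elements are assumptions chosen for illustration and are not taken from the actual EOA customization.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
# Serialize TEI elements in the default namespace (no prefix).
ET.register_namespace("", TEI_NS)


def build_schema_spec(ident, modules, deleted_elements):
    """Return a TEI <schemaSpec> that selects modules and deletes elements."""
    spec = ET.Element(f"{{{TEI_NS}}}schemaSpec", {"ident": ident, "start": "TEI"})
    for key in modules:
        # <moduleRef key="..."/> pulls in all elements of one TEI module.
        ET.SubElement(spec, f"{{{TEI_NS}}}moduleRef", {"key": key})
    for name in deleted_elements:
        # <elementSpec ident="..." mode="delete"/> removes a single element.
        ET.SubElement(
            spec, f"{{{TEI_NS}}}elementSpec", {"ident": name, "mode": "delete"}
        )
    return spec


# Hypothetical selection: four common modules, minus two elements from "core".
spec = build_schema_spec(
    ident="eoa_sketch",
    modules=["tei", "header", "core", "textstructure"],
    deleted_elements=["lg", "sp"],
)
xml_text = ET.tostring(spec, encoding="unicode")
print(xml_text)
```

In a real ODD file the `schemaSpec` sits inside a complete TEI document, which is omitted here.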

For small, simple changes, creating an ODD customization is straightforward: there are good tutorials and introductory material, and adding or removing elements or modules takes only a few lines in an ODD file. However, because of the subsequent workflow steps, the EOA publishing infrastructure expects TEI documents to have a very specific structure. Using ODD as the central format for describing and enforcing this structure appears to reveal certain shortcomings:

  • the semantics of ODD are sometimes slightly vague and hard to figure out from the documentation (How do I know what my ODD defines?)

  • some tools in the TEI infrastructure are slightly incomplete or contain bugs

  • in ODD, exactly one “content model” for every XML element can be given. TEI provides some general purpose elements (e.g. <div> or <p>) which can appear in many different contexts. In many cases we want to restrict their content in different ways depending on their context, e.g. their position in the XML tree or the value of their @type attribute (e.g. div[@type = 'chapter'], div[@type = 'section'], ...). The only way to achieve this with the current state of ODD is to add Schematron (http://schematron.com) rules, which are cumbersome and repetitive to write manually.
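To make the context-dependency problem concrete, here is a small sketch of the kind of check such rules have to express: the allowed children of a `<div>` depend on its `@type`. The table `ALLOWED_CHILDREN` is a hypothetical example, not the EOA content model, and the check is written in plain Python rather than Schematron.

```python
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"

# Hypothetical rules: which children each kind of <div> may contain.
ALLOWED_CHILDREN = {
    "chapter": {"head", "p", "div:section"},
    "section": {"head", "p"},
}


def child_label(el):
    """Name a child by local name, adding @type for <div> elements."""
    local = el.tag.replace(TEI, "")
    if local == "div":
        return f"div:{el.get('type')}"
    return local


def check_divs(root):
    """Return one message per child that is not allowed in its parent <div>."""
    errors = []
    for div in root.iter(f"{TEI}div"):
        div_type = div.get("type")
        allowed = ALLOWED_CHILDREN.get(div_type)
        if allowed is None:
            errors.append(f"div has unknown type {div_type!r}")
            continue
        for child in div:
            label = child_label(child)
            if label not in allowed:
                errors.append(f"{label} not allowed in div[@type='{div_type}']")
    return errors


SAMPLE = """<body xmlns="http://www.tei-c.org/ns/1.0">
  <div type="chapter">
    <head>One</head>
    <div type="section">
      <head>One point one</head>
      <div type="section"><p>nested too deep</p></div>
    </div>
  </div>
</body>"""

errors = check_divs(ET.fromstring(SAMPLE))
print(errors)
```

A single content model for `<div>` cannot distinguish these two cases, which is why the restriction ends up in separate rules keyed on the context.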

In our paper, we give a roadmap for customizing TEI with ODD, taking the specifics of Edition Open Access into account. We will present a few strategies and tools that address some of the problems mentioned above, e.g. an experimental script that automatically generates an ODD, including complex Schematron rules, from a Relax NG schema (https://relaxng.org). We hope to initiate a discussion about possible improvements to the ODD format and the TEI infrastructure.

Our roadmap includes the following waypoints:

  1. Creating a strict Relax NG Compact schema that covers all the textual phenomena of a scholarly publication in this domain.

  2. Learning, understanding and using ODD (TEI modules, the class system, complex restrictions with Schematron).

  3. Discovering the problems and limitations of ODD.

  4. Finding a solution for EOA that involves the development of tools to generate an ODD file based on the manually created RNC schema from step 1. Example files and scripts are available at https://github.molgen.mpg.de/EditionOpenAccess/eoa-publication-model.

  5. Discussion: Possible improvements on ODD and TEI.
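The idea of generating repetitive Schematron rules instead of writing them by hand can be sketched as follows. This is not the EOA script: it starts from a small hand-written table (`CONTENT_BY_DIV_TYPE`, a hypothetical stand-in) rather than from a Relax NG schema, and it emits a standalone Schematron schema rather than rules embedded in an ODD.

```python
import xml.etree.ElementTree as ET

SCH_NS = "http://purl.oclc.org/dsdl/schematron"
TEI_NS = "http://www.tei-c.org/ns/1.0"
ET.register_namespace("sch", SCH_NS)

# Hypothetical table: allowed child steps (XPath) per div type.
CONTENT_BY_DIV_TYPE = {
    "chapter": ["tei:head", "tei:p", "tei:div[@type='section']"],
    "section": ["tei:head", "tei:p"],
}


def sch(tag):
    """Qualify a tag name with the Schematron namespace."""
    return f"{{{SCH_NS}}}{tag}"


def build_schematron(content_by_type):
    """Emit one rule per div type asserting that only allowed children occur."""
    schema = ET.Element(sch("schema"), {"queryBinding": "xslt2"})
    ET.SubElement(schema, sch("ns"), {"prefix": "tei", "uri": TEI_NS})
    pattern = ET.SubElement(schema, sch("pattern"))
    for div_type, allowed in content_by_type.items():
        context = f"tei:div[@type='{div_type}']"
        rule = ET.SubElement(pattern, sch("rule"), {"context": context})
        # The assertion holds when no child element falls outside the allowed set.
        allowed_test = " or ".join(f"self::{step}" for step in allowed)
        assertion = ET.SubElement(
            rule, sch("assert"), {"test": f"not(*[not({allowed_test})])"}
        )
        assertion.text = f"A {div_type} may only contain: {', '.join(allowed)}"
    return schema


schema = build_schematron(CONTENT_BY_DIV_TYPE)
print(ET.tostring(schema, encoding="unicode"))
```

Adding a new div type then means adding one table entry; the boilerplate of the rule is produced mechanically.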

 

This project is currently funded by the German Federal Ministry of Education and Research, grant number 16OA061.

Files

sgfroererkthoden_eoausecaseodd.pdf (1.0 MB)
md5:23aabdb3db350f8e336ef62172ec3138