Improving access and research infrastructure for the internet's images
Creators
Description
Access to image-based resources is fundamental to research and the transmission of cultural knowledge. Digital access offers the potential for scholars to employ heritage collections internationally via the internet. However, most of the internet’s image-based resources have been locked up in silos, and access at resolutions useful for research has been restricted to bespoke, locally built applications. The International Image Interoperability Framework (IIIF) can solve this problem using Application Programming Interfaces (APIs), which allow images and metadata held in different digital collections to be accessed in a standardised format.
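As an illustration of what a standardised format means in practice, the IIIF Image API addresses an image (or a region of it) through a URL of the form `{base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}`. The following is a minimal Python sketch of that URL pattern; the base URL `https://example.org/iiif` and the identifier `ms/f1r` are hypothetical placeholders, not the project's actual service, and the `max` size keyword assumes Image API version 2.1 or later.

```python
from urllib.parse import quote


def iiif_image_url(base, identifier, region="full", size="max",
                   rotation="0", quality="default", fmt="jpg"):
    """Build a IIIF Image API request URL from its path components."""
    # The identifier is percent-encoded, including any "/" it contains.
    encoded_id = quote(identifier, safe="")
    return f"{base.rstrip('/')}/{encoded_id}/{region}/{size}/{rotation}/{quality}.{fmt}"


# Hypothetical service and identifier, for illustration only.
BASE = "https://example.org/iiif"

# The whole image at the largest size the server offers.
full_page = iiif_image_url(BASE, "ms/f1r")

# A rectangular detail given as x,y,width,height in pixels.
detail = iiif_image_url(BASE, "ms/f1r", region="100,200,300,400")

print(full_page)
print(detail)
```

Because every compliant server understands the same URL pattern, a viewer or research tool written against one collection can in principle request images from another without bespoke code.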
In this EU-funded Proof of Concept project (MUYA-IIIF, project 694612; hereafter the MUYA-IIIF PoC), the School of Oriental and African Studies (SOAS) has addressed the lack of tools and infrastructure needed to realize the potential of IIIF in practical workflows for researchers in the social sciences and humanities. Until now, the many organizations worldwide that provide access to their image-based resources via IIIF have supported only viewing, not the routine use of research processes such as standards-based scientific annotation. The MUYA-IIIF PoC has worked with the British Library (BL) and the hasdai Partnership of CERN and Data Futures GmbH to build and employ a general-purpose IIIF annotation workflow, extending research conducted under the earlier ERC Advanced Grant project The Multimedia Yasna. Significantly, the MUYA-IIIF PoC has addressed not only annotation workflows but also the creation of new, reusable primary data resources from annotation-based research, which can be preserved using the W3C's Web Annotation Data Model (WADM) standard and state-of-the-art InvenioRDM repository technology.
Specifically, MUYA-IIIF has annotated textual structure in key Avestan manuscripts from multiple collections, including from the British Library, to connect the digitized manuscript imagery with structured transcriptions of the text it bears, enabling analysis and searching.
While many institutions internationally now provide IIIF data resources based on their manuscript collections, very few of these are yet compatible with standards-based annotation. The Oxford MA in digital scholarship found, as recently as Fall 2022, that it needed to convert libraries' IIIF resources before being able to annotate them. In contrast, MUYA-IIIF has produced a new WADM-compliant IIIF service, which can now be freely annotated by scholars, and it has also created annotations of all of the stanzas of the Zoroastrian Yasna ceremony. In turn, this has permitted reuse of existing research investment using Text Encoding Initiative (TEI) analysis of the Yasna. As a result, the development of a comprehensive interactive transcription of the Yasna manuscripts has been significantly accelerated.
The second part of the MUYA-IIIF project has addressed the sustainability and reuse of this new digital collection, since many data resources in the humanities and in cultural heritage become vulnerable to technology obsolescence. In particular, WADM annotations are stand-off in nature: they are stored separately from the digitized manuscript imagery, and so demand new approaches to effective preservation and accessibility for the wider research community. MUYA-IIIF has therefore worked with the hasdai Partnership to gain access to new repository technology developed in the InvenioRDM consortium, which supports annotation. InvenioRDM is the software platform on which the upgrade of the European Commission's OpenAIRE trusted Zenodo repository is based.
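To make the stand-off principle concrete, the sketch below builds a single annotation following the W3C Web Annotation Data Model: the transcription text lives in the `body`, while the `target` merely points at a region of an image-bearing canvas by URI. This is an illustrative Python example only; the annotation and canvas URIs, the coordinates, and the placeholder text are hypothetical, and the project's own annotations may use different motivations, selectors, or additional properties.

```python
import json


def make_annotation(anno_id, canvas_uri, xywh, text):
    """Return a WADM annotation (as a dict) targeting a rectangle on a canvas."""
    x, y, w, h = xywh
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "id": anno_id,
        "type": "Annotation",
        "motivation": "describing",
        # The body carries the scholarly content, e.g. a transcription.
        "body": {"type": "TextualBody", "value": text, "format": "text/plain"},
        # The target only references the image region; no pixels are stored here.
        "target": {
            "source": canvas_uri,
            "selector": {
                "type": "FragmentSelector",
                "conformsTo": "http://www.w3.org/TR/media-frags/",
                "value": f"xywh={x},{y},{w},{h}",
            },
        },
    }


# Hypothetical identifiers and placeholder text, for illustration only.
annotation = make_annotation(
    "https://example.org/annotations/1",
    "https://example.org/iiif/manuscript/canvas/1",
    (120, 340, 900, 160),
    "placeholder transcription of one stanza",
)
print(json.dumps(annotation, indent=2))
```

Since the annotation holds only a reference to the imagery, both the annotation and the URI it points to must remain resolvable over time, which is why preservation of annotations needs its own repository support.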
To support such long-term access and reuse, the project's outputs comprise four components, which together form a sustainable data resource on which not only SOAS but also the external research community can build:
- the MUYA InvenioRDM corpus repository provides metadata linked to the British Library manuscript record, and a IIIF viewer for the BL Arundel Yasna manuscript imagery.
- annotations produced in the MUYA-IIIF project are available for download as a WADM dataset; these annotations are linked to the manuscript imagery via Persistent IDentifiers (PIDs).
- the IIIF service for the manuscript is available for use by external IIIF applications and researchers' tools.
- a Zenodo record, linked to the corpus repository, provides the Yasna annotation collection dataset, and this is discoverable by international libraries via the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH).
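As a rough illustration of how a harvester discovers such a record, OAI-PMH requests are plain HTTP GET requests whose query string carries a `verb` (such as `GetRecord` or `ListRecords`) and its arguments. The Python sketch below only constructs such request URLs and makes no network call; the endpoint `https://example.org/oai2d` and the record identifier are hypothetical placeholders and should be replaced with the values published by the repository being harvested.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def oai_request_url(endpoint, verb, **params):
    """Build an OAI-PMH request URL for the given verb and arguments."""
    query = {"verb": verb, **params}
    return f"{endpoint}?{urlencode(query)}"


# Hypothetical endpoint and identifier, for illustration only.
ENDPOINT = "https://example.org/oai2d"

# Fetch one record as unqualified Dublin Core, the format every
# OAI-PMH repository is required to support.
get_record = oai_request_url(
    ENDPOINT, "GetRecord",
    metadataPrefix="oai_dc",
    identifier="oai:example.org:12345",
)

# List all records in Dublin Core.
list_records = oai_request_url(ENDPOINT, "ListRecords", metadataPrefix="oai_dc")

print(get_record)
print(list_records)

# Decoding the query string recovers the original arguments.
decoded = parse_qs(urlsplit(get_record).query)
print(decoded)
```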
The new MUYA InvenioRDM corpus repository is supported in the long term through the hasdai Partnership, and the Zenodo record is supported by the EC's OpenAIRE program. In this way, MUYA-IIIF has created a new sustainability benchmark for digital research investments using scientific annotation, by assembling standards-based infrastructures and making the very long-term costs of operating complex data resources forecastable in concrete terms.
Implementation of the project
The project was divided into work-packages, reflecting three main activities:
- creating a IIIF service for the British Library Arundel manuscript of the Yasna ceremony and building an annotation workflow
- annotating the chapters and stanzas of the Yasna and connecting them with transcriptions and analyses already accomplished in other research activities
- developing a corpus repository for Zoroastrian manuscripts to support both accessibility for the research community and also preservation of the research investment in digitization and annotation, as well as connecting other existing research outcomes
The annotation workflow for the project used the freizo anəstor platform, developed by Data Futures GmbH and employed by institutions in Europe and the U.S. including CERN, Heidelberg, Oxford and Notre Dame, which was configured for work on the Yasna manuscript. Anəstor can generate multiple versions of Open Annotation Data Model and Web Annotation Data Model (WADM) annotations, to address differences between existing, current and future standards-based WADM research environments, and it is being integrated with the Zenodo global catch-all repository of OpenAIRE.
The annotation workflow provided security for SOAS scholars through ORCID authentication, so that their work was protected from unauthorized modification, and also allowed their contributions to be tracked and credited for citation. In addition, the workflow exported annotation collections in a form suited both to efficient access by the research community and to long-term preservation.
Developing an InvenioRDM corpus repository for the MUYA-IIIF project enabled the digital version of the manuscript, together with the British Library metadata, to be presented without restrictions on the internet, and the annotation collections to form a foundation for future research via downloadable JSON datasets (JSON is technology-agnostic and can be employed by a wide range of current and future research software applications).
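To suggest how such a downloadable JSON dataset might be reused, the Python sketch below reads WADM-style JSON and groups annotations by the canvas or image they target. The internal layout of the project's dataset is not described here, so the code assumes, as a guess, that it is either a plain list of annotations or an `AnnotationCollection`/`AnnotationPage` with `first` and `items` members; the function names and the sample data are hypothetical.

```python
import json
from collections import defaultdict


def load_dataset(path):
    """Read a JSON dataset from disk."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def iter_annotations(data):
    """Yield annotation dicts from a list, page, or collection (assumed layouts)."""
    if isinstance(data, list):
        for item in data:
            yield from iter_annotations(item)
    elif isinstance(data, dict):
        if data.get("type") == "Annotation":
            yield data
            return
        for key in ("first", "items"):
            if isinstance(data.get(key), (list, dict)):
                yield from iter_annotations(data[key])


def target_source(annotation):
    """Return the target URI without any fragment, or None if absent."""
    target = annotation.get("target")
    if isinstance(target, list):
        target = target[0] if target else None
    if isinstance(target, dict):
        target = target.get("source") or target.get("id")
    if isinstance(target, dict):  # the source may itself be an object with an id
        target = target.get("id")
    return target.split("#", 1)[0] if isinstance(target, str) else None


def group_by_target(data):
    """Map each target URI to the list of annotations pointing at it."""
    groups = defaultdict(list)
    for annotation in iter_annotations(data):
        groups[target_source(annotation)].append(annotation)
    return dict(groups)


# Hypothetical sample standing in for a downloaded dataset.
sample = {
    "type": "AnnotationCollection",
    "first": {
        "type": "AnnotationPage",
        "items": [
            {"type": "Annotation", "id": "a1",
             "target": "https://example.org/canvas/1#xywh=0,0,10,10"},
            {"type": "Annotation", "id": "a2",
             "target": {"source": "https://example.org/canvas/1"}},
            {"type": "Annotation", "id": "a3",
             "target": "https://example.org/canvas/2"},
        ],
    },
}
groups = group_by_target(sample)
counts = {uri: len(annos) for uri, annos in groups.items()}
print(counts)
```

With a real download, `group_by_target(load_dataset(path))` would replace the sample, provided the file matches one of the assumed layouts.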
SOAS now plans to extend the MUYA-IIIF repository with additional manuscripts based on fieldwork in India and Iran and through collaborations with other institutions worldwide. Long-term hosting of this data resource by the hasdai Partnership is already organized for 10 years, and new developments such as the Oxford Common File Layout (OCFL) are enabling both very long-term preservation using LTO tape libraries and cross-repository interoperability for resilience as technologies continue to evolve.
Files
Name | Size | Checksum |
---|---|---|
MUYA_WADM_2308619.json | 2.1 MB | md5:46b565f7a81bebd562ecc6ecaa6b1a0e |
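The MD5 checksum listed for the file can be used to confirm that a download is intact. The following is a minimal Python sketch using only the standard library; the helper names are illustrative, and the demonstration at the end checks a small temporary file rather than the dataset itself, since the download location is an assumption left to the reader.

```python
import hashlib
import os
import tempfile


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file against a checksum written as 'md5:<hex digest>'."""
    algorithm, _, value = expected.partition(":")
    if algorithm != "md5" or not value:
        raise ValueError(f"unsupported checksum format: {expected!r}")
    return md5_of_file(path) == value.lower()


# Demonstration on a temporary file with known content.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"sample annotation data")
    tmp_path = tmp.name
try:
    expected = "md5:" + hashlib.md5(b"sample annotation data").hexdigest()
    demo_ok = matches_checksum(tmp_path, expected)
    demo_bad = matches_checksum(tmp_path, "md5:" + "0" * 32)
finally:
    os.remove(tmp_path)
print(demo_ok, demo_bad)
```

For the dataset itself, the call would be `matches_checksum("MUYA_WADM_2308619.json", "md5:46b565f7a81bebd562ecc6ecaa6b1a0e")`, assuming the file has been saved under that name in the working directory.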