Archiving for the Future Past - Multimodality and AI - Challenges and Opportunities
Description
This paper discusses how to enhance existing digital archival solutions with new AI-based approaches. We take as an example the creation of multimodal representations [1] of performing arts around a newly emerging repository hosted by ShareMusic, a Swedish Knowledge Centre for Artistic Development and Inclusion.
Traces of performing arts are a prime example of the technological challenges of archiving in the present for the coming past. [2] These traces are usually complex digital objects, often combining text, images, video, 3D object representations, and so on. [3] Capturing their multimodal features, as well as building multimodality (the use of various senses) into their retrieval, adds another layer of complexity to digital preservation.
In this paper we present the different phases in the design of a repository fit for documenting inclusive performing arts, with an interface providing inclusive access. [4]
Technologically, open source developments such as the Dataverse project [5] and tools that foster local implementation of mature archival solutions [6] form a solid foundation. The design process is guided by knowledge organisation workflows that involve human experts [7] in creating a knowledge base for arts and inclusion. [8]
At the core of this paper we demonstrate how innovative local AI solutions (Ghostwriter [9]) can be used to enhance the annotation of datasets, as well as to enhance their accessibility via various web interface frames (see Figure in pdf). In particular, we zoom in on the roles of Monomodal Transformative AI (MTA) and Multimodal Cognitive AI (MCAI). The first (MTA) refers to a set of technologies that convert a single-source input into multiple accessible formats. For example, text can be transformed into audio or haptic representations, enabling broader accessibility for individuals with different needs. The second (MCAI) is a class of AI systems trained on multiple modalities to generate context-aware outputs by leveraging multimodal knowledge. These approaches are still at an early stage. We reflect on how they can be developed further alongside the expansion of multimodal data stores, which provide the necessary corpus for effective training.
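The MTA idea, one input fanned out into several accessible renderings, can be sketched as a simple dispatcher. The converter functions below (`text_to_ssml`, `text_to_haptic_pattern`) are hypothetical illustrations of the pattern, not components of the systems described in this paper:

```python
# Minimal sketch of Monomodal Transformative AI (MTA) as a dispatcher:
# a single text input is converted into multiple accessible formats.
# Both converters are illustrative stand-ins, not real project components.

def text_to_ssml(text: str) -> str:
    """Wrap plain text in SSML markup so a speech synthesiser can voice it."""
    return f"<speak><p>{text}</p></speak>"

def text_to_haptic_pattern(text: str) -> list[int]:
    """Toy haptic encoding: vibration pulse lengths (ms) derived from word lengths."""
    return [min(len(word) * 50, 400) for word in text.split()]

def transform(text: str, targets: list[str]) -> dict:
    """Fan a single source input out to every requested target modality."""
    converters = {
        "audio": text_to_ssml,
        "haptic": text_to_haptic_pattern,
    }
    return {t: converters[t](text) for t in targets if t in converters}

result = transform("Archiving performing arts", ["audio", "haptic"])
```

In a production setting each converter would call a real synthesis model; the point of the sketch is only the one-to-many dispatch structure that distinguishes MTA from a single fixed output pipeline.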
On a metalevel, this paper discusses how such innovative explorations, carried out in the context of EC and nationally funded projects (SSHOC.EU, MuseIT, SSHOC.nl), can be transferred to mature repository services. Content-wise, the emerging ShareMusic repository and the established Data Stations at DANS-KNAW share the fact that their collection material is by nature heterogeneous, ranging from scientific documentation of humanities and arts scholarship to source material (of a multimodal nature). A further shared feature is that 'data sets' are often produced by smaller communities in academia and/or in society, sometimes by vulnerable groups, and that the resulting traces can easily become 'endangered'. Adhering to the expertise function of DARIAH, we exchange experiences on how to repurpose existing technological solutions and enable division of labour via API service networks. This way costly tailored niche applications can be avoided, and the sustainability of research infrastructures for the humanities can be enhanced.
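Division of labour via API service networks works because every Dataverse installation exposes the same standard endpoints, so a service written once can query any participating repository. A minimal sketch using the Dataverse Search API (`/api/search`); the server URL is only an example, and the query shown is illustrative:

```python
# Sketch: querying a Dataverse repository over its standard Search API.
# Because all Dataverse installations expose the same endpoint, the same
# client code can talk to any repository in an API service network.
import json
from urllib.parse import urlencode
from urllib.request import urlopen

def build_search_url(server: str, query: str, obj_type: str = "dataset") -> str:
    """Construct a Search API request URL for a given Dataverse server."""
    params = urlencode({"q": query, "type": obj_type})
    return f"{server}/api/search?{params}"

def search(server: str, query: str) -> dict:
    """Perform the search (network call) and return the parsed JSON response."""
    with urlopen(build_search_url(server, query)) as resp:
        return json.load(resp)

# Example against the public demo installation (no call made here):
url = build_search_url("https://demo.dataverse.org", "performing arts")
```

Because the client depends only on the shared API contract, swapping in a different repository means changing one URL rather than building a tailored niche application.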
Acknowledgements
Part of this work is based on the SSHOC.EU project (Grant agreement ID: 823782); MuseIT (Grant agreement ID: 101061441); SSHOC-NL (financed by the Dutch Research Council (NWO) Large-scale Research Infrastructure Grant).
References
[1] Kress, G. (2010). Multimodality: A Social Semiotic Approach to Contemporary Communication. New York: Routledge. p. 79. ISBN 978-0415320603.
[2] Trevarthen C., Gratier M., Osborne N. (2014). The human nature of culture and education. WIREs, Wiley, Hoboken NJ
[3] Giaretta, D. (2011) Advanced Digital Preservation. Springer DOI 10.1007/978-3-642-16809-3
[4] Eardley, A. F., Mineiro, C., Neves, J., & Ride, P. (2016). Redefining Access: Embracing multimodality, memorability and shared experience in Museums. Curator: The Museum Journal, 59(3), 263–286. https://doi.org/10.1111/cura.12163
[5] Crosas M. Cloud Dataverse: A Data Repository Platform for the Cloud. CIO Review. Open Stack. 2017. https://openstack.cioreview.com/cxoinsight/cloud-dataverse-a-data-repository-platform-for-the-cloud-nid-24199-cid-120.html
[6] Wittenberg, M., Tykhonov, V., Indarto, E., Steinhoff, W., Veld, L. H. I., Kasberger, S., Conzett, P., Concordia, C., Kiraly, P., & Parkoła, T. (2022). D5.5 'Archive in a Box' repository software and proof of concept of centralised installation in the cloud. Zenodo. https://doi.org/10.5281/zenodo.6676391
[7] Smiraglia, R.P. 2015 Domain Analysis for Knowledge Organization. Tools for Ontology Extraction. Elsevier
[8] Johansson, M., Tykhonov, V., Alexandersson, S., Ferguson, K., Hanlon, J., Scharnhorst, A., & Osborne, N. (2025). A Knowledge Base for Arts and Inclusion - The Dataverse data archival platform as a knowledge base management system enabling multimodal accessibility. Paper accepted to the Human-Computer Interaction Conference 2025.
[9] Tykhonov, V., Rijsselberg, F. van, & Endarto, E. (2024). The Next Generation of Data Management with Artificial Intelligence. Presentation, ODISSEI conference 2024, Utrecht, Netherlands. Zenodo. https://doi.org/10.5281/zenodo.14507120
Files
Archiving for the Future Past_2025.pdf
(2.2 MB)
Additional details
Related works
- Is documented by
- Conference paper: 10.1007/978-3-031-93064-5_19 (DOI)