Presentations: AI & Software Preservation
Authors/Creators
Description
2 presentations:
Large Language Models for Software Mention Extraction
David Pride (1), Matteo Guenci (2), Martin Dočekal (3), Silvio Peroni (2), Petr Knoth (1)
(1) CORE, KMi, The Open University, United Kingdom; (2) University of Bologna; (3) Brno University of Technology
A large proportion of scientific studies now rely on software and data as an integral component of the research process. Significant time and resources are committed to the development of research software, yet too often these valuable assets languish, hidden in the original research paper that presented them. Ensuring the availability of software and data, and directly linking these assets to the research that first introduced them, is a key component in addressing the problems many scientists face when attempting to replicate earlier studies. In recent years there have been a number of efforts to develop methodologies for the extraction and classification of software mentions found in full-text scholarly documents. In this presentation, we will discuss how large language models can match current state-of-the-art (SotA) approaches to the problem using zero-shot methods that require no task-specific training.
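As an illustration of what zero-shot extraction can look like in practice, the following is a minimal sketch and not the authors' pipeline. The prompt wording, the JSON output convention, and the names `build_prompt`, `parse_mentions`, and `extract_software_mentions` are all assumptions made for this example; `llm` stands for any function that takes a prompt string and returns the model's text reply.

```python
import json
from typing import Callable, List

# Hypothetical instruction-only prompt: no labelled examples are supplied,
# which is what makes the approach "zero-shot".
PROMPT_TEMPLATE = (
    "Extract every software name mentioned in the passage below.\n"
    "Answer with a JSON list of strings and nothing else.\n\n"
    "Passage: {passage}\n"
)


def build_prompt(passage: str) -> str:
    """Fill the passage into the instruction template."""
    return PROMPT_TEMPLATE.format(passage=passage)


def parse_mentions(raw: str) -> List[str]:
    """Pull a JSON list of strings out of a model reply.

    Text around the list is tolerated; an unparseable reply yields [].
    Duplicates are dropped while keeping first-seen order.
    """
    start, end = raw.find("["), raw.rfind("]")
    if start == -1 or end <= start:
        return []
    try:
        items = json.loads(raw[start:end + 1])
    except json.JSONDecodeError:
        return []
    if not isinstance(items, list):
        return []
    mentions: List[str] = []
    for item in items:
        if isinstance(item, str):
            name = item.strip()
            if name and name not in mentions:
                mentions.append(name)
    return mentions


def extract_software_mentions(passage: str, llm: Callable[[str], str]) -> List[str]:
    """Run one zero-shot extraction: prompt the model, then parse its reply."""
    return parse_mentions(llm(build_prompt(passage)))
```

Because the model is passed in as a plain callable, the same function can be exercised with a stub during testing and with a real model client in use. Evaluating the output against an annotated corpus is a separate step that this sketch does not cover.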
From Proof of Concept to Practice: A Repository for Preserving and Managing Running Applications
Raman Ganguly, University of Vienna, Austria
The long-term preservation of research outputs increasingly depends on software preservation, as data in many disciplines are inseparable from the computational tools used to generate and interpret them. While repositories have established robust practices for preserving data and publications, software remains fragile due to short lifecycles, complex dependencies, and rapidly evolving technological environments. Archiving source code alone is often not enough.
This presentation reports on the progress of a repository-based approach to preserving running applications. Building on an earlier proof of concept, the project has evolved into an operational repository service. Recent developments include improved container ingestion workflows, clearer separation between preservation and execution layers, standardized packaging practices, and controlled, sandboxed execution environments that lower technical barriers while addressing security concerns.
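To make the separation between preservation and execution layers concrete, here is a minimal sketch under stated assumptions. It does not describe the repository service's actual implementation, which the abstract does not specify. The class names `PreservedImage` and `SandboxProfile`, the function `docker_run_args`, and the default limits are hypothetical; the command-line options are standard `docker run` flags, and Docker is assumed as the container runtime purely for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PreservedImage:
    """Preservation layer (hypothetical): an immutable, digest-pinned image."""
    name: str
    digest: str  # content digest, e.g. "sha256:<64 hex characters>"

    def reference(self) -> str:
        # Referencing by digest rather than by tag means the archived
        # bytes cannot silently change between runs.
        return f"{self.name}@{self.digest}"


@dataclass(frozen=True)
class SandboxProfile:
    """Execution layer (hypothetical): limits applied at run time only."""
    memory: str = "512m"
    pids_limit: int = 128
    allow_network: bool = False


def docker_run_args(image: PreservedImage, profile: SandboxProfile) -> List[str]:
    """Build a restrictive `docker run` command line without executing it."""
    args = [
        "docker", "run", "--rm",
        "--read-only",                          # root filesystem is read-only
        "--cap-drop", "ALL",                    # drop all Linux capabilities
        "--security-opt", "no-new-privileges",  # block privilege escalation
        "--memory", profile.memory,
        "--pids-limit", str(profile.pids_limit),
    ]
    if not profile.allow_network:
        args += ["--network", "none"]           # no network access by default
    args.append(image.reference())
    return args
```

Keeping the archived image description apart from the run-time profile means sandbox settings can be tightened or changed later without touching the preserved object.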
In parallel, the project has focused on collaboration with researchers, particularly in the digital humanities, to better understand software development practices and to develop guidance, documentation, and support structures for long-term preservation. Rather than proposing a single solution, this work outlines a practical path forward and highlights open questions regarding standards, interoperability, and FAIR principles for research software, contributing to ongoing community discussions on the preservation of executable software.
Files
437.mp4
Additional details
Related works
- Has part
- 10.5281/zenodo.20789120 (DOI)
- 10.5281/zenodo.20788974 (DOI)