Published December 18, 2019 | Version final
Thesis Open

Enabling the data FAIRness of version control systems

  • 1. University of Amsterdam

Contributors

  • 1. University of Amsterdam
  • 2. Grasple

Description

Many different kinds of scientific data are produced every day by research institutes across the
globe. Scientists are interested in using this data, but often have difficulty obtaining access to
data that was created and is stored by external organizations, because of incompatible data
management standards. The Findability, Accessibility, Interoperability, and Reusability (FAIR)
principles are guiding principles for scientific data management and stewardship. They were developed
to facilitate knowledge discovery by introducing common standards for human and machine interaction
with data, using Persistent Identifiers (PIDs) and metadata. Several technologies and services that
leverage these principles have been introduced. However, these standards, technologies, and services
are all intended for static data and do not adequately support dynamic and evolutionary data, e.g.
software source code, which is often managed by Version Control Systems (VCSs) such as Git and
Subversion. This research investigated the current approaches to managing persistently identified
data through VCSs and found that they support too narrow a range of VCSs and persistent publishing
systems. It then proposed a novel system that allows repositories from multiple VCSs to be published
directly to multiple external publishing systems through a web-accessed interface. This initial idea,
which originated from an industry problem posed by Grasple [2], has also been published as a poster
in the 2019 eScience Proceedings [1]. Finally, the thesis closes with several assertions and
conclusions about the state of the art of persistent publishing of evolutionary data, most notably
software source code, detailing important problems that still need solutions.

Files

SE_Master_Project (7).pdf (1.9 MB)
md5:c67b42f224a686604f8b54958842b18e