Project deliverable Open Access
This document is the first iteration of three annual reports on the state of FAIR in European scientific data by the FAIRsFAIR project. The interpretation of the FAIR data principles and their implications for services are now under intense scrutiny across Europe, with multiple possible outcomes. The report is based on studies of public information, especially EOSC infrastructure efforts, and on limited surveying and interviews. The focus has been on understanding the usage of persistent identifiers and semantic interoperability. This study highlights the rapidity of change in technical solutions and the wide variation in uptake across scientific domains. More effort is needed to guide researchers in best practices.
This report is the first of three of its kind to be produced by the FAIRsFAIR project. This deliverable reviews and documents commonalities and possible gaps regarding semantic interoperability and the use of metadata and persistent identifiers across infrastructures. Since many landscaping, specification and “FAIRification” activities are ongoing in the EOSC projects and elsewhere, much new information will be added to the later versions. The authors hope to receive feedback to enrich and adjust the observations and conclusions made in this document.
FAIR Digital Objects are central to the realisation of the FAIR data principles. These objects need to be accompanied by Persistent Identifiers (PIDs) and rich metadata, as they sit in a wider FAIR ecosystem comprising services and infrastructures for FAIR, including identifiers, standards and repositories. The details of the FAIR principles for data, their implementation and their implications for services are neither defined nor settled yet. The first suggestions for a more specific definition of a FAIR Digital Object have only recently been presented and will be further tested within the FAIRsFAIR project. Implications of the FAIR data principles for services, repositories and software are being investigated in other FAIRsFAIR tasks. Thus, this report focuses on semantic interoperability, as it is a prerequisite for linking and finding data, as well as on identifiers, which can offer persistence but also need context-sensitive solutions. We use the term semantic artefact to overcome the terminological diversity that, ironically, is a challenge in discussions of this important element of the architecture we need in order to enable semantic interoperability within a FAIR Ecosystem.
Development and implementation of the FAIR data principles should be driven by researcher needs in order to achieve wide penetration and realise the potentially significant benefits of FAIR data. The differences within research domains are often bigger than those between them. Enforcing standards comes with the risk of widening the gaps between mature and emerging research domains. Community adoption and trust are decisive factors. Enabling services for publishing crosswalks, mappings and semantic application profiles are needed. All of these should be registered and published in machine-readable formats. A challenge with PID and data type registries is getting them to promote reuse of data rather than bulk creation of PIDs. To support interoperability, they should be considered semantic artefacts, curated and reused. The aim should be born-FAIR data, which requires integrated and user-friendly solutions throughout the research process and data lifecycle.
Publishing application profiles, preferably in a common registry and in a machine-readable format, can promote reuse of semantic artefacts and thereby enable interoperability. Curated registries such as the EOSC Hub, FAIRsharing and re3data.org are also important resources for implementing the FAIR data principles.
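To make the idea of a machine-readable crosswalk concrete, the following is a minimal sketch only, not a format proposed by the report. The schema names, field names and the `apply_crosswalk` and `to_machine_readable` helpers are all hypothetical and chosen for illustration; a real crosswalk would reference published semantic artefacts and would be deposited in a registry.

```python
import json

# Hypothetical crosswalk between two made-up metadata schemas.
# All schema and field names here are illustrative assumptions.
CROSSWALK = {
    "source_schema": "example-schema-a",
    "target_schema": "example-schema-b",
    "mappings": [
        {"source": "title", "target": "name"},
        {"source": "creator", "target": "author"},
        {"source": "identifier", "target": "pid"},
    ],
}


def apply_crosswalk(record, crosswalk):
    """Translate a record from the source schema to the target schema.

    Returns the translated record and a sorted list of source fields
    that the crosswalk does not cover, so gaps stay visible.
    """
    rules = {m["source"]: m["target"] for m in crosswalk["mappings"]}
    translated = {rules[key]: value for key, value in record.items() if key in rules}
    unmapped = sorted(key for key in record if key not in rules)
    return translated, unmapped


def to_machine_readable(crosswalk):
    """Serialise the crosswalk as JSON so it could be registered and reused."""
    return json.dumps(crosswalk, indent=2, sort_keys=True)


# Example record in the hypothetical source schema.
record = {
    "title": "Sea surface temperatures",
    "creator": "A. Researcher",
    "identifier": "hdl:0000/example",
    "licence": "CC-BY-4.0",
}
translated, unmapped = apply_crosswalk(record, CROSSWALK)
```

Keeping the mapping as plain data, separate from the code that applies it, is what would allow such a crosswalk to be published, curated and reused as a semantic artefact in its own right. Reporting unmapped fields, rather than dropping them silently, shows where the two schemas do not align.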
We welcome comments and feedback. It is possible to comment here: https://docs.google.com/document/d/1LPMpuDSyIhzYT6S3bG2KPecdDXn-af4bJxzX2r_xLIs/edit?usp=sharing