Published August 4, 2025 | Version v1
Conference paper Open

InvenioRDM in NFDI: Advancing FAIR Data Across Domains

  • 1. ZBW - Leibniz Information Centre for Economics
  • 2. University of Freiburg
  • 3. Leibniz Institute for the German Language
  • 1. Nationale Forschungsdateninfrastruktur (NFDI) e.V.
  • 2. University of Amsterdam

Description

As part of the BERD, Text+, and DataPLANT consortia, we highlight the strategic value of adopting the open-source InvenioRDM framework for building research data platforms across the NFDI. Our platforms, BERD1, ARChive, and FreiData, demonstrate how InvenioRDM enables adaptive, domain-specific solutions that follow the FAIR principles. InvenioRDM's architecture supports customizable metadata schemas, robust access control, persistent identifiers, and extensible submission workflows. These components are essential for building trusted repositories that meet evolving scientific needs.

InvenioRDM integrates DataCite's API for DOI assignment and automatic profile updates. When researchers authenticate with their ORCID iD, publication details can be synced to their ORCID profiles. OAI-PMH and REST APIs allow research systems to harvest and reference dataset metadata. Integration with re3data and similar aggregators is planned. Furthermore, InvenioRDM supports S3 storage, providing scalable infrastructure and aligning with OpenAIRE and EOSC.

In BERD@NFDI, the BERD data portal is built on InvenioRDM to manage and share business and economics research data. The platform extends InvenioRDM with a dedicated data marketplace, enabling the discovery and exchange of diverse datasets, and incorporates community workflows for data validation.

In DataPLANT's science gateway, the DataHUB, InvenioRDM is responsible for providing permanent references to published versions of ARCs, including various forms of annotated research data, workflows, and results. It offers all the necessary interfaces to integrate with other DataHUB components and external services. All ARC publications initiated from the DataHUB are automatically submitted for review within the appropriate community and must be approved by a designated data steward or curator.
This review step is essential to prevent accidental or erroneous publications, which could otherwise occur through user error or unintended automation, potentially resulting in unnecessary DOI assignments.

Within Text+, the Leibniz Institute for the German Language (IDS) is one of several centres that use Invenio as a digital archiving solution. The IDS repository preserves written and spoken language data, focusing on German-language resources but also incorporating content in other languages. Using a custom Invenio metadata field, it provides rich metadata based on the Component Metadata Infrastructure (CMDI), an ISO standard. Besides being implemented in various data centres, CMDI was adopted by the European Research Infrastructure Consortium CLARIN. With the custom field for CMDI metadata, our Invenio-based repository can also deliver CMDI metadata via OAI-PMH, alongside Dublin Core and DataCite. This metadata can then be harvested by specialized data cataloguing and search applications such as the Virtual Language Observatory or the Text+ Registry.

Successful and sustainable operation requires close coordination among stakeholders at all levels. Thus, we advocate for the NFDI to promote Invenio as a core, versatile enabling technology for data publication. This approach fosters cross-consortial collaboration and reuse of technical components while aligning with European open science initiatives, strengthening Germany's role in building a federated, sustainable, and FAIR research data ecosystem.
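
The OAI-PMH harvesting described above can be illustrated with a minimal Python sketch. It is not taken from any of the platforms named here. The base URL `https://repo.example.org/oai2d`, the metadata prefix `cmdi`, and the sample response are assumptions for illustration; a real repository advertises its supported prefixes through the OAI-PMH `ListMetadataFormats` verb. The sketch only builds a `ListRecords` request URL and extracts record identifiers from a response, without performing any network access.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespace defined by the OAI-PMH 2.0 protocol for response documents.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def build_list_records_url(base_url, metadata_prefix, set_spec=None):
    """Build an OAI-PMH ListRecords request URL for one metadata format."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec is not None:
        params["set"] = set_spec  # optional selective harvesting by set
    return f"{base_url}?{urlencode(params)}"


def extract_identifiers(xml_text):
    """Return the OAI identifiers of all record headers in a response."""
    root = ET.fromstring(xml_text)
    return [el.text for el in root.iterfind(".//oai:header/oai:identifier", OAI_NS)]


# Hypothetical, hand-written response fragment (not real repository output).
SAMPLE_RESPONSE = (
    '<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">'
    "<ListRecords>"
    "<record><header><identifier>oai:repo.example.org:rec-1</identifier></header></record>"
    "<record><header><identifier>oai:repo.example.org:rec-2</identifier></header></record>"
    "</ListRecords>"
    "</OAI-PMH>"
)

if __name__ == "__main__":
    # Assumed endpoint and prefix; replace with values from the target repository.
    print(build_list_records_url("https://repo.example.org/oai2d", "cmdi"))
    print(extract_identifiers(SAMPLE_RESPONSE))
```

A harvester such as a catalogue or registry would issue the same kind of request once per metadata format it understands, for example with `oai_dc` for Dublin Core, and follow resumption tokens for large result sets, which this sketch omits.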

Files

CoRDI_2025_paper_142.pdf

Size: 69.9 kB
md5:9ae6011e486baf6defaede749ebb34e5