Published August 4, 2025 | Version v1
Conference paper | Open access

NFDI4DSO Version 2.0.0: A BFO Compliant Ontology for Data Science

  • 1. FIZ Karlsruhe – Leibniz-Institut für Informationsinfrastruktur
  • 2. Fraunhofer FOKUS
  • 3. FIZ Karlsruhe, Leibniz Institute for Information Infrastructure & KIT Karlsruhe

Contributors

  • 1. Nationale Forschungsdateninfrastruktur (NFDI) e.V.
  • 2. University of Amsterdam

Description

Data Science (DS) is a multidisciplinary field that integrates mathematics, statistics, computer science, and domain-specific knowledge to extract meaningful insights from diverse data sources. It involves a variety of artifacts such as datasets, models, ontologies [1], code repositories, and execution platforms. The NFDI4DataScience (NFDI4DS) project aims to improve the FAIRness (Findable, Accessible, Interoperable, and Reusable) of research artifacts within the National Research Data Infrastructure (NFDI) framework. To this end, the initial NFDI4DS Ontology (NFDI4DSO Version 1.0.0) [2] was developed on the basis of the NFDICore Ontology Version 2.0 [3]. NFDI4DSO Version 1.0.0 primarily supports the Research Information Graph (RIG), which captures metadata about the resources, persons, and organizations of the NFDI4DS consortium. In contrast, NFDI4DSO Version 2.0.0 extends the focus beyond the RIG by supporting the Research Data Graph (RDG), enabling the semantic representation and interlinking of diverse research data assets. NFDI4DSO Version 2.0.0 is built upon NFDICore Version 3.0.0, which is mapped to the Basic Formal Ontology (BFO) [4] to enable broader interoperability and to ease integration across different research domains. NFDI4DSO Version 1.0.0 has been used to create the first instance of the NFDI4DS Knowledge Graph (NFDI4DS-KG), which provides a structured and semantically rich representation of research information within the consortium. It also served as the foundational schema for developing a named entity recognition (NER) dataset to support downstream tasks such as information extraction and semantic annotation. Building on these applications, NFDI4DSO Version 2.0.0 is planned to be used for similar purposes, such as KG construction.

Files

CoRDI_2025_paper_288.pdf (67.2 kB)
md5:07e696b87b665e7099781c30aa85cb7e