Published June 27, 2025 | Version Draft
Project deliverable Open

LUMEN Deliverable 4.1: LUMEN Data Mesh Architecture Framework

  • 1. Foxcub
  • 2. Net7 Srl
  • 3. Poznan Supercomputing and Networking Center
  • 4. e-Science Data Factory
  • 5. Know Center Research GmbH (Austria)
  • 1. Know-Center
  • 2. Know-Center GmbH
  • 3. OPERAS
  • 4. Data Terra Research Infrastructure
  • 5. Institut Jacques Monod
  • 6. Université Paris Cité
  • 7. CNRS
  • 8. FZI Research Center for Information Technology
  • 9. University of Bologna

Description

Scientific research is increasingly dependent on heterogeneous, distributed, and rapidly evolving datasets. Traditional centralized approaches, such as data warehouses or monolithic repositories, face significant limitations in scalability, interoperability, and cross-community collaboration. These paradigms often lead to data silos, making it difficult for researchers to discover, access, and reuse relevant information efficiently. In response to these challenges, LUMEN introduces a federated and decentralized data mesh architecture that facilitates seamless data discovery, sharing, and reuse across multiple scientific disciplines. This approach aligns with FAIR principles (Findable, Accessible, Interoperable, and Reusable), ensuring that scientific data remains structured, interoperable, and accessible within a governed framework. This document defines the architectural foundations of the LUMEN data mesh, outlining the vision, guiding principles, and operational mechanisms that will structure the technical implementation across the project.

At the core of LUMEN’s architecture are three fundamental layers. The first layer, the Connected Communities Ecosystem, ensures that each scientific domain maintains ownership and control over its data products (well-defined, FAIR-compliant data assets published for reuse) while adhering to common federation rules. This layer allows data providers to expose standardized, high-quality scientific resources such as datasets, software, author profiles, or semantic artefacts while remaining autonomous in their governance. The second layer, the Common Discovery Infrastructure, provides a suite of domain-agnostic discovery assets, including the White-Label Discovery Platform, the Meta-Search Engine, and AI-driven discovery services such as chatbots and recommendation engines. This layer also integrates with the LUMEN Infrastructure for Semantics (LUMIS), which facilitates the creation and management of FAIR-by-design semantic artefacts and supports metadata alignment, ensuring semantic interoperability across the various data products. The third layer, Federated Data Sharing Governance, establishes common policies for data sharing, metadata structuring, and interoperability, ensuring alignment with EOSC standards and best practices.
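As an illustration of how a common discovery layer can query community-owned catalogues without centralising them, the following is a minimal sketch, not the LUMEN Meta-Search Engine itself. The function names (`make_catalogue_search`, `meta_search`), the record fields (`id`, `title`, `type`), and the toy catalogues are all hypothetical and chosen only for this example.

```python
from typing import Callable

# Each community exposes its own search function and keeps control of its catalogue.
SearchFn = Callable[[str], list[dict]]


def make_catalogue_search(community: str, catalogue: list[dict]) -> SearchFn:
    """Build a toy per-community search (case-insensitive substring match on titles)."""
    def search(query: str) -> list[dict]:
        q = query.lower()
        return [
            {**item, "community": community}  # tag each hit with its source community
            for item in catalogue
            if q in item["title"].lower()
        ]
    return search


def meta_search(query: str, providers: dict[str, SearchFn]) -> list[dict]:
    """Fan the query out to every provider and merge hits, de-duplicating by id."""
    merged: dict[str, dict] = {}
    for search in providers.values():
        for hit in search(query):
            merged.setdefault(hit["id"], hit)  # the first provider returning an id wins
    return sorted(merged.values(), key=lambda h: h["id"])


# Hypothetical catalogues; identifiers and titles are invented for illustration.
providers = {
    "community-a": make_catalogue_search("community-a", [
        {"id": "a:1", "title": "Soil moisture survey", "type": "dataset"},
        {"id": "shared:9", "title": "Moisture sensor toolkit", "type": "software"},
    ]),
    "community-b": make_catalogue_search("community-b", [
        {"id": "shared:9", "title": "Moisture sensor toolkit", "type": "software"},
        {"id": "b:2", "title": "Medieval charters corpus", "type": "dataset"},
    ]),
}

hits = meta_search("moisture", providers)
print([h["id"] for h in hits])
```

In this sketch the catalogues never leave their owners: the common layer only holds references to search functions, which mirrors the idea of shared discovery components over autonomous community platforms.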

LUMEN’s data mesh architecture supports several key technical capabilities and practical use cases, building upon principles pioneered in large-scale data ecosystems to decentralize data ownership and foster domain-driven interoperability. Its federated architecture enables meta-search and knowledge discovery, facilitating cross-disciplinary searches, semantic enrichment, and intelligent linking of scientific resources such as publications, datasets, and software. This enrichment goes beyond identifier-based connections: it aims to support concept-based navigation and semantic alignment, enabling users to understand relationships between variables, concepts, or entities across heterogeneous resources. While technically ambitious, this vision is progressively pursued through FAIR semantic artefacts and AI-assisted processes. A core element of this approach is the adoption of data contracts, extending the data mesh paradigm by formalising a mechanism that was not explicitly defined in its early formulations. These contracts define the structure, access rules, and quality metrics of shared resources, enabling trusted, machine-readable agreements that ensure FAIR compliance and cross-domain interoperability. Furthermore, LUMIS enhances metadata quality and interoperability, while LUMEN’s progressive integration model supports the gradual adoption of APIs and structured data exchanges, allowing communities to transition smoothly from legacy infrastructures to modernized, interconnected systems.
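To make the idea of a machine-readable data contract more concrete, here is a minimal sketch under stated assumptions. It does not reproduce the contract format defined in the deliverable: the `DataContract` class, its fields, and the completeness threshold are hypothetical and stand in for the three aspects named above (structure, access rules, and quality metrics).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataContract:
    """Hypothetical machine-readable contract for one data product."""
    product_id: str
    required_fields: dict[str, type]  # structure: field name -> expected type
    access_level: str                 # access rule, e.g. "open" or "restricted"
    min_completeness: float           # quality metric: minimum share of valid records


def record_is_valid(contract: DataContract, record: dict) -> bool:
    """A record is valid if every required field is present with the expected type."""
    return all(
        name in record and isinstance(record[name], expected)
        for name, expected in contract.required_fields.items()
    )


def check_compliance(contract: DataContract, records: list[dict]) -> tuple[bool, float]:
    """Return (compliant, completeness) for a batch of metadata records."""
    if not records:
        return False, 0.0
    valid = sum(record_is_valid(contract, r) for r in records)
    completeness = valid / len(records)
    return completeness >= contract.min_completeness, completeness


# Illustrative contract and records; all values are invented for this example.
contract = DataContract(
    product_id="example-community/datasets",
    required_fields={"identifier": str, "title": str, "year": int},
    access_level="open",
    min_completeness=0.5,
)
records = [
    {"identifier": "example:1", "title": "Sample A", "year": 2024},
    {"identifier": "example:2", "title": "Sample B"},  # missing "year"
]

compliant, completeness = check_compliance(contract, records)
print(compliant, completeness)
```

Because the contract is plain data, a provider and a consumer could both run the same check, which is one way an agreement can be verified by machines rather than by convention alone.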

By promoting federated governance and progressive decentralisation of data ownership, LUMEN’s Data Mesh Framework supports scalability and sustainability. It enables research communities to retain control over their own platforms and data, while benefiting from common discovery components that facilitate cross-domain discovery and alignment without centralisation.

Enhanced AI-powered data discoverability fosters more efficient and transparent research workflows, enabling researchers to find, link, and reuse scientific resources across disciplines. This approach accelerates interdisciplinary research by facilitating connections between scientific fields and breaking down long-standing barriers between disciplines.

As LUMEN moves forward, the next steps will focus on the standardization of metadata models and governance rules, leading to the definition of the LUMEN Data Model (D4.2). The architecture will be validated through pilot implementations of connected discovery platforms, allowing communities to test and refine their integration with the data mesh. Additionally, progressive alignment with EOSC services will further enhance interoperability and cross-disciplinary research collaboration. By providing a structured yet flexible framework for scientific data federation, LUMEN’s data mesh ensures the long-term impact and sustainability of data discovery, standardization, and knowledge exchange across the research ecosystem.

This deliverable is currently under review by the EC.

Files

LUMEN_D4.1_LUMEN_Data_Mesh_Architecture_Framework_V1.pdf

Files (2.8 MB)

Additional details

Funding

European Commission
LUMEN - Linked User-driven Multidisciplinary Exploration Network 101187940