Published March 2025 | Version v1
Conference paper Open

Towards the Automation of Data Space Products through Quality Data Pipelines

  • 1. Fundación Tecnalia Research & Innovation
  • 2. Universidad de Deusto
  • 3. Universidad Carlos III de Madrid - Campus de Leganés
  • 4. Tecnalia
  • 5. Centre for Research and Technology Hellas

Description

As data becomes a valuable asset for organizations, the challenge is no longer gathering vast amounts of information but refining and managing it to generate value. Transforming raw data into high-quality data products within Data Spaces, which are critical components of modern digital ecosystems, is therefore increasingly important. The complexity lies not only in the diversity of data sources, formats, and systems but also in the need for data products to remain adaptable and interoperable across environments. Data Spaces also often require strict adherence to specific syntaxes and structures. In addition, poor data quality undermines trust and decision-making, and the lack of clear frameworks for processing and consuming data products within these spaces adds technical overhead. The main contribution of this manuscript is a reference architecture designed to facilitate the creation of high-quality, interoperable data products within Data Spaces. Additional contributions include an analysis of the data types required to ensure compatibility with real-world use cases and a treatment of issues related to data quality, interoperability, and technical integration. The paper concludes with a discussion of future work and potential improvements.

Files

TOWARDS_THE_AUTOMATION_OF_DATA_SPACE_PRODUCTS_THROUGH_QUALITY_DATA_PIPELINES1.pdf

Additional details

Funding

European Commission
PLIADES - AI-Enabled Data Lifecycles Optimization and Data Spaces Integration for Increased Efficiency and Interoperability (grant no. 101135988)