Published February 29, 2024 | Version v1
Project deliverable Open

D4.6 Definition of Data Quality Metrics

  • 1. i-HD
  • 2. B!Loba
  • 3. Maastricht University
  • 4. Midata

Description

Reusing poor-quality data has limited value. When developing the requirements for the AIDAVA
curation virtual assistant, data users repeatedly asked the same question: how reliable is the data?
The answer differs depending on the state of the data: i) for data sources, a quality label can be
established based on the quality level provided by the data holder, if available, including the
credentials of the persons who created and validated the data; ii) for the curated data (i.e. the
Personal Health Knowledge Graph, PHKG), the quality label will be linked to the quality of the
source, the level of quality and certification of the curation tools used during transformation, the
level of health and literacy of the humans who provided answers when there were semantic gaps,
and the number of data quality checks that could not be resolved; iii) for published data, the quality
label will be linked to the quality level of the curated data, the compliance with the target format,
the completeness of the content, the absence of bias, and the quality, reliability and certification
of the imputation algorithm, if applicable.
This document provides a detailed overview of AIDAVA deliverable 4.6, which focuses on data quality
and metadata across the health data life cycle. The deliverable is a key component of AIDAVA and
aims to develop a comprehensive data quality assessment methodology. This methodology is
crucial for ensuring the reliability, transparency, and effective reuse of health data. The document
highlights the importance of maintaining high standards of health data quality and covers data
quality dimensions, methodologies, and tools. Furthermore, deliverable 4.6 is linked with other
integral parts of the project, namely deliverables 1.3 (Business requirements for R1) [1], 1.4
(Definition of assessment study including test scenarios & metrics, and study initiation package) [2],
2.1 (Global data sharing standard) [3], and 2.2 (Details on data curation & publishing process)
(deliverable on request). These deliverables introduce SHACL (Shapes Constraint Language) rules and
specific data quality guidelines, which contribute to establishing data quality practices.
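
As an illustration only, the three data states described above and the factors each quality label depends on can be sketched as a small data structure. This is a minimal sketch, not the deliverable's own model: every identifier below (DataState, LABEL_FACTORS, OPTIONAL_FACTORS, QualityLabel, and all factor names) is hypothetical, and treating the "if available" and "if applicable" items as optional is an assumption drawn from the wording of the description. No scoring formula is implied.

```python
from dataclasses import dataclass, field
from enum import Enum


class DataState(Enum):
    SOURCE = "source"
    CURATED = "curated"
    PUBLISHED = "published"


# Factors named in the description for each data state.
# The keys are hypothetical identifiers, not names taken from the deliverable.
LABEL_FACTORS = {
    DataState.SOURCE: [
        "holder_quality_level",
        "creator_validator_credentials",
    ],
    DataState.CURATED: [
        "source_quality",
        "curation_tool_quality_and_certification",
        "human_respondent_literacy",
        "unresolved_quality_checks",
    ],
    DataState.PUBLISHED: [
        "curated_data_quality",
        "target_format_compliance",
        "content_completeness",
        "absence_of_bias",
        "imputation_algorithm_quality",
    ],
}

# Assumption: items qualified by "if available" / "if applicable" are optional.
OPTIONAL_FACTORS = {"holder_quality_level", "imputation_algorithm_quality"}


@dataclass
class QualityLabel:
    state: DataState
    evidence: dict = field(default_factory=dict)

    def missing_factors(self):
        """Return the non-optional factors for this state with no recorded evidence."""
        return [
            name
            for name in LABEL_FACTORS[self.state]
            if name not in OPTIONAL_FACTORS and name not in self.evidence
        ]


# Usage: a curated-data label with evidence for only two of its four factors.
label = QualityLabel(
    state=DataState.CURATED,
    evidence={"source_quality": "high", "unresolved_quality_checks": 3},
)
print(label.missing_factors())
```

Keeping the factors in a per-state lookup table rather than in separate classes makes it easy to see that the curated and published labels build on the quality of the preceding state.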

Files

Attachment_0 (22).pdf (2.0 MB)
md5:512839b90a5d2c6e9da81e28cb89c9f8

Additional details

Funding

European Commission
AIDAVA - AI powered Data Curation & Publishing Virtual Assistant (grant 101057062)