D5.6 Guidelines for data governance
Description
This deliverable sets the foundation for a coherent, practical, and future-proof data governance framework for the LUMI AI Factory (LUMI AIF). It clarifies scope, principles, roles, and processes to ensure that data used across LUMI AIF services is managed responsibly, securely, and ethically throughout its lifecycle, from acquisition and storage through processing and sharing to long-term preservation or disposal. The document defines what the guidelines will cover and how they will be published, maintained, and improved over time, rather than reproducing the full guidance itself.

The guidelines adopt a lifecycle-oriented and risk-aware approach that aligns with FAIR data practices and European regulatory expectations. They emphasise rigorous quality and integrity controls; comprehensive metadata, documentation, and traceability; and the “datasheets for datasets” methodology to increase transparency, reproducibility, and responsible reuse, particularly in the context of training, validation, and testing data for AI systems. Ethical data handling is treated as a continuous responsibility that addresses human impact, misuse prevention, and transparent governance of data value, supported by clear ownership of processing activities, robust lineage, and auditable change control.

Security, privacy, and compliance are integral to the framework. The guidelines embed GDPR principles, role-appropriate responsibilities for controllers and processors, and explicit handling requirements for special category personal data and other sensitive data. They are complemented by ISO/IEC 27001-aligned security controls, NIS2 alignment, and HPC-specific operational safeguards. LUMI AIF also provides sensitive data processing environments and expert support, with templates and onboarding procedures that streamline due diligence for projects admitted via EuroHPC JU and national access routes.
Interoperability and standards ensure that data, models, and services can participate in European research ecosystems. Alignment with common European data standards and frameworks informs the choice of open, machine-readable formats, semantic technologies, and domain ontologies, enabling portability and cross-border collaboration while reducing integration costs.

A defined governance structure allocates accountability and operational responsibilities across the consortium. LUMI Docs is designated as the authoritative, version-controlled publication platform for all guidelines, with downloadable PDF versions for citation and offline use. An AI-based query agent is being developed to improve findability and user support. The platform will also link to partner organisations’ policies and related materials to ensure coherence with institutional and national requirements.

A structured maintenance model (governance oversight, scheduled annual reviews, targeted ad hoc updates, semantic versioning, and formal changelogs) ensures the guidelines remain accurate and responsive to evolving law, standards, and service needs. Feedback channels and training and capacity-building activities support adoption, while related processes (project-level DMP integration, compliance audits, incident management, and policy alignment for ethical AI and HPC use) connect governance to day-to-day operations.

In sum, D5.6 delivers a clear plan for publishing, maintaining, and operationalising data governance across the LUMI AI Factory. It enables responsible innovation by combining FAIR-aligned lifecycle management, ethical and secure data practices, and interoperability with European frameworks, supported by a durable publication platform and a consortium-wide governance model that prioritises accountability, transparency, and continuous improvement.
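The description names semantic versioning and formal changelogs as part of the maintenance model. The following minimal sketch illustrates how those two elements could fit together. The change categories (`breaking`, `addition`, `correction`), the class `Version`, the helper `changelog_entry`, and the starting version `1.0.0` are hypothetical and are not taken from the deliverable.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Version:
    """A MAJOR.MINOR.PATCH version number for a guideline document."""

    major: int
    minor: int
    patch: int

    @classmethod
    def parse(cls, text: str) -> Version:
        major, minor, patch = (int(part) for part in text.split("."))
        return cls(major, minor, patch)

    def bump(self, change: str) -> Version:
        # Assumed mapping of change types to version parts (illustrative only).
        if change == "breaking":      # e.g. a requirement is changed or removed
            return Version(self.major + 1, 0, 0)
        if change == "addition":      # e.g. new guidance that stays compatible
            return Version(self.major, self.minor + 1, 0)
        if change == "correction":    # e.g. typo or broken link fixed
            return Version(self.major, self.minor, self.patch + 1)
        raise ValueError(f"unknown change type: {change}")

    def __str__(self) -> str:
        return f"{self.major}.{self.minor}.{self.patch}"


def changelog_entry(old: Version, change: str, summary: str) -> tuple[Version, str]:
    """Return the new version and a one-line changelog entry for it."""
    new = old.bump(change)
    return new, f"{new} ({change}): {summary}"
```

Under these assumptions, `changelog_entry(Version.parse("1.0.0"), "addition", "Add onboarding template")` would yield version `1.1.0` together with a matching changelog line.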
Files
| Name | Size | Checksum |
|---|---|---|
| LUMI_AIF_DEL_WP5_D5.6_1.0.pdf | 610.7 kB | md5:991ac41275479431c21c9f0f20a38ade |
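The MD5 checksum in the file listing can be used to check that a downloaded copy is intact. The sketch below uses Python's standard `hashlib` module. The function names `md5_of_file` and `matches_expected` are illustrative, and the code assumes the PDF has been saved locally.

```python
import hashlib
from pathlib import Path

# Checksum as published in the file listing of this record.
EXPECTED_MD5 = "991ac41275479431c21c9f0f20a38ade"


def md5_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path: Path, expected: str = EXPECTED_MD5) -> bool:
    """Return True if the file's MD5 digest equals the expected value."""
    return md5_of_file(path) == expected.lower()
```

For example, `matches_expected(Path("LUMI_AIF_DEL_WP5_D5.6_1.0.pdf"))` should return `True` for an unmodified download, assuming the file sits in the current working directory. MD5 is suitable here for detecting transfer corruption, not for protecting against deliberate tampering.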