Published August 8, 2025 | Version v1
Project deliverable Open

GDI D8.9 - Distributed analysis and federated learning PoC

  • 1. Barcelona Supercomputing Center
  • 2. Central European Institute of Technology
  • 1. German Cancer Research Center
  • 2. National Institute of Health Dr. Ricardo Jorge
  • 3. Instituto Superior Técnico
  • 4. University of Maribor
  • 5. Masaryk University
  • 6. Barcelona Supercomputing Center
  • 7. University of Tartu
  • 8. CSC, IT Center for Science Ltd.
  • 9. ELIXIR Hub
  • 10. Danish National Genome Center

Description

This deliverable describes the design, implementation, and testing of a demonstrator for federated analysis within the European Genomic Data Infrastructure (GDI) project. It demonstrates a privacy-preserving Genome-Wide Association Study (GWAS) executed on a distributed architecture. Each participating national node (CZ, DE, PT, ES, SI, FI, EE) ran the analysis locally, preserving data sovereignty and complying with legal constraints such as the GDPR. The demonstrator builds on prior strategic and technical evaluations of federated technologies (previous deliverables and discussions in both WP7 and WP8) and validates the technical feasibility of GA4GH-compliant Task Execution Service (TES) workflows. It uses Snakemake, Docker, and MinIO, with Funnel serving as the TES endpoint. The prototype runs on synthetic data that was generated for this demonstrator and distributed across the national nodes in advance, and it paves the way for federated learning and cross-border analytics at scale.
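As a rough illustration of the architecture described above, the sketch below builds one GA4GH TES task document per national node, so that the same containerised step could be sent to each node's TES endpoint while the data stays local. It is not taken from the deliverable: the container image name, the Snakemake command line, the bucket URLs, and the helper names `build_tes_task` and `build_all_tasks` are all hypothetical. Only the task field names (`name`, `inputs`, `outputs`, `executors`) follow the TES task schema.

```python
import json

# Node codes as listed in the description.
NODES = ["CZ", "DE", "PT", "ES", "SI", "FI", "EE"]


def build_tes_task(node, image, command, input_url, output_url):
    """Return a TES task document (as a dict) for one national node.

    The container paths (/data/input, /data/output) are illustrative
    assumptions, not values from the deliverable.
    """
    return {
        "name": f"gwas-{node.lower()}",
        "description": f"Local GWAS step executed at node {node}",
        "inputs": [
            {"url": input_url, "path": "/data/input", "type": "DIRECTORY"}
        ],
        "outputs": [
            {"url": output_url, "path": "/data/output", "type": "DIRECTORY"}
        ],
        "executors": [
            {"image": image, "command": command, "workdir": "/data"}
        ],
    }


def build_all_tasks(image, command):
    """Build one task per node; each task reads and writes node-local storage."""
    tasks = {}
    for node in NODES:
        # Hypothetical per-node bucket, e.g. on a MinIO instance at that node.
        bucket = f"s3://gdi-{node.lower()}"
        tasks[node] = build_tes_task(
            node, image, command, f"{bucket}/synthetic", f"{bucket}/results"
        )
    return tasks


# Placeholder image and command; the real workflow is defined in the deliverable.
tasks = build_all_tasks("example/gwas:latest", ["snakemake", "--cores", "1"])
print(json.dumps(tasks["FI"], indent=2))
```

Each document would then be submitted to the corresponding node's TES server through its create-task route. Because every task references only storage at its own node, only the task definition and the resulting outputs cross node boundaries in this sketch.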

Files

202504 - GDI_D8.9 Distributed analysis and federated learning PoC.pdf

Files (3.4 MB)

Additional details

Funding

European Commission
European Genomic Data Infrastructure (GDI) 101081813