Published June 13, 2024 | Version v1
Project deliverable Open

GDI D7.1 - Use case demonstrator package

  • 1. Barcelona Supercomputing Center
  • 2. Istituti di Ricovero e Cura a Carattere Scientifico
  • 3. UB
  • 4. University of Helsinki
  • 5. BioData.pt
  • 6. HSR
  • 7. Centro Nacional de Análisis Genómico
  • 8. University of Tartu
  • 9. UL school of medicine
  • 10. Erasmus MC

Description

This deliverable focuses on creating a use case demonstrator package, which includes an initial set of data collections tailored for specific use case scenarios within the European Genomic Data Infrastructure (GDI). Given the current lack of real data in the GDI nodes, efforts were concentrated on gathering synthetic and other available real data that closely align with intended use cases to facilitate comprehensive testing of the GDI infrastructure.

Seven datasets have been identified and made accessible for GDI use cases, with five specifically tailored to GDI requirements and two offering more generalised data. An additional six datasets are being generated and are expected to be available soon. Key datasets include the Genome of Europe, featuring real data from the Genome of the Netherlands project, and 1+MG/B1MG use cases, offering synthetic datasets for rare diseases.

The identified datasets are ready for request and usage within the GDI nodes. Most datasets contain Variant Call Format (VCF) files, which are essential for discoverability via Beacon v2. Additionally, these datasets have been used in the MS7 demonstrator to assist the nodes in testing the infrastructure.
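
As a rough illustration of why VCF files matter for discoverability, the sketch below parses a few VCF data lines and answers a yes/no variant lookup, which is the kind of question a discovery service answers. It is a minimal sketch and not the Beacon v2 API: the names `Variant`, `parse_vcf_variants` and `variant_exists`, and the records in `EXAMPLE_VCF`, are hypothetical and made up for this example rather than taken from any GDI dataset.

```python
from typing import Iterable, NamedTuple, Set


class Variant(NamedTuple):
    """Hypothetical minimal variant key: chromosome, 1-based position, alleles."""
    chrom: str
    pos: int
    ref: str
    alt: str


def parse_vcf_variants(lines: Iterable[str]) -> Set[Variant]:
    """Collect (CHROM, POS, REF, ALT) keys from tab-separated VCF text lines."""
    variants: Set[Variant] = set()
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip meta-information and column header lines
        fields = line.split("\t")
        chrom, pos, ref, alts = fields[0], int(fields[1]), fields[3], fields[4]
        # Multi-allelic sites list their ALT alleles separated by commas.
        for alt in alts.split(","):
            variants.add(Variant(chrom, pos, ref, alt))
    return variants


def variant_exists(variants: Set[Variant], chrom: str, pos: int,
                   ref: str, alt: str) -> bool:
    """Return True if the exact variant is present (a boolean-style lookup)."""
    return Variant(chrom, pos, ref, alt) in variants


# Made-up records for illustration only; not real or GDI data.
EXAMPLE_VCF = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "1\t12345\t.\tA\tG\t.\tPASS\t.",
    "2\t67890\t.\tC\tT,G\t.\tPASS\t.",
]

if __name__ == "__main__":
    index = parse_vcf_variants(EXAMPLE_VCF)
    print(variant_exists(index, "2", 67890, "C", "G"))
    print(variant_exists(index, "1", 12345, "A", "T"))
```

A real deployment would query a Beacon v2 endpoint over indexed data instead of scanning text lines; the sketch only shows that the four VCF columns CHROM, POS, REF and ALT are sufficient to answer a presence question.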

Mapping the datasets to key questions ensures that they address essential inquiries for different use cases, such as genetic variant lookup, recalibration of polygenic risk scores, and medication side effects. In addition, different federated processing scenarios could be tested using this data, such as screening for common variants across populations and processing cancer datasets with variant calling workflows.

This collection of datasets will allow Pillar II to test the infrastructure and ensure that it is robust and usable by all GDI nodes.

Files

202402 - GDI_D7.1 Use case demonstrator package.docx.pdf

Files (620.9 kB)

Additional details

Funding

European Genomic Data Infrastructure (GDI) 101081813
European Commission

Dates

Submitted
2024-05-31