Published August 2, 2023 | Version v1
Project deliverable Open

European Genomic Data Infrastructure project (GDI) D8.4 Report on federated data access scenarios

  • 1. CRG

Description

The goal of the “Genomic Data Infrastructure” (GDI) project is to generate a framework and all the tools needed to share human genomic data and associated phenotypic and clinical data. European and national regulations require that these types of data be protected.

Sharing sensitive data must comply with the law, which implies secure storage and authorised access. In a federated scenario, interested researchers could access the data in several ways. We need to distinguish, for example, between cases in which the data is accessed and analysed remotely at the storing facility and cases in which the data is downloaded to the requester’s premises. Other factors to consider include the type of data, the type of permissions (full access to the data or partial access, meaning access to only a section of the genomic sequences), the types of analysis, and the type of federated approach in place (the relation among the nodes forming the network). The scope of this deliverable is to analyse, describe and compare the different possibilities for accessing data in a federated context like the one GDI is building. In this document we provide an analysis of key factors to consider, including:

  1. Remote and virtualized querying of data 

  2. Remote and virtualized access and visualisation of data 

  3. Running pipelines or workflows at the node hosting the data

  4. Running interactive environments

  5. Downloading of data to the requester’s premises or to the cloud

  6. Access to aggregated results from federated analyses

Existing tools for data access include graphical user interfaces (GUIs), command line interfaces, interactive environments, and download or copy of data. All of these could be used to access individual-level or aggregated data, with different legal and technical implications in each case. The comprehensive landscape description, including all considerations of these factors, will inform and instruct the technological design and development of GDI Pillar II. The considerations listed here will thus serve as a basis for the subsequent work within GDI.
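
The difference between individual-level and aggregated access can be illustrated with a minimal sketch. It is not taken from the deliverable: the `Node` class, the `aggregate` method, the `federated_allele_frequency` function and the sample genotypes are all hypothetical, chosen only to show a coordinator that receives per-node summary counts and never the individual-level records.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical data-hosting node; individual records stay inside it."""
    name: str
    genotypes: list[int]  # alternate-allele count (0, 1 or 2) per individual

    def aggregate(self) -> tuple[int, int]:
        """Return (alternate allele count, total alleles) for this node only."""
        return sum(self.genotypes), 2 * len(self.genotypes)


def federated_allele_frequency(nodes: list[Node]) -> float:
    """Combine per-node aggregates into one network-wide allele frequency."""
    alt = total = 0
    for node in nodes:
        node_alt, node_total = node.aggregate()  # only summary counts are shared
        alt += node_alt
        total += node_total
    if total == 0:
        raise ValueError("no alleles observed across nodes")
    return alt / total


# Illustrative data only: two nodes with made-up genotypes for a single variant.
nodes = [Node("node_a", [0, 1, 2, 0]), Node("node_b", [1, 1, 0])]
print(federated_allele_frequency(nodes))
```

In a scenario with download or interactive access, the requester would instead see the `genotypes` lists themselves, which is where the differing legal and technical implications arise.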

Notes

The GDI project receives funding from the European Union's Digital Europe Programme under grant agreement number 101081813.

Files

202306 - GDI_D8.4 Report on federated data access scenarios.pdf

Files (1.3 MB)