Published July 27, 2021 | Version v1
Project deliverable Open

ELIXIR-CONVERGE D8.3 Raw sequence data processing workflow in operation

  • 1. EMBL-EBI

Description

This deliverable describes the raw sequence data processing workflows available to SARS-CoV-2 data hub users and used for the systematic analysis of public data in the COVID-19 Data Portal (DP). The workflows focus on generating consensus sequences and variant calls from raw read data. This yields a complete data product, with provenance regarding data owners, data types and processing workflows. The data product is then consumable by the scientific research community, for example epidemiologists interested in tracking SARS-CoV-2 infection and specific variants. This deliverable does not cover work on SARS-CoV-2 phylogeny, which has been described in deliverable 8.4; however, phylogeny is an additional workflow that is available within the data hubs and is used for the systematic analysis of submitted sequence datasets within the COVID-19 DP. The SARS-CoV-2 data hubs are toolboxes and spaces in which users can share data (in a pre-publication or public manner), in some cases with collaborators based in different institutes, and have the shared data processed automatically (through the data hub configuration), with the resulting analysis returned to the data hub for interpretation by users and their collaborators. In some cases (depending on the analysis workflow used), interactive Jupyter notebooks are also available to users.

Files

ELIXIR-CONVERGE D8.3 Raw sequence data processing workflow in operation.pdf

Files (218.7 kB)

Additional details

Funding

ELIXIR-CONVERGE – Connect and align ELIXIR Nodes to deliver sustainable FAIR life-science data management services 871075
European Commission