ELIXIR-CONVERGE D8.3 Raw sequence data processing workflow in operation
Description
This deliverable describes the raw sequence data processing workflows available to SARS-CoV-2 data hub users and used for systematic analysis of public data in the COVID-19 Data Portal (DP). The workflows focus on generating consensus sequences and variant calls from raw read data. The result is a complete data product, with provenance covering data owners, data types and processing workflows. The data product can then be consumed by the scientific research community, for example by epidemiologists interested in tracking SARS-CoV-2 infection and specific variants. This deliverable does not cover work on SARS-CoV-2 phylogeny, which has been described in deliverable 8.4; however, phylogeny is an additional workflow that is available within the data hubs and is used for systematic analysis of submitted sequence datasets within the COVID-19 DP. The SARS-CoV-2 data hubs are toolboxes and spaces in which users can share data (in a pre-publication or public manner), in some cases with collaborators based at different institutes, and have the shared data processed automatically (through the data hub configuration), with the resulting analysis returned to the data hub for interpretation by users and their collaborators. In some cases, depending on the workflow used for the analysis, interactive Jupyter notebooks are also available to users.
Files
ELIXIR-CONVERGE D8.3 Raw sequence data processing workflow in operation.pdf
(218.7 kB, md5:ced8a07f346173efe6011f6302c74dee)