Published July 27, 2021 | Version v1
Project deliverable Open

ELIXIR-CONVERGE D8.4 Phylogenetic tools and enhanced results visualisation

Description

Deliverable Scope

Deliverable 8.4, “Phylogenetic tools and enhanced results visualisation”, aims to improve navigation and visualisation tools for systematic viral data interpretation, including through phylogenetic trees. As part of Task 8.2, Data analysis mobilisation (objective: mobilisation of analysis upon SARS-CoV-2 sequence data), we intended to deploy the data processing and visualisation components of the SARS-CoV-2 Data Hubs system in order to mobilise viral sequence data analysis at scale.

Work accomplished

We produced an interactive phylogenetic tree of the SARS-CoV-2 consensus sequence data held in the ENA (and INSDC) and integrated it into the COVID-19 data portal. This tree is built on the Evergreen/PhyloViz architecture developed by collaborators at the Technical University of Denmark (DTU), and is integrated as part of the SARS-CoV-2 data hubs. The Evergreen tree back-end service uses a sequence TSV file. A cron job retrieves this file daily and adds new samples to the MongoDB database. The system also runs a check to detect suspended samples that should be excluded from the tree, and makes this endpoint available to the Evergreen/PhyloViz tree generation service, closing the loop between the two services.
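The daily update described above can be illustrated with a minimal sketch. It is not the deliverable's actual implementation: the column names (`accession`, `country`), the function names, and the use of a plain dictionary in place of the database collection are all assumptions made for illustration. It shows only the logic of adding unseen samples from the TSV file and excluding suspended ones before the tree is rebuilt.

```python
import csv
import io


def parse_sequence_tsv(tsv_text):
    """Parse TSV text into a dict keyed by accession.

    The 'accession' column name is a hypothetical choice for this sketch.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return {row["accession"]: row for row in reader}


def sync_samples(store, tsv_text, suspended):
    """Add new samples to `store` and remove suspended ones.

    store:     dict standing in for the database collection (assumption)
    tsv_text:  contents of the daily sequence TSV file
    suspended: set of accessions flagged as suspended
    Returns (added, removed) as sorted lists of accessions.
    """
    incoming = parse_sequence_tsv(tsv_text)
    added = []
    for accession, row in incoming.items():
        # Skip samples that are suspended or already stored.
        if accession in suspended or accession in store:
            continue
        store[accession] = row
        added.append(accession)

    # Drop previously stored samples that have since been suspended.
    removed = sorted(acc for acc in suspended if acc in store)
    for acc in removed:
        del store[acc]
    return sorted(added), removed


def tree_input(store):
    """Accessions the tree generation step would receive."""
    return sorted(store)
```

In a deployment along these lines, a scheduler such as cron would call `sync_samples` once a day with the freshly retrieved file, and the tree generation service would read the equivalent of `tree_input`.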

Conclusion

The tree is updated regularly as new SARS-CoV-2 sequences/genomes are shared with the International Nucleotide Sequence Database Collaboration (INSDC), ensuring that only public data is included. The tree is accompanied by an appropriate metadata table and a world map illustrating the country of origin of the samples represented in the tree. Improvements being made to the phylogeny include integration of lineage information, variation information and performance updates.

Files

ELIXIR-CONVERGE D8.4 Phylogenetic tools and enhanced results visualisation.pdf

Additional details

Funding

ELIXIR-CONVERGE – Connect and align ELIXIR Nodes to deliver sustainable FAIR life-science data management services 871075
European Commission