Published September 26, 2021 | Version v1
Poster Open

Core resources for genome analysis

  • 1. Wellcome Sanger Institute

Description

The Wellcome Sanger Institute will sequence and assemble tens of thousands of genomes over the next decade through large-scale biodiversity sequencing projects such as Darwin Tree of Life. The projects span the entire tree of eukaryotic life, and the genomes will enable new and unprecedented science. But assembling a genome is only the first step. Most studies analyse features annotated on the genomes rather than the raw DNA sequences themselves. To facilitate this, we will run a set of elementary sequence analyses on every genome produced by the institute and publish the resulting tracks on a publicly available server through Track Hubs. Envisaged tracks include sequence composition (and, by extension, k-mer frequencies for several values of k) and the distributions of repeats, genes, and variants. The goal is to save the community effort and resources by providing a uniform dataset, whilst reducing the overall carbon footprint of genome analysis.
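
As an illustration of what a sequence composition track and a k-mer frequency analysis might look like, here is a minimal Python sketch. It is not the institute's pipeline: the function names `gc_bedgraph` and `kmer_counts`, the fixed-window approach, and the default window size are all assumptions made for this example. The output uses the bedGraph text format, which can be converted to bigWig for display in a Track Hub.

```python
from collections import Counter


def gc_bedgraph(name, seq, window=1000):
    """Yield bedGraph lines (sequence name, start, end, GC fraction).

    Hypothetical helper: non-overlapping fixed windows, 0-based half-open
    coordinates. Bases other than A/C/G/T (e.g. N) are left out of the
    denominator; a window with no called bases gets a value of 0.
    """
    seq = seq.upper()
    for start in range(0, len(seq), window):
        chunk = seq[start:start + window]
        called = sum(chunk.count(base) for base in "ACGT")
        gc = chunk.count("G") + chunk.count("C")
        frac = gc / called if called else 0.0
        yield f"{name}\t{start}\t{start + len(chunk)}\t{frac:.4f}"


def kmer_counts(seq, k):
    """Count overlapping k-mers, skipping any that contain non-ACGT characters.

    Hypothetical helper: strands are not merged (no canonical k-mers).
    """
    seq = seq.upper()
    valid = set("ACGT")
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= valid:
            counts[kmer] += 1
    return counts


if __name__ == "__main__":
    # Toy sequence for demonstration only, not real data.
    toy = "GGCCAATTNN"
    for line in gc_bedgraph("toy_scaffold", toy, window=5):
        print(line)
    print(kmer_counts(toy, 2).most_common(3))
```

A production implementation would stream sequences from FASTA files rather than hold them in memory, and would likely rely on dedicated k-mer counting tools for large genomes; the sketch only shows the shape of the computation.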

Files

2021-09-26 - Biodiversity Genomics 2021 - Core Analyses Pipelines.pdf

Files (1.4 MB)

Additional details

Funding

Wellcome Trust
Darwin Tree of Life 218328