10.5281/zenodo.7912923
https://zenodo.org/records/7912923
oai:zenodo.org:7912923
Gut, Ivo
Ivo
Gut
CRG
Aguileta, Gabriela
Gabriela
Aguileta
CRG
Cuppen, Edwin
Edwin
Cuppen
HMF
Wirta, Valtteri
Valtteri
Wirta
KI
Hovig, Eivind
Eivind
Hovig
UiO
Matthijs, Gert
Gert
Matthijs
KU Leuven
B1MG D3.2 Best practices for Next Generation Sequencing
Zenodo
2023
B1MG
Beyond 1 Million Genomes
1+MG
1 Million Genomes Initiative
ELIXIR
NGS
Next Generation Sequencing
WGS
Whole Genome Sequencing
Best practice
2023-05-09
eng
Project deliverable
10.5281/zenodo.7912922
https://zenodo.org/communities/b1mg
https://zenodo.org/communities/eu
Creative Commons Attribution 4.0 International
Determining variants in the genome involves several bioinformatic procedures, such as eliminating low-quality sequences, aligning sequencing reads to the human reference genome, and establishing confidence in the presence of a variant based on a threshold. Once a variant is identified, it is annotated to predict its effect. The goal of this task is to establish a best practice protocol for data analysis of whole genome sequencing (WGS) for somatic variants. This protocol will include a recommended suite of software tools with settings that ensure results surpass a required quality threshold.
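As an illustration of the threshold step described above, the following minimal sketch discards low-quality base observations at a single genomic position and then accepts a candidate variant only if it clears fixed thresholds. It is not the protocol this deliverable recommends: the `BaseObservation` type, the `call_variant` function, and all threshold values are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class BaseObservation:
    """One read's base call at a single genomic position (hypothetical type)."""
    base: str
    phred_quality: int


def call_variant(observations, alt_base,
                 min_base_quality=20, min_alt_reads=3, min_allele_fraction=0.05):
    """Decide whether alt_base is supported at this position.

    All thresholds are illustrative assumptions, not recommended settings.
    """
    # Step 1: eliminate low-quality observations.
    usable = [o for o in observations if o.phred_quality >= min_base_quality]
    depth = len(usable)
    if depth == 0:
        return {"called": False, "depth": 0, "alt_reads": 0, "allele_fraction": 0.0}

    # Step 2: count the reads that support the alternative allele.
    alt_reads = sum(1 for o in usable if o.base == alt_base)
    allele_fraction = alt_reads / depth

    # Step 3: accept the variant only if it clears both thresholds.
    called = alt_reads >= min_alt_reads and allele_fraction >= min_allele_fraction
    return {"called": called, "depth": depth,
            "alt_reads": alt_reads, "allele_fraction": allele_fraction}


if __name__ == "__main__":
    pileup = ([BaseObservation("A", 30)] * 16
              + [BaseObservation("T", 30)] * 4
              + [BaseObservation("T", 5)] * 5)  # low quality, filtered out
    print(call_variant(pileup, alt_base="T"))
```

Production somatic callers rely on statistical models and matched normal samples rather than fixed cut-offs; the sketch only shows where a threshold enters the decision.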
As part of the 1+MG WG4 project, we are benchmarking quality metrics to identify best practices for WGS of somatic variants. At the time of writing, we have completed the initial steps of the benchmark and evaluated the quality metrics for the library preparation and sequencing steps. We compared the sequencing performance of the participating laboratories by examining the value and dispersion of relevant metrics, providing a first assessment of how different sequencing protocols affect the sequences produced. Each sequencing laboratory received an overall performance score, and we suggest best practices for library preparation and sequencing based on the protocols followed by the best-performing facilities.
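The comparison of value and dispersion, and the derivation of an overall score, can be sketched as follows. This is only one plausible way to do it, not the scoring method used in the benchmark: the metric names (`mean_coverage`, `duplication_rate`), the laboratory labels, the numbers, and the rank-averaging scheme are all assumptions made for illustration.

```python
from statistics import mean, stdev


def summarise(values):
    """Return the value (mean) and dispersion (stdev, coefficient of variation)."""
    m = mean(values)
    sd = stdev(values) if len(values) > 1 else 0.0
    cv = sd / m if m else float("nan")
    return {"mean": m, "stdev": sd, "cv": cv}


def overall_scores(metrics_by_lab, higher_is_better):
    """Average each lab's rank across metrics (1 = best). Ties are not handled.

    metrics_by_lab:   {lab: {metric: [values per sample]}}
    higher_is_better: {metric: True if larger values indicate better quality}
    """
    labs = sorted(metrics_by_lab)
    rank_sums = {lab: 0.0 for lab in labs}
    for metric, higher in higher_is_better.items():
        ordered = sorted(labs,
                         key=lambda lab: mean(metrics_by_lab[lab][metric]),
                         reverse=higher)
        for rank, lab in enumerate(ordered, start=1):
            rank_sums[lab] += rank
    return {lab: rank_sums[lab] / len(higher_is_better) for lab in labs}


if __name__ == "__main__":
    # Hypothetical data: two labs, three samples each.
    data = {
        "lab_a": {"mean_coverage": [30, 32, 31], "duplication_rate": [0.10, 0.12, 0.11]},
        "lab_b": {"mean_coverage": [40, 41, 42], "duplication_rate": [0.05, 0.06, 0.04]},
    }
    direction = {"mean_coverage": True, "duplication_rate": False}
    for lab, metrics in data.items():
        print(lab, {name: summarise(vals) for name, vals in metrics.items()})
    print(overall_scores(data, direction))
```

Rank averaging keeps metrics with different units comparable without choosing weights, at the cost of ignoring how large the differences between laboratories actually are.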
A general conclusion is that all participants achieved a good level of quality at the sequencing stage, and the metrics measuring it are largely consistent, showing very little dispersion. Differences in library preparation and sequencing protocols do not appear to significantly impact the expected quality of results. In the following sections, we report the performance of the participating laboratories on each relevant sequencing metric and suggest general best practices to ensure the best quality.
This deliverable has been significantly delayed by a substantial backlog in the preparation, examination, and approval of all ethical requirements to protect sensitive patient data. We put considerable effort into drafting the material transfer agreement (MTA) to guarantee that patient data will be used with all precautions and under strict anonymity, because genomic data is identifiable.
European Commission
10.13039/501100000780
951724
Beyond 1M Genomes