B1MG D3.2 Best practices for Next Generation Sequencing

Determining variants in the genome involves several bioinformatic procedures, such as eliminating low-quality sequences, aligning sequencing reads to the human reference genome, and deciding whether a variant is present by applying a confidence threshold. Once a variant is identified, it is annotated to predict its effect. The goal of this task is to establish a best-practice protocol for the data analysis of whole genome sequencing (WGS) for somatic variants. This protocol will include a recommended suite of software tools, with settings that ensure results surpass a required quality threshold.
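As a rough illustration of the threshold-based steps described above, the following Python sketch filters a read by its mean base quality and accepts a candidate variant only when depth and variant allele fraction are high enough. It is a minimal sketch, not the protocol recommended in this deliverable: the function names and the cut-off values (mean Phred 20, depth 10, allele fraction 0.05) are hypothetical, and Phred+33 quality encoding is assumed.

```python
def mean_phred(quality_string, offset=33):
    """Mean Phred score of a FASTQ quality string (Phred+33 encoding assumed)."""
    if not quality_string:
        return 0.0
    scores = [ord(ch) - offset for ch in quality_string]
    return sum(scores) / len(scores)


def keep_read(quality_string, min_mean_quality=20.0):
    """Keep a read only if its mean base quality reaches the cut-off.

    The default cut-off of 20 is a hypothetical value chosen for illustration.
    """
    return mean_phred(quality_string) >= min_mean_quality


def call_variant(alt_reads, total_reads, min_depth=10, min_vaf=0.05):
    """Accept a candidate variant only if depth and allele fraction pass.

    min_depth and min_vaf are hypothetical thresholds, not recommended settings.
    """
    if total_reads < min_depth:
        return False
    return alt_reads / total_reads >= min_vaf
```

Production variant callers rely on statistical models rather than fixed cut-offs like these; the sketch only shows where a threshold enters the decision.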

As part of the 1+MG WG4 project, we are currently benchmarking quality metrics to identify the best standards of practice in WGS for somatic variants. At the time of writing, we have completed the initial steps of the benchmark and evaluated the quality metrics corresponding to the library preparation and sequencing steps. We compared the sequencing performance of the participating laboratories by examining the value and dispersion of the relevant metrics, which provides a first assessment of how different sequencing protocols affect the sequences produced. Each sequencing laboratory received an overall performance score, and we base our suggested best practices for library preparation and sequencing on the protocols followed by the facilities that produced the best results.
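To make the comparison described above more concrete, the following Python sketch computes the value and dispersion of a metric across laboratories and combines per-metric ranks into one overall score. It is a sketch under stated assumptions: this summary does not specify the scoring method, so the mean-rank score is a stand-in, and the metric names, laboratory labels, and numbers are made-up placeholders rather than benchmark results.

```python
from statistics import mean, pstdev


def dispersion(values):
    """Mean, population standard deviation, and coefficient of variation."""
    m = mean(values)
    sd = pstdev(values)
    cv = sd / m if m != 0 else float("nan")
    return {"mean": m, "sd": sd, "cv": cv}


def overall_scores(metrics, higher_is_better):
    """Mean rank per laboratory across all metrics (1 = best).

    metrics: {metric_name: {lab: value}}
    higher_is_better: {metric_name: bool}
    Ties are not handled specially; tied labs are ranked in name order.
    """
    labs = sorted(next(iter(metrics.values())))
    totals = {lab: 0.0 for lab in labs}
    for name, per_lab in metrics.items():
        ordered = sorted(
            labs, key=lambda lab: per_lab[lab], reverse=higher_is_better[name]
        )
        for rank, lab in enumerate(ordered, start=1):
            totals[lab] += rank
    return {lab: totals[lab] / len(metrics) for lab in labs}


# Made-up placeholder values, for illustration only (not benchmark data).
example_metrics = {
    "mean_coverage": {"lab_A": 30.0, "lab_B": 32.0, "lab_C": 28.0},
    "duplication_rate": {"lab_A": 0.10, "lab_B": 0.08, "lab_C": 0.12},
}
example_direction = {"mean_coverage": True, "duplication_rate": False}

example_scores = overall_scores(example_metrics, example_direction)
example_spread = dispersion(list(example_metrics["mean_coverage"].values()))
```

A low coefficient of variation for a metric means the laboratories produce similar values for it. A rank-based score is only one possible choice; it ignores how large the differences between laboratories are.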

A general conclusion is that all participants achieved a good level of quality at the sequencing stage, and the metrics that measure it are largely consistent across laboratories, with very little dispersion. Differences in library preparation and sequencing protocols do not appear to significantly affect the expected quality of results. In the following sections, we describe the performance of the participating laboratories on each of the relevant sequencing metrics and suggest general best practices to ensure the best quality.

This deliverable has been significantly delayed by a substantial backlog in the preparation, examination, and approval of all the ethical requirements for protecting sensitive patient data. We devoted considerable effort to drafting the material transfer agreement (MTA) so that it guarantees patient data will be used with all precautions and under strict anonymity, because genomic data is identifiable.


