There is a newer version of the record available.

Published December 18, 2018 | Version v2
Preprint Open

Haplotype-aware diplotyping from noisy long reads

  • 1. Max Planck Institute for Informatics, Saarbruecken, Germany; Center for Bioinformatics, Saarland University, Saarbruecken, Germany
  • 2. UC Santa Cruz Genomics Institute, University of California Santa Cruz, USA

Description

SNP calls for individual NA12878 produced by MarginPhase and WhatsHap on PacBio and Oxford Nanopore data.

Paper Abstract: Current genotyping approaches for single nucleotide variations rely on short, accurate reads from second-generation sequencing devices. Presently, third-generation sequencing platforms are rapidly becoming more widespread, yet approaches for leveraging their long but error-prone reads for genotyping are lacking.
Here, we introduce a novel statistical framework for the joint inference of haplotypes and genotypes from noisy long reads, which we term diplotyping. Our technique takes full advantage of linkage information provided by long reads. We validate hundreds of thousands of candidate variants that have not yet been included in the high-confidence reference set (NA12878) of the Genome-in-a-Bottle effort.
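
To make the term concrete, the toy sketch below shows how a diplotype (a pair of haplotypes) determines both the unphased genotype and the phased genotype at each site. It is only an illustration of the terminology, not the statistical framework of the paper; the function names and the 0/1 allele encoding (0 = reference, 1 = alternative) are assumptions made for this example.

```python
from typing import List


def genotypes_from_diplotype(hap1: List[int], hap2: List[int]) -> List[int]:
    """Return the number of alternative alleles per site (0, 1, or 2).

    Each haplotype is a list of 0/1 alleles over the same ordered sites
    (hypothetical encoding: 0 = reference allele, 1 = alternative allele).
    """
    if len(hap1) != len(hap2):
        raise ValueError("haplotypes must cover the same sites")
    return [a + b for a, b in zip(hap1, hap2)]


def phased_genotypes(hap1: List[int], hap2: List[int]) -> List[str]:
    """Return VCF-style phased genotype strings such as '0|1' per site."""
    if len(hap1) != len(hap2):
        raise ValueError("haplotypes must cover the same sites")
    return [f"{a}|{b}" for a, b in zip(hap1, hap2)]


# Toy diplotype over four sites (made-up values, not taken from the data set).
hap_a = [0, 1, 1, 0]
hap_b = [1, 1, 0, 0]
unphased = genotypes_from_diplotype(hap_a, hap_b)
phased = phased_genotypes(hap_a, hap_b)
```

The genotype alone records only how many alternative alleles are present at a site, while the diplotype also records which alleles lie together on the same haplotype. Reads that span several sites can carry that second kind of information.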

Files

Files (523.1 MB)

Checksum                                Size
md5:df75832ef1a52be1a55f80717dd60ec4    103.3 MB
md5:3fa35a2f37e63285cbadb1a315adae7a    1.4 MB
md5:5c434ba655f6a9fbfb49500af6804ca7    52.4 MB
md5:212545e8271dd2beab1f7a5ffde0ea38    1.5 MB
md5:8bbb441e7bd80d7dc282ca18e0550920    83.5 MB
md5:e446d6bbec3e1545fedadfd1868dcf94    1.5 MB
md5:9083ba35de7b8101f9425f39d62aae50    145.6 MB
md5:2111c74504346ebdbfda1727d70b5769    1.6 MB
md5:9213f47a71a3c8f3e6a16046af5f0901    130.6 MB
md5:2ce263fd7147539e057f9cabc1e615ca    1.5 MB
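
The MD5 checksums listed above can be used to confirm that a download completed without corruption. The following is a minimal sketch using Python's standard `hashlib` module; the helper names `md5_of_file` and `matches_checksum` and the file name in the usage comment are hypothetical, since the record's file names are not shown in the listing.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the MD5 hex digest of a file, reading it in chunks.

    Chunked reading keeps memory use low for files of 100 MB or more.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path: str, expected: str) -> bool:
    """Compare a file's MD5 digest with a listed checksum.

    Accepts the checksum with or without the 'md5:' prefix used in the
    file listing, and ignores letter case.
    """
    expected_hex = expected.strip().lower()
    if expected_hex.startswith("md5:"):
        expected_hex = expected_hex[len("md5:"):]
    return md5_of_file(path) == expected_hex


# Usage (hypothetical local file name):
# matches_checksum("downloaded_file", "md5:df75832ef1a52be1a55f80717dd60ec4")
```

A mismatch usually means the download was truncated or the checksum belongs to a different file in the list, so it is worth checking the file size against the listing before downloading again.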