Haplotype-aware diplotyping from noisy long reads
Authors/Creators
- 1. Max Planck Institute for Informatics, Saarbruecken, Germany; Center for Bioinformatics, Saarland University, Saarbruecken, Germany
- 2. UC Santa Cruz Genomics Institute, University of California Santa Cruz, USA
Description
SNP calls for individual NA12878 produced by MarginPhase and WhatsHap on PacBio and Oxford Nanopore data.
Paper Abstract: Current genotyping approaches for single nucleotide variations rely on short, accurate reads from second generation sequencing devices. Presently, third generation sequencing platforms are rapidly becoming more widespread, yet approaches for leveraging their long but error-prone reads for genotyping are lacking.
Here, we introduce a novel statistical framework for the joint inference of haplotypes and genotypes from noisy long reads, which we term diplotyping. Our technique takes full advantage of linkage information provided by long reads. We validate hundreds of thousands of candidate variants that have not yet been included in the high-confidence reference set (NA12878) of the Genome-in-a-Bottle effort.
Files
(523.1 MB)
| Checksum | Size |
|---|---|
| md5:df75832ef1a52be1a55f80717dd60ec4 | 103.3 MB |
| md5:3fa35a2f37e63285cbadb1a315adae7a | 1.4 MB |
| md5:5c434ba655f6a9fbfb49500af6804ca7 | 52.4 MB |
| md5:212545e8271dd2beab1f7a5ffde0ea38 | 1.5 MB |
| md5:8bbb441e7bd80d7dc282ca18e0550920 | 83.5 MB |
| md5:e446d6bbec3e1545fedadfd1868dcf94 | 1.5 MB |
| md5:9083ba35de7b8101f9425f39d62aae50 | 145.6 MB |
| md5:2111c74504346ebdbfda1727d70b5769 | 1.6 MB |
| md5:9213f47a71a3c8f3e6a16046af5f0901 | 130.6 MB |
| md5:2ce263fd7147539e057f9cabc1e615ca | 1.5 MB |
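The MD5 checksums listed for the files can be used to confirm that a download completed intact. The following is a minimal sketch using only the Python standard library; the function names `md5_of_file` and `matches_listed_checksum` are illustrative and not part of this record or of MarginPhase or WhatsHap.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so that large files are never held fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path, listed):
    """Compare a file against a checksum written as 'md5:<hex>' or '<hex>'."""
    expected = listed.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

For example, `matches_listed_checksum("downloaded_file", "md5:df75832ef1a52be1a55f80717dd60ec4")` would return `True` only if the local file matches the first checksum in the list; `downloaded_file` is a placeholder, since the file names are not shown here.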