Phasing Structural Variants with Long Reads and Phased SNP Data
Description
We built this tool to phase structural variants (SVs) using long-read data together with population-phased SNPs. It takes three inputs: long-read DNA data already mapped to the genome; a VCF of SV calls (produced from the long-read data with existing pipelines, or from other sources); and a VCF of SNP data for the same individuals that has already been phased (likely with population-based phasing methods). The output is a VCF of phased SVs for one individual, phased so as to be consistent with the phased SNPs. Note that the output VCF includes only heterozygous SVs that were successfully phased. An SV can fail to phase for many reasons, ranging from the tool used to re-call the SVs (currently Sniffles) to incorrectly genotyped SVs to issues with the SNP phasing.
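The general idea of read-backed phasing against phased SNPs can be illustrated with a small sketch. This is not the tool's actual implementation: the function names (`assign_read_haplotype`, `phase_sv`), the thresholds (`min_margin`, `min_reads`, `min_fraction`), and the toy data are all hypothetical, and a simple majority vote is assumed. Each read is first assigned to a haplotype from the phased SNP alleles it carries, and the SV is then placed on the haplotype that its supporting reads agree on.

```python
from collections import Counter


def assign_read_haplotype(read_alleles, phased_snps, min_margin=1):
    """Return 1 or 2 for the haplotype a read supports, or None if ambiguous.

    read_alleles: {snp_position: allele observed on the read}
    phased_snps:  {snp_position: (allele_on_hap1, allele_on_hap2)}
    (Hypothetical data layout, chosen for illustration only.)
    """
    votes = Counter()
    for pos, allele in read_alleles.items():
        if pos not in phased_snps:
            continue
        hap1_allele, hap2_allele = phased_snps[pos]
        if allele == hap1_allele:
            votes[1] += 1
        elif allele == hap2_allele:
            votes[2] += 1
    # Reads with no informative SNPs, or tied votes, stay unassigned.
    if abs(votes[1] - votes[2]) < min_margin:
        return None
    return 1 if votes[1] > votes[2] else 2


def phase_sv(sv_reads, ref_reads, read_haps, min_reads=2, min_fraction=0.8):
    """Return '1|0', '0|1', or None if the SV cannot be phased confidently.

    sv_reads:  names of reads supporting the SV allele
    ref_reads: names of reads supporting the reference allele at the SV
    read_haps: {read_name: 1, 2, or None}
    """
    votes = Counter()
    for name in sv_reads:
        hap = read_haps.get(name)
        if hap is not None:
            votes[hap] += 1
    for name in ref_reads:
        hap = read_haps.get(name)
        if hap is not None:
            # A reference read on one haplotype implies the SV is on the other.
            votes[3 - hap] += 1
    total = votes[1] + votes[2]
    if total < min_reads:
        return None
    best = 1 if votes[1] >= votes[2] else 2
    if votes[best] / total < min_fraction:
        return None
    return "1|0" if best == 1 else "0|1"


# Toy example: three phased heterozygous SNPs and five reads.
phased_snps = {100: ("A", "G"), 250: ("C", "T"), 400: ("G", "A")}
reads = {
    "r1": {100: "A", 250: "C"},
    "r2": {250: "C", 400: "G"},
    "r3": {100: "G", 250: "T"},
    "r4": {250: "T", 400: "A"},
    "r5": {100: "A", 250: "T"},  # conflicting alleles, left unassigned
}
read_haps = {name: assign_read_haplotype(a, phased_snps) for name, a in reads.items()}
genotype = phase_sv(["r1", "r2", "r5"], ["r3", "r4"], read_haps)
print(read_haps)
print(genotype)
```

In this sketch an SV whose reads disagree, or that has too few informative reads, returns `None`, which mirrors the description's point that unphased SVs are left out of the output rather than guessed.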
Files
| Name | Size | MD5 |
|---|---|---|
| PhaseLongRead-main.zip | 9.3 MB | aa5e157ce5b2d1f5fd0555adeac71414 |
Additional details
Funding
- Aligning Science Across Parkinson's
- Parkinson5D: deconstructing proximal disease mechanisms across cells, space, and progression ASAP-000301
Software
- Repository URL
- https://github.com/seanken/PhaseLongRead
- Programming language
- WDL, Python, Java