Published February 6, 2026 | Version v1
Software Open

Phasing Structural Variants with Long Reads and Phased SNP Data

  • 1. Broad Institute

Description

This tool phases structural variants (SVs) using long-read data and population-phased SNPs. It takes three inputs: long-read DNA data (already mapped to the genome), a VCF with SV calls (produced from the long-read data with existing pipelines, or from other sources), and a VCF with SNP data for the same individuals that has already been phased (likely with population-based phasing methods). The output is a VCF of phased SVs for one individual, phased so as to be consistent with the phased SNPs. Note that the output VCF includes only heterozygous SVs that were successfully phased. An SV can fail to phase for many reasons, ranging from the tool used to recall the SVs (currently Sniffles) to incorrectly genotyped SVs to issues with SNP phasing.
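The general idea of phasing an SV so that it agrees with already-phased SNPs can be illustrated with a small Python sketch. This is not the tool's actual implementation, which is not described here. It is a minimal illustration that assumes each SV-supporting read has been reduced to the bases it shows at heterozygous SNP positions, and that a simple majority vote decides the haplotype. The function names (`read_haplotype`, `phase_sv`) and the thresholds (`min_reads`, `min_fraction`) are hypothetical and chosen only for this example.

```python
from collections import Counter


def read_haplotype(read_alleles, phased_snps):
    """Assign one read to haplotype 0 or 1 by majority vote over phased SNPs.

    read_alleles: dict mapping SNP position -> base observed in the read.
    phased_snps:  dict mapping SNP position -> (haplotype-0 allele, haplotype-1 allele).
    Returns 0 or 1, or None if the read is uninformative or tied.
    """
    votes = Counter()
    for pos, base in read_alleles.items():
        alleles = phased_snps.get(pos)
        if alleles is None:
            continue  # position is not a phased SNP
        hap0, hap1 = alleles
        if hap0 == hap1:
            continue  # homozygous site carries no phase information
        if base == hap0:
            votes[0] += 1
        elif base == hap1:
            votes[1] += 1
    if votes[0] == votes[1]:
        return None
    return 0 if votes[0] > votes[1] else 1


def phase_sv(supporting_reads, phased_snps, min_reads=2, min_fraction=0.8):
    """Phase a heterozygous SV from the reads that support its alternate allele.

    Returns "1|0" if the SV sits on haplotype 0, "0|1" if on haplotype 1,
    or None if the evidence is too thin or too mixed (hypothetical thresholds).
    """
    counts = Counter()
    for read_alleles in supporting_reads:
        hap = read_haplotype(read_alleles, phased_snps)
        if hap is not None:
            counts[hap] += 1
    total = counts[0] + counts[1]
    if total < min_reads:
        return None
    best = 0 if counts[0] >= counts[1] else 1
    if counts[best] / total < min_fraction:
        return None
    return "1|0" if best == 0 else "0|1"


# Toy data: three phased heterozygous SNPs and three reads supporting one SV.
phased_snps = {100: ("A", "G"), 250: ("C", "T"), 400: ("G", "A")}
sv_reads = [
    {100: "A", 250: "C"},
    {250: "C", 400: "G"},
    {100: "A", 400: "G"},
]
print(phase_sv(sv_reads, phased_snps))  # 1|0
```

In this sketch, an SV whose supporting reads split across both haplotypes, or that has too few informative reads, returns `None`. That mirrors the behavior described above, where SVs that cannot be phased are left out of the output VCF.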

Files

PhaseLongRead-main.zip

Files (9.3 MB)

Name Size
PhaseLongRead-main.zip (md5:aa5e157ce5b2d1f5fd0555adeac71414) 9.3 MB

Additional details

Funding

Aligning Science Across Parkinson's
Parkinson5D: deconstructing proximal disease mechanisms across cells, space, and progression ASAP-000301

Software

Repository URL
https://github.com/seanken/PhaseLongRead
Programming language
WDL, Python, Java