Published December 16, 2024 | Version v1
Software Open

Segpy: a streamlined, user-friendly pipeline for variant segregation analysis

Description

Segpy is a streamlined, user-friendly pipeline designed for variant segregation analysis, allowing investigators to compute allelic counts at variant sites across study subjects. The pipeline can be applied to both pedigree-based family cohorts — those involving single or multi-family trios, quartets, or extended families — and population-based case-control cohorts. Considering the scale of modern datasets and the computational power required for their analysis, the Segpy pipeline was designed for seamless integration with the users’ high-performance computing (HPC) clusters.
 
As input, users must provide a single VCF file describing the genetic variants of all study subjects and a pedigree file describing the familial relationships among those individuals (if applicable) and their disease status. As output, Segpy computes variant carrier counts for affected and unaffected individuals, both within and outside of families, by categorizing wild-type individuals, heterozygous carriers, and homozygous carriers at specific loci. These counts are organized into a comprehensive data frame, with each row representing a single variant and labeled with the Sample IDs of the corresponding carriers to facilitate downstream analysis.
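The counting described above can be illustrated with a minimal Python sketch for a single variant. This is not Segpy's actual implementation: the function names `classify_genotype` and `count_carriers`, the sample IDs, and the in-memory dictionaries standing in for the VCF genotypes and the pedigree disease status are all hypothetical, and family membership is ignored for brevity.

```python
from collections import Counter


def classify_genotype(gt):
    """Classify a VCF GT string (e.g. '0/1', '1|1') into a genotype class.

    Hypothetical helper: multi-allelic calls such as '1/2' are treated as
    heterozygous, and any call with a missing allele ('.') is 'missing'.
    """
    alleles = gt.replace("|", "/").split("/")
    if any(a == "." for a in alleles):
        return "missing"
    if all(a == "0" for a in alleles):
        return "wt"        # wild-type: reference alleles only
    if len(set(alleles)) == 1:
        return "hom_alt"   # homozygous carrier of an alternate allele
    return "het"           # heterozygous carrier


def count_carriers(genotypes, affected):
    """Count genotype classes by disease status for one variant.

    genotypes: dict mapping sample ID -> GT string (assumed layout)
    affected:  dict mapping sample ID -> True if affected (assumed layout)
    Returns (counts, carrier_ids).
    """
    counts = Counter()
    carriers = []
    for sample, gt in genotypes.items():
        gt_class = classify_genotype(gt)
        status = "affected" if affected[sample] else "unaffected"
        counts[f"{status}_{gt_class}"] += 1
        if gt_class in ("het", "hom_alt"):
            carriers.append(sample)
    return dict(counts), sorted(carriers)


# Toy data for one variant site (made up for illustration).
genotypes = {"S1": "0/1", "S2": "0/0", "S3": "1/1", "S4": "0|1", "S5": "./."}
affected = {"S1": True, "S2": False, "S3": True, "S4": False, "S5": False}

counts, carriers = count_carriers(genotypes, affected)
print(counts)
print(carriers)
```

Repeating this for every variant and stacking the results would give one row per variant, in the spirit of the data frame described above.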
 
 

Files

segpy.pip.zip (1.1 GB)
md5:db00ae447179392e9cbca0a86c68adbc

Additional details

Software

Repository URL
https://github.com/neurobioinfo/segpy
Programming language
HTML, JavaScript, Python, Shell, CSS