Published August 12, 2020 | Version v0.1.1

Haplotype counting script

Authors/Creators

  • 1. VIB-UGent Center for Plant Systems Biology, Gent, Belgium

Description

A simple script for counting the number of haplotypes formed by pre-selected SNPs in long-read sequencing data. Haplotypes are counted per 1 kb in individual long sequencing reads aligned to a reference, based on the positions and the reference and alternative alleles of reliable SNPs called on short-read sequencing data.
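The counting idea described above can be illustrated with a minimal Python sketch. It is not the deposited script: it assumes one plausible reading of the description, namely that each read is reduced to the string of reference (0) or alternative (1) alleles it carries at the selected SNPs inside a 1 kb window, and that the distinct strings are then counted per window. The function name `haplotypes_per_window`, the input layout, and the rule of skipping reads with a missing or unexpected base are all assumptions made for illustration.

```python
from collections import defaultdict

WINDOW = 1000  # 1 kb window size, as in the description


def haplotypes_per_window(snps, reads, window=WINDOW):
    """Count distinct SNP haplotypes per window (hypothetical helper).

    snps:  {ref_pos (1-based): (ref_allele, alt_allele)}
    reads: iterable of (read_name, {ref_pos: observed_base})
    Returns {window_start: number of distinct haplotypes}.
    """
    # Group the selected SNP positions by the window they fall in.
    snps_by_window = defaultdict(list)
    for pos in sorted(snps):
        snps_by_window[(pos - 1) // window * window + 1].append(pos)

    seen = defaultdict(set)
    for _name, bases in reads:
        for start, positions in snps_by_window.items():
            hap = []
            for pos in positions:
                ref, alt = snps[pos]
                base = bases.get(pos)
                if base == ref:
                    hap.append("0")
                elif base == alt:
                    hap.append("1")
                else:
                    # Assumption: a read that misses a SNP or shows a third
                    # base does not contribute a haplotype to this window.
                    hap = None
                    break
            if hap:
                seen[start].add("".join(hap))
    return {start: len(haps) for start, haps in sorted(seen.items())}
```

In practice the per-read bases would come from the sam2tsv output and the SNP alleles from the .table files; both parsing steps are left out here because their exact column layouts are not stated in this record.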

Required input files: 

- processed long sequencing reads aligned to the reference (bam files), split per chromosome/contig into individual files - the file names must correspond to those of the respective vcf-derived .table files

- .table files containing information about reliable biallelic SNPs, obtained from the original vcf files and split per chromosome/contig into individual files - the file names must correspond to those of the respective bam files

- a file with the name of each chromosome/contig - the names have to be the same as those used for the bam and vcf-derived .table files
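Because the three inputs are tied together only by their names, it can help to check the naming before running the script. The following Python sketch shows one way to do that. It assumes a names file with one chromosome/contig name per line and files named `<name>.bam` and `<name>.table` in a single directory; these conventions and the function name `check_inputs` are hypothetical and not taken from the deposited script.

```python
from pathlib import Path


def check_inputs(names_file, data_dir, bam_ext=".bam", table_ext=".table"):
    """Return (name, missing_file) pairs for every expected file that is absent."""
    data_dir = Path(data_dir)
    missing = []
    for line in Path(names_file).read_text().splitlines():
        name = line.strip()
        if not name:
            continue  # ignore blank lines
        # Assumed naming convention: <name>.bam and <name>.table
        for ext in (bam_ext, table_ext):
            candidate = data_dir / f"{name}{ext}"
            if not candidate.is_file():
                missing.append((name, candidate.name))
    return missing
```

An empty list means that every listed chromosome/contig has both a bam file and a .table file under the assumed naming scheme.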

Required software: 
- Java version 1.8 or higher

- sam2tsv.jar from jvarkit (https://github.com/lindenb/jvarkit), downloaded and compiled

 

Notes

This work was supported by an Erwin Schrödinger fellowship from the Austrian Science Fund (FWF) (project number J3692-B22).

Files

md5:b998af3ffc69670783b53d411357b83a (11.1 kB)