Published August 11, 2024 | Version v1
Dataset
Open
Comprehensive Structural Variant Benchmark Dataset: 1100 VCF files from long-read sequencing of 10 NCBI individuals
Description
We first collected data for 10 NCBI individuals: the HG002 family pedigree (HG002 [son], HG003 [father], HG004 [mother]), the HG005 family pedigree (HG005 [son], HG006 [father], HG007 [mother]), and the subjects NA12878, HG00096, HG00512, and CHM13. We then constructed pipelines from data generated on the PacBio (CLR: Continuous Long Read; CCS: Circular Consensus Sequencing) and Nanopore (ONT) platforms, combined with 5 aligners and 10 callers, leaving most parameters at their default values. After excluding 6 invalid pipelines (pbmm2-NanoVar, lra-Picky, lra-delly, lra-NanoVar, lra-NanoSV, lra-pbsv), we obtained 1100 VCF files.
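The pipeline count implied by the description can be checked with a short sketch. Only the aligners pbmm2 and lra and the callers NanoVar, Picky, delly, NanoSV, and pbsv are named above; the remaining names (`aligner_3`, `caller_6`, and so on) are placeholders, not real tool names. The per-pipeline file count at the end assumes that every valid pipeline produced the same number of VCF files, which the description does not state.

```python
from itertools import product

# Aligners: pbmm2 and lra are named in the description; the rest are placeholders.
aligners = ["pbmm2", "lra", "aligner_3", "aligner_4", "aligner_5"]

# Callers: the first five are named in the description; the rest are placeholders.
callers = ["NanoVar", "Picky", "delly", "NanoSV", "pbsv",
           "caller_6", "caller_7", "caller_8", "caller_9", "caller_10"]

# The six aligner-caller combinations listed as invalid.
invalid = {
    ("pbmm2", "NanoVar"),
    ("lra", "Picky"),
    ("lra", "delly"),
    ("lra", "NanoVar"),
    ("lra", "NanoSV"),
    ("lra", "pbsv"),
}

# Every aligner-caller pair except the invalid ones.
valid_pipelines = [pair for pair in product(aligners, callers) if pair not in invalid]
print(len(valid_pipelines))  # 5 * 10 - 6 = 44

# Assumption: VCF files are spread evenly over the valid pipelines.
files_per_pipeline, remainder = divmod(1100, len(valid_pipelines))
print(files_per_pipeline, remainder)  # 25 0
```

Under that assumption, the 1100 files would correspond to 25 input datasets per pipeline across the 10 individuals and the CLR, CCS, and ONT data types.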
Files
| Name | Size | MD5 |
|---|---|---|
| 1100VCF.zip | 15.2 GB | 9fa003148eab0b7e8770cd02b7b03945 |
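After downloading, the archive can be checked against the MD5 checksum listed for 1100VCF.zip. The following is a minimal sketch using only the Python standard library. The helper names `md5_of_file` and `list_vcf_members` are illustrative, and the internal layout of the archive is not documented here, so the `.vcf` / `.vcf.gz` suffix filter is an assumption.

```python
import hashlib
import zipfile

# MD5 checksum as listed for 1100VCF.zip on this page.
EXPECTED_MD5 = "9fa003148eab0b7e8770cd02b7b03945"


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def list_vcf_members(zip_path):
    """Return archive member names ending in .vcf or .vcf.gz (assumed suffixes)."""
    with zipfile.ZipFile(zip_path) as archive:
        return [name for name in archive.namelist()
                if name.endswith((".vcf", ".vcf.gz"))]
```

A typical check would compare `md5_of_file("1100VCF.zip")` with `EXPECTED_MD5` and then confirm that `len(list_vcf_members("1100VCF.zip"))` matches the 1100 files described above, assuming the archive has been saved under that name in the working directory.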