Comprehensive Structural Variant Benchmark Dataset: 1100 VCF files from long-read sequencing of 10 NCBI individuals

Peng, Wenjie

doi:10.5281/zenodo.13293672

Published August 11, 2024 | Version v1

Dataset Open

Peng, Wenjie (Contact person), Sun Yat-sen University

We initially collected data for 10 NCBI individuals: the HG002 family pedigree (HG002 [son], HG003 [father], HG004 [mother]), the HG005 family pedigree (HG005 [son], HG006 [father], HG007 [mother]), and the NA12878, HG00096, HG00512, and CHM13 subjects. We then constructed pipelines from the PacBio (CLR: Continuous Long Read; CCS: Circular Consensus Sequencing) and Nanopore (ONT) platforms, 5 aligners, and 10 callers, with most parameters left at their default values. After excluding 6 invalid pipelines (pbmm2-NanoVar, lra-Picky, lra-delly, lra-NanoVar, lra-NanoSV, lra-pbsv), we obtained 1100 VCF files.
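The pipeline counts in the description can be checked with a short sketch. The record names only two aligners (pbmm2, lra) and five callers (through the invalid pairs), so the remaining labels below are hypothetical placeholders. The even split of the 1100 files across valid pipelines is likewise an assumption, since the record does not say how the files are distributed across individuals and platforms.

```python
from itertools import product

# Only pbmm2 and lra are named in the record; the other aligner labels
# are hypothetical placeholders used to reach the stated total of 5.
aligners = ["pbmm2", "lra", "aligner_3", "aligner_4", "aligner_5"]

# Only the first five callers are named in the record (via the invalid
# pairs); the rest are hypothetical placeholders to reach 10.
callers = ["NanoVar", "Picky", "delly", "NanoSV", "pbsv",
           "caller_6", "caller_7", "caller_8", "caller_9", "caller_10"]

# The 6 aligner-caller pairs the record lists as invalid.
invalid = {
    ("pbmm2", "NanoVar"),
    ("lra", "Picky"),
    ("lra", "delly"),
    ("lra", "NanoVar"),
    ("lra", "NanoSV"),
    ("lra", "pbsv"),
}

valid_pipelines = [pair for pair in product(aligners, callers)
                   if pair not in invalid]
n_valid = len(valid_pipelines)  # 5 * 10 - 6

# Assumption: every valid pipeline was run on the same number of inputs.
total_vcfs = 1100
inputs_per_pipeline, remainder = divmod(total_vcfs, n_valid)

print(n_valid, inputs_per_pipeline, remainder)  # prints "44 25 0"
```

Under that assumption, 44 valid pipelines applied to 25 input datasets each would account for exactly 1100 VCF files.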

Files

Name	Size
1100VCF.zip (md5:9fa003148eab0b7e8770cd02b7b03945)	15.2 GB
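Since the archive is large, it may be worth verifying the download against the listed MD5 checksum before unpacking. The sketch below is one way to do that with the Python standard library. The helper name `md5_of_file` and the local path `1100VCF.zip` are assumptions made for illustration and are not part of the record.

```python
import hashlib
from pathlib import Path

# Checksum as listed in the file table of the record.
EXPECTED_MD5 = "9fa003148eab0b7e8770cd02b7b03945"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in 1 MiB chunks
    so a multi-gigabyte archive is never loaded into memory at once."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    # Assumed download location; adjust to wherever the archive was saved.
    archive = Path("1100VCF.zip")
    if archive.exists():
        ok = md5_of_file(archive) == EXPECTED_MD5
        print("checksum OK" if ok else "checksum MISMATCH")
    else:
        print(f"{archive} not found")
```

A mismatch usually points to an incomplete or corrupted download, in which case fetching the archive again is the simplest remedy.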

	All versions	This version
Views	167	167
Downloads	38	38
Data volume	594.1 GB	594.1 GB

Resource type: Dataset
Publisher: Zenodo

License: Creative Commons Attribution 4.0 International

The Creative Commons Attribution license allows re-distribution and re-use of a licensed work on the condition that the creator is appropriately credited.

Technical metadata

Created: August 11, 2024
Modified: August 5, 2025
