Published March 23, 2021 | Version v2
Dataset Open

Semi-simulated dataset of indel calling tool evaluation

Authors/Creators

  • 1. University of Turku

Description

This record contains the semi-simulated datasets used in Ning Wang et al.

100bp_5X_500_Venter_read1.fq.gz & 100bp_5X_500_Venter_read2.fq.gz: 5X coverage, 100bp read length semi-simulated paired-end sequencing FASTQ files.

100bp_30X_500_Venter_read1.fq.gz & 100bp_30X_500_Venter_read2.fq.gz: 30X coverage, 100bp read length semi-simulated paired-end sequencing FASTQ files.

250bp_30X_500_Venter_read1.fq.gz & 250bp_30X_500_Venter_read2.fq.gz: 30X coverage, 250bp read length semi-simulated paired-end sequencing FASTQ files.

100bp_60X_500_Venter_read1.fq.gz & 100bp_60X_500_Venter_read2.fq.gz: 60X coverage, 100bp read length semi-simulated paired-end sequencing FASTQ files.
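The coverage figures quoted for the FASTQ pairs above can be sanity-checked after download. The following is a minimal sketch, assuming standard four-line FASTQ records; the function names and the `genome_length` parameter are hypothetical and are not part of the dataset. Coverage is estimated as the total number of sequenced bases divided by the length of the simulated genome.

```python
import gzip


def count_reads_and_bases(fastq_gz_path):
    """Count reads and bases in a gzipped FASTQ file.

    Assumes each record spans exactly four lines (header, sequence,
    separator, qualities), so the sequence is every line with index % 4 == 1.
    """
    reads = 0
    bases = 0
    with gzip.open(fastq_gz_path, "rt") as handle:
        for index, line in enumerate(handle):
            if index % 4 == 1:
                reads += 1
                bases += len(line.strip())
    return reads, bases


def estimated_coverage(fastq_gz_paths, genome_length):
    """Estimate coverage as total bases across all files / genome length.

    genome_length is supplied by the caller (for example, the length of
    one gap-free haplotype); it is an assumption of this sketch.
    """
    total_bases = sum(count_reads_and_bases(p)[1] for p in fastq_gz_paths)
    return total_bases / genome_length
```

Passing both files of a pair (read1 and read2) together with the length of the simulated reference gives a figure that can be compared with the stated 5X, 30X or 60X.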

Haplotype_1_no_gap.fa & Haplotype_2_no_gap.fa: the semi-simulated diploid human genome (hg19 chromosome 1 and chromosome 2), including the HuRef indels used in Ning Wang et al. The gaps (Ns) in the genome were removed so that ART (a FASTQ read simulator) could be used to generate the simulated FASTQ files.
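The gap removal described above can be illustrated as follows. This is only a sketch of the general idea, not the authors' actual procedure: the function name `strip_gaps` and the `width` parameter are hypothetical, and it simply drops every N (upper or lower case) from each FASTA record.

```python
def strip_gaps(fasta_lines, width=60):
    """Yield FASTA lines with all N/n characters removed from the sequences.

    fasta_lines: iterable of text lines from a FASTA file.
    width: line width used when re-wrapping the gap-free sequence.
    Each record is held in memory while it is processed.
    """
    def flush(header, parts):
        # Join the record, drop gap characters, then re-wrap the sequence.
        sequence = "".join(parts).replace("N", "").replace("n", "")
        yield header
        for start in range(0, len(sequence), width):
            yield sequence[start:start + width]

    header = None
    parts = []
    for line in fasta_lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield from flush(header, parts)
            header, parts = line, []
        else:
            parts.append(line)
    if header is not None:
        yield from flush(header, parts)
```

Note that removing Ns shifts every downstream position relative to the original hg19 coordinates, so it is worth checking which coordinate system a given variant position refers to before comparing call sets.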

chr1_chr2_variants_truthset.txt: type, position, size and genotype of the HuRef indels used in Ning Wang et al.

Files (57.1 GB)

MD5 checksum                            Size
md5:275012136b5671574727b266502be62a    6.8 GB
md5:b764e6f0081f3d2fd9d2dc1166ac00c2    7.0 GB
md5:b0d8f4e93a298481e41048285eda4f7f    1.1 GB
md5:ba14dd8f66d2fb533671f900acfde831    1.2 GB
md5:3708acd9f5bab98daa474156d0eb1772    13.7 GB
md5:d386d54a74d1cb705b4d3e7faf8cdd7d    13.9 GB
md5:20d5b64d197453c618712cd1e6f1407b    5.9 GB
md5:8e424c62d817a18c1824749ffae256e2    6.7 GB
md5:77336c29dc59c1ed706c6ac78c10e39b    4.7 MB
md5:873c3ba0e4a651471934111c309e0ff0    463.5 MB
md5:2f94e1dbb5f63f58df3c741e267d0c1d    463.5 MB
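
Because the files are large, it may be useful to confirm that a download is intact by comparing it with its listed md5 value. The following is a minimal sketch using Python's standard `hashlib` module; the function names `md5_of_file` and `matches_listing` are hypothetical helpers, and the caller is responsible for pairing each downloaded file with its own listed checksum.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks
    so that multi-gigabyte files never have to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, listed_value):
    """Compare a file with a listed value written as 'md5:<hex digest>'.

    The 'md5:' prefix is optional; comparison is case-insensitive.
    """
    expected = listed_value.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

A call such as `matches_listing(downloaded_path, listed_value)` returns `True` only when the digests agree; a `False` result suggests a truncated or corrupted download.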