Dataset Open Access

SNP and indel discovery and genotyping in next-generation sequencing data

Gilks, William

Code, logs and data for the discovery and genotyping of SNPs and indels in the D. melanogaster genome using GATK HaplotypeCaller. Code is in the zipped folder named code.zip. Run logs for this code are in the zipped folder named logs.zip. The unfiltered VCF genotypes file is named lhm_rg_HC_2015-09-15.vcf.gz. The filtered VCF genotypes file is named f1.lhm_rg_HC_raw.vcf.gz. The VCF submitted to NCBI dbSNP (filtered, with indels >50 bp and variants with null alternate alleles both removed) is named dbSNP.lhm_rg_HC_raw.vcf.gz. The folder local_reference.zip contains the reference assembly files against which genotypes were called, and includes the code used to format the data prior to use. Also included are genotype data from the two in-house reference line samples sequenced (BDGP6+ISO1 mito/dm6, Bloomington Drosophila Stock Center no. 2057).
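The two dbSNP exclusions described above (indels >50 bp and variants with null alternate alleles) can be illustrated with a minimal Python sketch. This is not the code shipped in code.zip. The function names keep_for_dbsnp and filter_vcf are hypothetical, and two definitions are assumptions: a "null" alternate allele is taken to be "." or "*", and indel size is taken to be the length difference between REF and ALT.

```python
import gzip

MAX_INDEL_BP = 50  # threshold stated in the record description


def keep_for_dbsnp(vcf_line, max_indel_bp=MAX_INDEL_BP):
    """Return True if a VCF data line passes both exclusions (hypothetical helper)."""
    fields = vcf_line.rstrip("\n").split("\t")
    ref = fields[3]               # REF column
    alts = fields[4].split(",")   # ALT column, possibly multi-allelic
    # Assumption: a "null" alternate allele is '.' or the spanning-deletion '*'.
    if any(alt in (".", "*") for alt in alts):
        return False
    # Assumption: indel size is the REF/ALT length difference in bp.
    if any(abs(len(alt) - len(ref)) > max_indel_bp for alt in alts):
        return False
    return True


def filter_vcf(in_path, out_path):
    """Copy header lines and passing records from one gzipped VCF to another."""
    with gzip.open(in_path, "rt") as src, gzip.open(out_path, "wt") as dst:
        for line in src:
            if line.startswith("#") or keep_for_dbsnp(line):
                dst.write(line)
```

Under these assumptions, a multi-allelic site is dropped if any one of its alternate alleles fails a test; the actual submission may have handled such sites differently.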

The samples are 220 Sussex-LHM hemiclones and 2 RG. The first run did not include chromosome 4 or the mitochondrial genome, so these were genotyped separately and then added to the rest of the results.

The link for the NCBI dbSNP record is currently https://www.ncbi.nlm.nih.gov/projects/SNP/snp_viewBatch.cgi?sbid=1062461 and the submitter handle is MORROW_EBE_SUSSEX.

At the time of writing, the NCBI D. melanogaster build is still being updated, so ss identifiers are available but rs identifiers are not.

The pre-print manuscript for these data is available on bioRxiv: "Whole genome resequencing of a laboratory-adapted Drosophila melanogaster population sample" http://biorxiv.org/content/early/2016/10/17/081554 doi: http://dx.doi.org/10.1101/081554

Files (7.8 GB)
Name                          Size       Checksum
code.zip                      6.8 kB     md5:48c781b4ad2b7d57d88c2a83dde16c03
dbSNP.lhm_rg_HC_raw.vcf.gz    2.3 GB     md5:67108750de5d38acea322ae503f2a982
f1.lhm_rg_HC_raw.vcf.gz       2.6 GB     md5:04b9ceb95bc326da5d6b622e5c1f19a0
lhm_rg_HC_2015-09-15.vcf.gz   2.7 GB     md5:6893aaf4d0a22b03f086634f5122c92d
local_reference.zip           45.2 MB    md5:b48ca4ce7d1dbb6b49cf96cbd15eb756
logs.zip                      211.6 kB   md5:c050e67c73985bcb797b9268c8275cac
RG.vcf.gz                     757.0 kB   md5:ec69be1f6531d1e197642f55dc814de5
summary_data.zip              64.0 MB    md5:d6173e2a6aab873e93910427e63840f8
VarIDs_lhm_rg_HC.txt          18.0 MB    md5:448d08dcaf6f4607a2df8ee1adfa09fd
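
The md5 values listed with the files can be used to check that a download is complete and uncorrupted. The following is a minimal Python sketch using only the standard library. The helper names md5_of_file and verify are hypothetical, the files are assumed to sit in the current working directory, and only the three smallest files are listed in EXPECTED for brevity.

```python
import hashlib

# Checksums copied from the file listing of this record (subset shown).
EXPECTED = {
    "code.zip": "48c781b4ad2b7d57d88c2a83dde16c03",
    "logs.zip": "c050e67c73985bcb797b9268c8275cac",
    "RG.vcf.gz": "ec69be1f6531d1e197642f55dc814de5",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the md5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected_md5):
    """Return True if the file's md5 matches the expected hex digest."""
    return md5_of_file(path) == expected_md5.lower()
```

For example, verify("code.zip", EXPECTED["code.zip"]) should return True for an intact download. Reading in chunks keeps memory use low for the multi-gigabyte VCF files.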