NA12878 WES Benchmark dataset
Description
This dataset collects the files of the public UCSC Genome Browser (genome.ucsc.edu) session NA12878 WES Benchmark (GRCh37 genome build) in one place, so that they can be used in other applications or genome browsers such as IGV.
The Galaxy page "Procedure and datasets to cross-reference OMIM genes with the genomic regions of interest", listed among the Shared Data Pages on the usegalaxy.org server, describes a practical procedure and several possible use cases for this dataset. The page is freely accessible to users logged into their usegalaxy.org accounts. Please register if you do not have an account on the usegalaxy.org Galaxy server.
All variant calls in the VCF files of this dataset were decomposed and normalized with vt. The dataset contains:
- Genome in a Bottle (GIAB) version 3.3.2 high-confidence (HC) variant calls and genomic regions for HapMap individual NA12878:
  - GIAB_v3.3.2_NA12878-decomposed-normalized.vcf.gz
  - GIAB_v3.3.2_NA12878-decomposed-normalized.vcf.gz.tbi
  - GIAB_v3.3.2_NA12878_HC_regions.bed
- HapMap individual NA12878 WES variant calls (VCF) and capture regions (BED) from diagnostic laboratories:
  - ARUP whole exome sequencing data (HiSeq 2000), publicly available from the NCBI GeT-RM Browser:
    - converted_ARUP_NA12878_Exome-decomposed-normalized.vcf.gz
    - converted_ARUP_NA12878_Exome-decomposed-normalized.vcf.gz.tbi
    - ARUP_SeqCap_EZ_Exome.bed
  - UCSF whole exome sequencing data (HiSeq 2500), publicly available from the NCBI GeT-RM Browser:
    - converted_UCSF_NA12878_WES_Agilent_V4_Custom-decomposed-normalized.vcf.gz
    - converted_UCSF_NA12878_WES_Agilent_V4_Custom-decomposed-normalized.vcf.gz.tbi
    - UCSF_WES_Agilent_V4_Custom.bed
  - Whole exome data (NextSeq 500) sequenced in the CHEO diagnostic laboratory:
    - CHEO_NA12878_WES_S1dataset.vcf.gz
    - CHEO_NA12878_WES_S1dataset.vcf.gz.tbi
    - Agilent_CRE_v2.bed
- Genomic coordinates (BED) of OMIM genes for which a molecular basis of the associated disease is known (as of September 2019):
  - Omim_Genes.bed
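As one illustration of how the BED files listed above might be combined, the sketch below reports which gene intervals overlap at least one region of interest, for example OMIM genes against capture regions. It is a minimal pure-Python sketch under stated assumptions, not the procedure from the Galaxy page: the helper names `parse_bed` and `overlapping_names` are hypothetical, the inline records are made-up toy coordinates rather than content from the dataset files, and the gene name is assumed to sit in the fourth BED column.

```python
from collections import defaultdict
import io


def parse_bed(handle):
    """Yield (chrom, start, end, name) tuples from a BED text stream.

    BED coordinates are 0-based and half-open. The name is "" when the
    fourth column is absent.
    """
    for line in handle:
        line = line.rstrip("\n")
        # Skip blank lines, comments, and browser/track header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        name = fields[3] if len(fields) > 3 else ""
        yield fields[0], int(fields[1]), int(fields[2]), name


def overlapping_names(genes, regions):
    """Return the sorted names of gene intervals that overlap any region."""
    by_chrom = defaultdict(list)
    for chrom, start, end, _ in regions:
        by_chrom[chrom].append((start, end))
    hits = set()
    for chrom, g_start, g_end, name in genes:
        # Half-open intervals overlap when each starts before the other ends.
        if any(g_start < r_end and r_start < g_end
               for r_start, r_end in by_chrom.get(chrom, ())):
            hits.add(name)
    return sorted(hits)


# Toy records (hypothetical coordinates, NOT taken from the dataset files).
TOY_GENES = "1\t100\t200\tGENE_A\n1\t500\t600\tGENE_B\n2\t100\t200\tGENE_C\n"
TOY_REGIONS = "1\t150\t160\n1\t600\t700\n2\t50\t101\n"

genes = list(parse_bed(io.StringIO(TOY_GENES)))
regions = list(parse_bed(io.StringIO(TOY_REGIONS)))
result = overlapping_names(genes, regions)
print(result)
```

With real files, the same functions could be given open file handles instead of the `io.StringIO` objects. Chromosome names must follow the same convention in both files (for example `1` versus `chr1`), and the linear scan here is only suitable for a sketch; a dedicated tool such as bedtools is a better fit for large inputs.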
Files
(432.2 MB)
| MD5 checksum | Size |
|---|---|
| 4427cc9e6411f38d79b57f7bc4a769b4 | 51.2 MB |
| 9f5b4cf0bc7fedf52d43b7ce91cc1bf7 | 4.7 MB |
| abbddd37a52abd99eb21cc358a5107f5 | 208.7 MB |
| 2549bdb7bffd64490eeca80f91b85ad4 | 1.5 MB |
| d58475a0ab622c14ab170eb5401d01b6 | 1.1 MB |
| bd145f43fb1de7aa5a85acebb6f044ee | 144.5 kB |
| 1e5121e446f97957de52a576ed95a7f0 | 3.6 MB |
| e6e58089605a6fc95b005d62e9a5ec1c | 283.6 kB |
| b5447252fb60bdd1ea40b20c74136705 | 139.9 MB |
| 6c0dffc6f46ba5b5b598bab9706c3a70 | 1.6 MB |
| d0c71cf4240e2c5bf111a26c3f741577 | 14.3 MB |
| aa52a98bdcf98dce38bafe9e211b5b86 | 128.7 kB |
| 1de4675ac16d4b498154fa501b037d6e | 5.1 MB |
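The MD5 checksums listed above can be used to confirm that a downloaded file is intact. The following is a minimal sketch using Python's standard `hashlib` module; the function names `md5_of_file` and `matches_checksum` are hypothetical, and the demonstration runs on a temporary file rather than on one of the dataset files.

```python
import hashlib
import os
import tempfile


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, listed):
    """Compare a file against a listed value, with or without 'md5:' prefix."""
    expected = listed.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected


# Self-contained demonstration on a temporary file (not a dataset file).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello world\n")
    demo_path = tmp.name

demo_md5 = hashlib.md5(b"hello world\n").hexdigest()
demo_ok = matches_checksum(demo_path, "md5:" + demo_md5)
demo_bad = matches_checksum(demo_path, "md5:" + "0" * 32)
os.remove(demo_path)
print(demo_ok, demo_bad)
```

For a real download, `matches_checksum` would be called with the local file path and the checksum shown for that file. Reading in chunks keeps memory use low for the larger files in this dataset.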
Additional details
References
- Pranckeviciene E, Potter R, Huang L, Jarinova O. Validation of bcbio-nextgen Pipeline Based on NextSeq500 Exome Sequencing. In 2019 IEEE EMBS International Conference on Biomedical & Health Informatics (BHI) 2019 May 19 (pp. 1-6). IEEE.