There is a newer version of the record available.

Published January 4, 2020 | Version v1
Dataset Open

NA12878 WES Benchmark dataset

  • 1. Vilnius University

Contributors

Project leader:

Project members:

  • 1. CHEO

Description

This dataset bundles the files of the UCSC Genome Browser (genome.ucsc.edu) public session "NA12878 WES Benchmark" (GRCh37 genome build) so that they can be used in other applications or genome browsers such as IGV. All genomic variant calls in all VCF files were decomposed and normalized with vt. This dataset contains:

  1. Genome in a Bottle (GIAB) version 3.3.2 high-confidence (HC) variant calls and genomic regions for HapMap individual NA12878:
    1. GIAB_v3.3.2_NA12878-decomposed-normalized.vcf.gz
    2. GIAB_v3.3.2_NA12878-decomposed-normalized.vcf.gz.tbi
    3. GIAB_v3.3.2_NA12878_HC_regions.bed
  2. HapMap individual NA12878 WES variant calls (VCF) and capture regions (BED) from diagnostic laboratories:
    • ARUP whole exome sequencing data (HiSeq 2000), publicly available from the NCBI GeT-RM Browser
      1. converted_ARUP_NA12878_Exome-decomposed-normalized.vcf.gz
      2. converted_ARUP_NA12878_Exome-decomposed-normalized.vcf.gz.tbi
      3. ARUP_SeqCap_EZ_Exome.bed
    • UCSF whole exome sequencing data (HiSeq 2500), publicly available from the NCBI GeT-RM Browser
      1. converted_UCSF_NA12878_WES_Agilent_V4_Custom-decomposed-normalized.vcf.gz
      2. converted_UCSF_NA12878_WES_Agilent_V4_Custom-decomposed-normalized.vcf.gz.tbi
      3. UCSF_WES_Agilent_V4_Custom.bed
    • Whole exome sequencing data (NextSeq 500) generated in the CHEO diagnostic laboratory
      1. CHEO_NA12878_WES_S1dataset.vcf.gz
      2. CHEO_NA12878_WES_S1dataset.vcf.gz.tbi
      3. Agilent_CRE_v2.bed
  3. Genomic coordinates (BED) of OMIM genes for which the molecular basis of the associated disease is known (as of September 2019):
    • Omim_Genes.bed 
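
One way the files listed above might be combined is to compare a laboratory call set with the GIAB calls inside the overlap of the GIAB high-confidence regions and the laboratory's capture regions. The following is a minimal sketch using only the Python standard library, not the method of the referenced validation study. It assumes that matching variants by exact (chromosome, position, REF, ALT) is acceptable because the VCF files were decomposed and normalized, it ignores genotypes, and it assumes that stripping a leading "chr" is enough to reconcile chromosome names. All function names are hypothetical.

```python
import gzip
from bisect import bisect_right
from collections import defaultdict


def norm_chrom(name):
    """Strip a leading 'chr' so 'chr1' and '1' compare equal (an assumption)."""
    return name[3:] if name.startswith("chr") else name


def read_bed(path):
    """Load a BED file as {chrom: (starts, ends)} with overlapping intervals merged."""
    raw = defaultdict(list)
    with open(path) as handle:
        for line in handle:
            fields = line.split()
            if len(fields) < 3 or not fields[1].isdigit():
                continue  # skip blank lines and track/browser/comment headers
            raw[norm_chrom(fields[0])].append((int(fields[1]), int(fields[2])))
    regions = {}
    for chrom, intervals in raw.items():
        merged = []
        for start, end in sorted(intervals):
            if merged and start <= merged[-1][1]:
                merged[-1][1] = max(merged[-1][1], end)
            else:
                merged.append([start, end])
        regions[chrom] = ([s for s, _ in merged], [e for _, e in merged])
    return regions


def in_regions(regions, chrom, pos):
    """True if 1-based VCF position `pos` lies in a 0-based half-open BED interval."""
    if chrom not in regions:
        return False
    starts, ends = regions[chrom]
    i = bisect_right(starts, pos - 1) - 1
    return i >= 0 and pos - 1 < ends[i]


def read_variants(path):
    """Collect (chrom, pos, ref, alt) keys from a gzip- or bgzip-compressed VCF."""
    keys = set()
    with gzip.open(path, "rt") as handle:
        for line in handle:
            if line.startswith("#") or not line.strip():
                continue
            chrom, pos, _id, ref, alts = line.rstrip("\n").split("\t")[:5]
            for alt in alts.split(","):
                keys.add((norm_chrom(chrom), int(pos), ref, alt))
    return keys


def benchmark(truth_vcf, call_vcf, bed_paths):
    """Count shared and unshared variant keys inside the intersection of all BED files."""
    beds = [read_bed(p) for p in bed_paths]

    def keep(key):
        return all(in_regions(bed, key[0], key[1]) for bed in beds)

    truth = {k for k in read_variants(truth_vcf) if keep(k)}
    calls = {k for k in read_variants(call_vcf) if keep(k)}
    return {
        "shared": len(truth & calls),
        "only_in_calls": len(calls - truth),
        "only_in_truth": len(truth - calls),
    }
```

For example, `benchmark("GIAB_v3.3.2_NA12878-decomposed-normalized.vcf.gz", "CHEO_NA12878_WES_S1dataset.vcf.gz", ["GIAB_v3.3.2_NA12878_HC_regions.bed", "Agilent_CRE_v2.bed"])` would return the three counts for the CHEO call set. Each VCF is read fully into memory, which should be workable for exome-scale files but is a design shortcut rather than a recommendation.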

 

Files

Total size: 443.6 MB

  • md5:4427cc9e6411f38d79b57f7bc4a769b4, 51.2 MB
  • md5:9f5b4cf0bc7fedf52d43b7ce91cc1bf7, 4.7 MB
  • md5:cf6d282d315e894551bbd1e8d4cf116a, 219.9 MB
  • md5:6d4e0d4517e1dbb9297217e09bea56be, 1.5 MB
  • md5:13166a0da8fad50dd6cccf9e5f4acfc1, 1.1 MB
  • md5:7f6e7658c1ce85e1c1d0c7230de45f2d, 145.9 kB
  • md5:1e5121e446f97957de52a576ed95a7f0, 3.6 MB
  • md5:e6e58089605a6fc95b005d62e9a5ec1c, 283.6 kB
  • md5:b487697e5b99638da405980af36fed0a, 140.2 MB
  • md5:705fa2354f2c2351dc75c810d85b4d80, 1.6 MB
  • md5:d0c71cf4240e2c5bf111a26c3f741577, 14.3 MB
  • md5:aa52a98bdcf98dce38bafe9e211b5b86, 128.7 kB
  • md5:1de4675ac16d4b498154fa501b037d6e, 5.1 MB
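
The MD5 values listed above can be used to check that a download completed intact. The sketch below uses Python's standard `hashlib` module. The function names are hypothetical, and the pairing of a downloaded file with its checksum has to be taken from the record page, since it is not stated here.

```python
import hashlib


def md5_of(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, listed):
    """Compare a file's MD5 with a listed value such as 'md5:4427cc9e...'."""
    expected = listed.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[4:]
    return md5_of(path) == expected
```

Reading in chunks keeps memory use low for the larger files. A result of `False` from `matches_listing` suggests either a corrupted download or a checksum taken from a different file.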

Additional details

References

  • Pranckeviciene E, Potter R, Huang L, Jarinova O. Validation of bcbio-nextgen Pipeline Based on NextSeq500 Exome Sequencing. In 2019 IEEE EMBS International Conference on Biomedical & Health Informatics (BHI) 2019 May 19 (pp. 1-6). IEEE.