SMaHT HapMap Mixture Benchmark Set
Authors/Creators
Contributors
Data manager:
Other:
Project leader (2):
Project manager:
Description
This directory contains highly accurate somatic benchmark sets of single-nucleotide variants (SNVs), small insertions and deletions (indels), structural variants (SVs), mobile element insertions (MEIs), and mitochondrial variants (mito) for the HapMap cell line mixture. The HapMap mixture consists of 83.5% HG005 (Chinese male), 10% HG02622 (European/Ashkenazi Jewish male), 2% HG002 (African female), 2% HG02257 (African male), 2% HG02486 (East Asian female), and 0.5% HG00438 (Chinese male).

The vcf.gz and bed files are intended to be used together to benchmark the performance of somatic variant callers. The benchmarking region is defined as the intersection of the reliable regions across the 12 haploid assemblies from the 6 HapMap samples; it covers 89% of GRCh38 and 90% of CHM13.

Each benchmark set was produced from a minigraph-cactus graph constructed from: 1) the selected reference assembly (GRCh38, T2T-CHM13, the HG005 maternal/paternal assemblies, or a personalized HapMap mixture assembly) and 2) the twelve haploid assemblies from the six HapMap samples.

The following paper can be cited for use of the benchmark variant set: Kong, Nahyun, et al. "A Pangenomic Method for Establishing a Somatic Variant Detection Resource in HapMap Mixtures." bioRxiv (2025): 2025-09.
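As a rough illustration of using the vcf.gz and bed files together, the sketch below keeps only the VCF records whose position falls inside the benchmarking region. It is a minimal standard-library example, not the authors' evaluation pipeline. The function names are hypothetical, and it assumes standard conventions: BED intervals are 0-based and half-open, VCF positions are 1-based, and only each record's POS is checked (a long SV that starts inside a region but extends beyond it is still kept).

```python
import bisect
import gzip
from collections import defaultdict


def load_bed(path):
    """Read a BED file into {chrom: (starts, ends)} with sorted, merged intervals."""
    raw = defaultdict(list)
    with open(path) as handle:
        for line in handle:
            # Skip blank lines and optional BED header lines.
            if not line.strip() or line.startswith(("#", "track", "browser")):
                continue
            chrom, start, end = line.split()[:3]
            raw[chrom].append((int(start), int(end)))
    regions = {}
    for chrom, intervals in raw.items():
        intervals.sort()
        merged = []
        for start, end in intervals:
            if merged and start <= merged[-1][1]:
                # Overlapping or adjacent interval: extend the previous one.
                merged[-1][1] = max(merged[-1][1], end)
            else:
                merged.append([start, end])
        regions[chrom] = ([s for s, _ in merged], [e for _, e in merged])
    return regions


def in_regions(regions, chrom, pos):
    """True if 1-based position `pos` lies in a 0-based half-open BED interval."""
    if chrom not in regions:
        return False
    starts, ends = regions[chrom]
    zero_based = pos - 1
    i = bisect.bisect_right(starts, zero_based) - 1
    return i >= 0 and zero_based < ends[i]


def records_in_benchmark_region(vcf_path, regions):
    """Yield (chrom, pos, ref, alt) for VCF records whose POS is inside the regions."""
    with gzip.open(vcf_path, "rt") as handle:
        for line in handle:
            if line.startswith("#"):
                continue  # header lines
            chrom, pos, _id, ref, alt = line.rstrip("\n").split("\t")[:5]
            if in_regions(regions, chrom, int(pos)):
                yield chrom, int(pos), ref, alt
```

The same filter should be applied to a caller's output before comparing it with the benchmark set, so that calls outside the benchmarking region are not counted as false positives.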
Files
HapMapMixture_HG005.zip
Total size: 1.1 GB

| MD5 checksum | Size |
|---|---|
| md5:d77ae5f5b0fada2c5a51b2cd6d52c3ac | 575.5 MB |
| md5:842bb97565b190d3915f38379a309738 | 504.7 MB |
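A downloaded archive can be checked against the MD5 checksums listed for the files. The following is a minimal sketch using Python's standard `hashlib`. The helper names are hypothetical, and which checksum belongs to which archive has to be read from the record page, since this listing does not pair them with file names.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path, listed):
    """Compare a file's MD5 with a listed value such as 'md5:d77a...'."""
    # Accept the value with or without the 'md5:' prefix.
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

For example, `matches_listed_checksum("HapMapMixture_HG005.zip", "md5:...")` returns `True` only if the local file is byte-identical to the file the checksum was computed from. Reading in chunks keeps memory use low for archives of several hundred megabytes.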
Additional details
Related works
- Is described by
- Journal: 10.1101/2025.09.29.679336 (DOI)
Dates
- Updated: 2025-11-06. Missing files added to HG005 and personalized assembly sets.