There is a newer version of the record available.

Published September 24, 2025 | Version v2
Dataset Open

Test data for nf-updhmm nextflow pipeline

Description

nf-UPDhmm Test Data and Reference Files

The dataset provided for the nf-UPDhmm pipeline includes both test VCF files and reference BED files required for preprocessing and execution.

1. Test data – VCF files

The VCF files are derived from the 1000 Genomes Project. The samples were extracted from the publicly available phased SNV/INDEL VCF release (https://ftp.1000genomes.ebi.ac.uk/vol1/ftp/data_collections/1000G_2504_high_coverage/).

  • mother.vcf.gz: HG00404

  • father.vcf.gz: HG00403

  • proband_control.vcf.gz: HG00405

  • proband_heterodisomy.vcf.gz: HG00405 with a simulated heterodisomy event introduced.

Note: To keep the test computationally light, all VCFs are restricted to the region chr21:29222885-34430153. This lets the pipeline test run quickly while preserving the structure of a real trio dataset.
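As a quick sanity check on the region restriction described in the note, the sketch below lists any VCF records that fall outside chr21:29222885-34430153. It is a minimal illustration, not part of nf-UPDhmm: the function names are hypothetical, the region is assumed to be 1-based and inclusive, and the files are assumed to use the contig name "chr21" exactly as written in the note.

```python
import gzip

# Region quoted in the note (assumed 1-based, inclusive; an assumption of this sketch).
REGION = ("chr21", 29222885, 34430153)

def records_outside_region(lines, region=REGION):
    """Return (chrom, pos) for every VCF data line that falls outside region."""
    chrom, start, end = region
    outside = []
    for line in lines:
        # Skip blank lines and VCF header/meta lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t", 2)
        rec_chrom, rec_pos = fields[0], int(fields[1])
        if rec_chrom != chrom or not (start <= rec_pos <= end):
            outside.append((rec_chrom, rec_pos))
    return outside

def check_vcf(path, region=REGION):
    """Open a gzip- or bgzip-compressed VCF and list records outside region."""
    with gzip.open(path, "rt") as handle:
        return records_outside_region(handle, region)
```

Calling `check_vcf("mother.vcf.gz")` on a downloaded file would be expected to return an empty list if the restriction holds as described.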

2. Reference files – BEDs

In addition to the test VCFs, we provide BED files that define regions excluded during preprocessing. These are reference files, not test data, but they are required for the correct execution of the nf-UPDhmm pipeline.

The files are organized by reference genome version, indicated by the prefix:

  • "hg19"

  • "hg38"

  1. centromeres.bed

  2. segmental_duplications.bed

  3. hla_kir.bed

    • Highly polymorphic immune-related loci.

    • Coordinates used:

      • hg19:

        • HLA: chr6:28,477,797–33,448,354

        • KIR: chr19:55,228,188–55,383,188

      • hg38:

        • HLA: chr6:28,510,120–33,480,577

        • KIR: chr19:54,025,634–55,084,318

  4. excluded_regions.bed

    • A combined file merging all of the above (centromeres, segmental duplications, HLA, and KIR).
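To make the idea of a combined exclusion file concrete, the sketch below parses region strings in the style used for the HLA and KIR coordinates and merges overlapping intervals per chromosome. It is an illustration only: how the authors actually built excluded_regions.bed is not stated in this record, the function names are hypothetical, and the coordinates are used exactly as written, without any conversion to the BED 0-based convention.

```python
import re

def parse_region(text):
    """Turn 'chr6:28,477,797–33,448,354' into (chrom, start, end) integers."""
    # Accept either an en dash or a plain hyphen between the two coordinates.
    match = re.fullmatch(r"(\w+):([\d,]+)[–-]([\d,]+)", text.strip())
    if match is None:
        raise ValueError(f"unrecognised region: {text!r}")
    chrom, start, end = match.groups()
    return chrom, int(start.replace(",", "")), int(end.replace(",", ""))

def merge_intervals(intervals):
    """Merge overlapping or touching (chrom, start, end) intervals."""
    merged = []
    # Sorting groups intervals by chromosome name (lexicographic), then by start.
    for chrom, start, end in sorted(intervals):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            prev = merged[-1]
            merged[-1] = (chrom, prev[1], max(prev[2], end))
        else:
            merged.append((chrom, start, end))
    return merged
```

For example, `parse_region("chr6:28,477,797–33,448,354")` gives `("chr6", 28477797, 33448354)`, and the parsed HLA and KIR regions could be passed to `merge_intervals` together with centromere and segmental duplication intervals read from the other BED files.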

Files

samplesheet.csv

Files (128.1 MB)

Checksum  Size
md5:9eb679310c98430225d743935ab7e6d1  32.1 MB
md5:f32d75d907ac90e897101b2fa65f1a7e  1.1 kB
md5:3829afefc3fa5f964765f05ca322d666  169.6 kB
md5:19497116a813fbea1a907684494de201  47 Bytes
md5:01a7f8e11f1f67e1d087c43a0b1d8a1c  1.2 MB
md5:eb73704a0b0ca8a6a7b53421337ddb3f  2.6 kB
md5:e4232c57a577b03887b63ec75896711f  176.0 kB
md5:e72f46fc085b6061dcfbe05ac7203ac3  47 Bytes
md5:97148a2e8ea05a713b8d46b538d232f6  1.7 MB
md5:6fafff8a19ff31517300e71c30ce51a2  32.1 MB
md5:040728187cbbbea42e9f6342571b9f24  32.1 MB
md5:08a7b5017aa699b30397ba04266be482  28.5 MB
md5:f2992355faca5c26433e38b601a5a64d  561 Bytes