Published March 2, 2026 | Version 2026-01-23
Dataset Open

uva_eskape_2026-01-23

  • 1. University of Virginia

Description

Pathogen RefSeq chromosome database for UVA SNV Pipeline (ESKAPE profile)

This record contains a pre-built reference database used by the UVA SNV Pipeline for rapid speciation and reference selection, and in downstream SNV-calling workflows. The database is derived from RefSeq chromosome FASTA files and is packaged to support reproducible execution in HPC environments.

Intended use

  • Used by the UVA SNV Pipeline to select appropriate reference genomes via Mash distance (ESKAPE-focused reference set).

  • Not intended as an authoritative taxonomic resource; it is an operational database optimized for reference selection in genomic surveillance pipelines.
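
As an illustration of reference selection by Mash distance, the sketch below parses the tab-separated output of `mash dist` (reference ID, query ID, distance, p-value, shared hashes) and picks the closest reference. It is not the pipeline's own code: the function names and the `max_distance` cutoff of 0.05 are assumptions chosen for this example.

```python
from typing import NamedTuple, Optional


class MashHit(NamedTuple):
    reference: str
    query: str
    distance: float
    p_value: float
    shared_hashes: str  # reported by Mash as "matching/sketch_size"


def parse_mash_dist(text: str) -> list[MashHit]:
    """Parse the five tab-separated columns printed by `mash dist`."""
    hits = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        ref, query, dist, pval, shared = line.split("\t")
        hits.append(MashHit(ref, query, float(dist), float(pval), shared))
    return hits


def pick_best_reference(hits: list[MashHit],
                        max_distance: float = 0.05) -> Optional[MashHit]:
    """Return the hit with the smallest distance, or None if none is close enough.

    max_distance is a hypothetical cutoff for this example, not a pipeline setting.
    """
    candidates = [h for h in hits if h.distance <= max_distance]
    return min(candidates, key=lambda h: h.distance, default=None)
```

Returning `None` when no reference falls under the cutoff lets a caller stop early instead of calling SNVs against a distant genome.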

Build notes

  • Mash parameters: k=21, s=50000

  • The database bundle includes build manifests and checksums to support reproducibility and integrity checks after download.
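
To show how the two Mash parameters enter a distance estimate, the sketch below applies the published Mash distance formula, D = -(1/k) ln(2j / (1 + j)), where j is the Jaccard estimate taken as shared hashes divided by sketch size. The function name and defaults are illustrative and simply mirror k=21 and s=50000 from the build notes; this is not code from the pipeline.

```python
import math


def mash_distance(shared_hashes: int, sketch_size: int = 50000, k: int = 21) -> float:
    """Estimate the Mash distance from the number of shared sketch hashes."""
    if not 0 <= shared_hashes <= sketch_size:
        raise ValueError("shared_hashes must be between 0 and sketch_size")
    if shared_hashes == 0:
        return 1.0  # no shared hashes: Mash reports the maximum distance of 1
    jaccard = shared_hashes / sketch_size
    if jaccard == 1.0:
        return 0.0  # identical sketches
    return -math.log(2 * jaccard / (1 + jaccard)) / k
```

A larger sketch size gives the Jaccard estimate finer resolution, which helps separate closely related references at the cost of a bigger database.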

How to download and use
This database is downloaded by the pipeline using a helper script (provided in the GitHub repository) that retrieves this Zenodo artifact, verifies its checksum, and installs it into the expected directory structure.
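
For a manual integrity check after download, a minimal sketch follows. It assumes the archive is compared against the MD5 listed under Files on this record; the function names are illustrative, and the helper script in the repository may implement verification differently.

```python
import hashlib

# MD5 listed for the file on this record.
EXPECTED_MD5 = "778168953539ae7a7fd68b45ae797073"


def md5_of_file(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Compute the MD5 of a file in chunks so a 23.8 GB archive never sits in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_archive(path: str, expected_md5: str = EXPECTED_MD5) -> bool:
    """Return True if the file's MD5 matches the expected value (case-insensitive)."""
    return md5_of_file(path) == expected_md5.lower()
```

Calling `verify_archive` with the path to the downloaded file returns `False` for a truncated or corrupted download, in which case the file should be fetched again before installation.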

Citation
If you use this database in analysis or reports, please cite this Zenodo record (DOI provided by Zenodo upon publication) and the UVA SNV Pipeline repository/release used.

Files

Files (23.8 GB)

md5:778168953539ae7a7fd68b45ae797073 (23.8 GB)