There is a newer version of the record available.

Published February 10, 2026 | Version v1
Dataset Open

HERVarium self-contained local distribution for local interactive exploration of human endogenous retroviruses

  • 1. CNAG
  • 2. Neurometabolic Diseases Laboratory, Bellvitge Biomedical Research Institute

Description

📦 HERVarium – self-contained local distribution

This Zenodo record provides a self-contained, runnable distribution of HERVarium, an interactive Dash-based web application for exploring a genome-wide atlas of human endogenous retroviruses (HERVs).

The archive contains all files required to run HERVarium locally, including the application code, precomputed annotation assets, reference files, and a reproducible conda environment specification. Users can run the application by downloading and unpacking this archive, creating the conda environment, and launching the app — no additional data downloads or manual assembly are required.

This distribution is intended to support reproducibility, peer review, and long-term accessibility during the preprint and initial release phase.

🧬 Overview of HERVarium

HERVarium enables interactive, locus-level exploration of multiple HERV annotation layers through an embedded IGV genome browser and searchable data tables, including:

  • Internal HERV loci and protein-coding domain annotations, with domain class and HMM coverage summaries (GyDB / DFAM)

  • LTR regulatory annotations, including transcription factor binding motif (TFBM) matches

  • LTR structural architecture, including U3/R/U5 segmentation, PBS and PPT predictions, and promoter / polyadenylation signal annotations

  • Integrated genome browser (IGV) with locally served annotation tracks and optional external tracks

All datasets are precomputed and optimized for fast local interaction using Parquet, JSON, BED, and bigBed formats.
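As an illustration of how simple the text-based track format is, the following minimal sketch parses BED lines using only the Python standard library. The sample records and the names `BedRecord` and `parse_bed_line` are hypothetical; the actual track files and column usage in the archive may differ.

```python
from typing import NamedTuple, Optional


class BedRecord(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # 0-based, exclusive
    name: Optional[str] = None


def parse_bed_line(line: str) -> Optional[BedRecord]:
    """Parse one BED line; return None for blank, comment, or header lines."""
    line = line.strip()
    if not line or line.startswith(("#", "track", "browser")):
        return None
    fields = line.split("\t")
    if len(fields) < 3:
        raise ValueError(f"BED line needs at least 3 columns: {line!r}")
    name = fields[3] if len(fields) > 3 else None
    return BedRecord(fields[0], int(fields[1]), int(fields[2]), name)


# Hypothetical records, for illustration only (not taken from the archive).
sample = "chr1\t1000\t1500\texample_LTR\nchr1\t2000\t2600\n"
records = [r for r in map(parse_bed_line, sample.splitlines()) if r is not None]
lengths = [r.end - r.start for r in records]
```

Because BED intervals are half-open, the length of a feature is simply `end - start`.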

📁 What is included in this record

This Zenodo archive (tar.xz) contains:

Application

  • HERVarium Dash application (app.py)

  • Reproducible conda environment specification (environment.yml)

Precomputed annotation data

  • Aggregated and query-ready tables (.parquet, .json)

  • Metadata files describing domains, LTRs, and U3/R/U5 features

Genome and annotation tracks

  • Genome reference files required by the embedded IGV viewer (FASTA + index)

  • Locally hosted BED / bigBed tracks for:

    • Internal HERV regions

    • Protein-coding domains

    • LTR elements

    • U3/R/U5 segments

    • PBS / PPT

    • Promoter and poly(A) signal annotations

    • LTR TFBMs

Static web assets

  • Stylesheets

  • Icons and logos used by the web interface

The directory structure inside the archive matches the layout expected by the application, enabling immediate execution after extraction.
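A quick sanity check after extraction could look like the sketch below. It only tests for the two file names stated in this record (`app.py` and `environment.yml`); the function name `missing_files` is hypothetical, and any further paths would need to be added after inspecting the extracted directory.

```python
from pathlib import Path
from typing import Iterable, List

# Only these two file names are stated in this record; extend the tuple with
# other paths once you have inspected the extracted archive.
REQUIRED_FILES = ("app.py", "environment.yml")


def missing_files(root: str, required: Iterable[str] = REQUIRED_FILES) -> List[str]:
    """Return the required relative paths that do not exist under root."""
    base = Path(root)
    return [rel for rel in required if not (base / rel).is_file()]


if __name__ == "__main__":
    missing = missing_files("HERVarium")
    print("Layout looks complete." if not missing else f"Missing: {missing}")
```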

▶️ How to run HERVarium locally (summary)

  1. Download and extract this archive:

    tar -xvf hervarium.tar.xz
    cd HERVarium
  2. Create and activate the conda environment:

    conda env create -f environment.yml
    conda activate hervarium
  3. Launch the application:

    python app.py
  4. Open a web browser at:

    http://127.0.0.1:8050

The application runs as a local web server and does not require an internet connection for core functionality.
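To confirm that the server started in step 3 is accepting connections, a small check such as the following can be run from a second terminal. This is a sketch using only the standard library; the function name `is_listening` is hypothetical, and the default host and port are taken from the address given in step 4.

```python
import socket


def is_listening(host: str = "127.0.0.1", port: int = 8050, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    state = "reachable" if is_listening() else "not reachable (is app.py running?)"
    print(f"HERVarium at http://127.0.0.1:8050 is {state}")
```

The check only verifies that something is listening on the port; it does not confirm that the listener is HERVarium itself.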

🧑‍🔬 Intended use

This record is provided to:

  • Enable transparent peer review during the preprint stage

  • Ensure long-term, citable access to the exact data and assets used by HERVarium

  • Allow readers to interactively explore HERV annotation layers locally

  • Support reproducibility of figures and analyses presented in the associated manuscript

🔗 Related resources

The datasets integrated into HERVarium are also available as standalone Zenodo records:

The development repository for HERVarium is hosted on GitHub:

📖 Citation

If you use HERVarium or this distribution in your work, please cite:

  • This Zenodo record

  • The HERVarium GitHub repository

  • The associated preprint / manuscript describing the resource: 

Regulatory Features and Functional Specialization of Human Endogenous Retroviral LTRs: A Genome-Wide Annotation and Analysis via HERVarium
Tomàs Montserrat-Ayuso, Aurora Pujol, Anna Esteve-Codina
bioRxiv 2026.02.17.706328; doi: https://doi.org/10.64898/2026.02.17.706328

Files

Files (1.4 GB)

md5:94a7ed040e62b1a9c1426b28aa930672 (1.4 GB)
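The MD5 checksum listed above can be compared against the downloaded archive before extraction. The sketch below reads the file in chunks so that the 1.4 GB archive is not loaded into memory at once; the function names are hypothetical, and the file name `hervarium.tar.xz` in the usage note is assumed from the extraction command in this record.

```python
import hashlib

EXPECTED_MD5 = "94a7ed040e62b1a9c1426b28aa930672"  # checksum listed in this record


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_record(path: str, expected: str = EXPECTED_MD5) -> bool:
    """Compare a file's MD5 with the expected digest (case-insensitive)."""
    return md5_of_file(path) == expected.lower()
```

For example, `matches_record("hervarium.tar.xz")` should return `True` for an intact download of this version of the record.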

Additional details

Funding

Agència de Gestió d'Ajuts Universitaris i de Recerca
AGAUR-FI Joan Oró 2025 FI-1 00642

Software