There is a newer version of the record available.

Published November 21, 2025 | Version v3
Preprint | Open access

HERALD: High-resolution Early Recognition of Antigenic Landscape Divergence

Description

HERALD is a theoretical framework for geometry-based viral surveillance that enables early detection of immune-escape variants before widespread transmission. It constructs a Riemannian pullback manifold in which Euclidean distances in latent space approximate antigenic relationships within bounded-distortion regimes.
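To make the pullback idea concrete, the following is a minimal sketch rather than the HERALD implementation. It assumes a hypothetical toy encoder from R^2 to R^3 and a finite-difference Jacobian J, forms the pullback of the Euclidean latent metric as G = JᵀJ, and checks that for a small input step the straight-line latent distance is close to the length measured by G. The function names (`encoder`, `jacobian`, `pullback_metric`, `metric_length`) are illustrative and do not come from the manuscript.

```python
import math

def encoder(x):
    """Hypothetical smooth toy encoder R^2 -> R^3 (stand-in, not the HERALD model)."""
    u, v = x
    return [u, v, 0.5 * (u * u + v * v)]

def jacobian(f, x, eps=1e-6):
    """Central finite-difference Jacobian with shape (output dim) x (input dim)."""
    n_out = len(f(x))
    cols = []
    for j in range(len(x)):
        x_plus, x_minus = list(x), list(x)
        x_plus[j] += eps
        x_minus[j] -= eps
        f_plus, f_minus = f(x_plus), f(x_minus)
        cols.append([(a - b) / (2.0 * eps) for a, b in zip(f_plus, f_minus)])
    # cols[j][i] holds d f_i / d x_j; transpose into row-major J[i][j].
    return [[cols[j][i] for j in range(len(x))] for i in range(n_out)]

def pullback_metric(f, x):
    """Pullback of the Euclidean latent metric: G = J^T J, evaluated at x."""
    J = jacobian(f, x)
    n_in = len(x)
    return [[sum(row[i] * row[j] for row in J) for j in range(n_in)]
            for i in range(n_in)]

def metric_length(G, d):
    """Length of a small displacement d under the metric G: sqrt(d^T G d)."""
    n = len(d)
    return math.sqrt(sum(d[i] * G[i][j] * d[j] for i in range(n) for j in range(n)))

x = [1.0, 2.0]        # arbitrary base point in input space
d = [1e-3, -2e-3]     # small displacement
G = pullback_metric(encoder, x)
latent_distance = math.dist(encoder(x), encoder([x[0] + d[0], x[1] + d[1]]))
local_length = metric_length(G, d)
distortion = latent_distance / local_length  # close to 1 for small steps

print("pullback metric G:", G)
print("latent distance / pullback length:", distortion)
```

The ratio drifts away from 1 as the step grows or as the encoder curves more sharply, which is one informal way to read the phrase "bounded-distortion regimes".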

This deposit includes:

  • Empirical validation studies on SARS-CoV-2 data (DMS and genomic surveillance)

Key Contributions:

  • Manifold Construction: Defines a contrastive learning objective with Jacobian/Laplacian regularization that induces smooth pullback metrics, enabling Euclidean computations to approximate geodesic distances reflecting antigenic divergence

  • Real-Time Drift Detection: Specifies a probability-integral transform (PIT) fusion scheme combining sequence, antigenic, and structural signals into a scalar drift statistic with O(log n) amortized complexity

  • Formal Guarantees: Derives margin-to-separation results for InfoNCE objectives and establishes Cantelli-based probability bounds requiring only finite variance (no sub-Gaussian assumptions), composing into conditional end-to-end dominance bounds with explicit error budgets

  • Evaluation Protocols: Provides falsifiable protocols for retrospective time-slice replay, prospective streaming emulation, distortion audits, and equity/parity analysis across pathogens (SARS-CoV-2, influenza, HIV)

  • Ethics Framework: Includes comprehensive governance templates addressing dual-use risks, information hazards, data sovereignty, abstention policies, and oversight structures

  • Empirical Validation: Validation I demonstrates 5.7–8.8× improvement in geometric separation (Δ) over baseline methods on SARS-CoV-2 deep mutational scanning data. Validation II confirms out-of-time generalization: the frozen encoder detects Omicron BA.1 emergence in South Africa (Z = 2.29, p ≈ 0.011) under sparse monthly surveillance data with near-zero support-adjusted latency.
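The drift-detection and probability-bound items above can be illustrated with a short sketch. This is not the manuscript's algorithm. It assumes (1) an empirical-CDF probability-integral transform per signal, (2) Stouffer's method as the fusion rule, which presumes independent signals, and (3) synthetic reference values. It then converts a statistic of Z = 2.29 into a one-sided standard normal tail probability, which is numerically consistent with the reported p ≈ 0.011, and into the distribution-free Cantelli bound 1 / (1 + k²), which needs only finite variance. How the manuscript actually fuses signals and applies Cantelli's inequality is not stated in this description.

```python
import math
from bisect import bisect_right
from statistics import NormalDist

def pit(value, reference_sorted):
    """Empirical-CDF PIT; the +0.5 / +1 offsets keep the result strictly in (0, 1)."""
    rank = bisect_right(reference_sorted, value)  # binary search on sorted data
    return (rank + 0.5) / (len(reference_sorted) + 1.0)

def fused_drift_z(signals, references):
    """Fuse per-signal PIT values into one scalar with Stouffer's method.

    Assumption: signals are treated as independent. The reference lists are
    sorted on every call here for simplicity; a streaming system would keep
    them sorted so each lookup is a single binary search.
    """
    std_normal = NormalDist()
    z_scores = []
    for name, value in signals.items():
        u = pit(value, sorted(references[name]))
        z_scores.append(std_normal.inv_cdf(u))
    return sum(z_scores) / math.sqrt(len(z_scores))

def normal_upper_tail(z):
    """One-sided tail P(Z >= z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def cantelli_upper_bound(k):
    """Cantelli: P(X - mean >= k * sd) <= 1 / (1 + k^2) for k > 0, finite variance."""
    if k <= 0:
        raise ValueError("k must be positive")
    return 1.0 / (1.0 + k * k)

# Synthetic reference windows for three hypothetical signal channels.
channels = ("sequence", "antigenic", "structural")
references = {name: [float(i) for i in range(100)] for name in channels}

z_typical = fused_drift_z({name: 49.0 for name in channels}, references)
z_extreme = fused_drift_z({name: 99.0 for name in channels}, references)

z_reported = 2.29  # value quoted in the Empirical Validation item
p_normal = normal_upper_tail(z_reported)
p_cantelli = cantelli_upper_bound(z_reported)

print("fused Z (typical, extreme):", z_typical, z_extreme)
print("normal tail vs Cantelli bound at Z = 2.29:", p_normal, p_cantelli)
```

The Cantelli bound is much looser than the normal tail at the same Z, which is the price of dropping distributional assumptions.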

Scope: The theoretical manuscript presents definitions, assumptions, theorems, and evaluation protocols. Validation I provides empirical confirmation on historical DMS data; Validation II demonstrates real-world surveillance applicability via retrospective time-slice replay on Nextstrain genomic data. Both studies keep training and test data separate and perform no sequence optimization, which keeps them within a safe surveillance scope. Applications extend beyond viral surveillance to bacterial pathogen monitoring (Shiga toxin-producing E. coli [STEC], antibiotic resistance), where antigenic or functional divergence precedes clinical detection.

Author: Bee Rosa Davis (NASA Mission Systems Engineer, IBM X-Force Red Principal Adversarial Intelligence Engineer)

Keywords: viral surveillance, Riemannian geometry, contrastive learning, immune escape, early warning systems, algorithmic epidemiology, information geometry, public health AI, ESM-2, protein language models, deep mutational scanning

License and Patent Disclaimer

Copyright License: This work is licensed under Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0). You are free to share and adapt this work for non-commercial purposes with appropriate attribution, provided derivative works use the same license.

Patent Notice: The methods, systems, and algorithms described herein are subject to U.S. Provisional Patent Application No. 63/919,595 (filed November 18, 2025). This copyright license does NOT grant any rights under patent law. Implementation, commercial use, or deployment of the HERALD framework may require separate patent licensing arrangements.

Clarification: The CC BY-NC-SA 4.0 license governs only the manuscript text and documentation—the right to read, cite, and build upon these ideas academically. The patent covers the technical implementation of the framework. For research and educational use, no patent license is required. For commercial deployment or production systems, contact the inventor regarding patent licensing.

Contact for Patent Licensing: bee_davis@alumni.brown.edu

Version 3.0: Added Validation Study II ("Time Traveler") demonstrating out-of-time generalization—frozen geometry trained on 2020 DMS data detects 2022 Omicron emergence with p ≈ 0.011 under sparse surveillance conditions. Includes time-slice replay code, heatmaps, and frozen encoder checkpoint.

Files

HERALD__High_resolution_Early_Recognition_of_Antigenic_Landscape_Divergence_FINAL.pdf