Published March 18, 2026 | Version 0.1.0
Preprint · Open Access

Theoretical Foundations of Latent Posterior Factors: Formal Guarantees for Multi-Evidence Reasoning

  • Epalea

Description

We present a complete theoretical characterization of Latent Posterior Factors (LPF), a principled framework for aggregating multiple heterogeneous evidence items in probabilistic prediction tasks. Multi-evidence reasoning—where a prediction must be formed from several noisy, potentially contradictory sources—arises pervasively in high-stakes domains including healthcare diagnosis, financial risk assessment, legal case analysis, and regulatory compliance. Yet existing approaches either lack formal guarantees or lack architectural support for multi-evidence aggregation. LPF addresses this gap by encoding each evidence item into a Gaussian latent posterior via a variational autoencoder, converting posteriors to soft factors through Monte Carlo marginalization, and aggregating factors via either exact Sum-Product Network inference (LPF-SPN) or a learned neural aggregator (LPF-Learned).
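The three-stage pipeline described above (encode, marginalize, aggregate) can be sketched in miniature. This is a toy illustration, not the paper's implementation: `evidence_to_posterior` is a hypothetical stand-in for the VAE encoder, the class centers are arbitrary, and the aggregator is reduced to a single normalized product of factors (the role an SPN product node plays).

```python
import numpy as np

rng = np.random.default_rng(0)

def evidence_to_posterior(x):
    # Hypothetical stand-in for the VAE encoder: map one evidence
    # vector to the mean and std of a Gaussian latent posterior.
    mu = np.tanh(x.mean())
    sigma = 0.5 + 0.5 / (1.0 + x.std())
    return mu, sigma

def posterior_to_factor(mu, sigma, n_classes=3, M=2000):
    # Monte Carlo marginalization: draw M latent samples, score each
    # against toy class centers, and average into a soft factor.
    z = rng.normal(mu, sigma, size=M)
    centers = np.linspace(-1.0, 1.0, n_classes)
    lik = np.exp(-0.5 * (z[:, None] - centers[None, :]) ** 2)
    lik /= lik.sum(axis=1, keepdims=True)   # per-sample soft label
    return lik.mean(axis=0)                 # factor over classes

def aggregate(factors):
    # Product-of-factors aggregation (the product node of an SPN),
    # computed in log space and renormalized.
    log_post = np.sum(np.log(np.stack(factors)), axis=0)
    p = np.exp(log_post - log_post.max())
    return p / p.sum()

evidence = [rng.normal(0.6, 0.1, size=8) for _ in range(4)]
factors = [posterior_to_factor(*evidence_to_posterior(x)) for x in evidence]
posterior = aggregate(factors)
print(posterior.round(3), posterior.sum())
```

The log-space product keeps the aggregation numerically stable even when many factors assign small mass to some class.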

We prove seven formal guarantees spanning the key desiderata for trustworthy AI. Theorem 1 (Calibration Preservation) establishes that LPF-SPN preserves individual evidence calibration under aggregation, with Expected Calibration Error bounded as ECE ≤ ε + C/√K_eff. Theorem 2 (Monte Carlo Error) shows that factor approximation error decays as O(1/√M), verified across five sample sizes. Theorem 3 (Generalization) provides a non-vacuous PAC-Bayes bound for the learned aggregator, achieving a train-test gap of 0.0085 against a bound of 0.228 at N=4200. Theorem 4 (Information-Theoretic Optimality) demonstrates that LPF-SPN operates within 1.12× of the information-theoretic lower bound on calibration error. Theorem 5 (Robustness) proves graceful degradation as O(εδ√K) under evidence corruption, maintaining 88% performance even when half of all evidence is adversarially replaced. Theorem 6 (Sample Complexity) establishes O(1/√K) calibration decay with evidence count, with empirical fit R² = 0.849. Theorem 7 (Uncertainty Decomposition) proves exact separation of epistemic from aleatoric uncertainty with decomposition error below 0.002%, enabling statistically rigorous confidence reporting.
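The exact uncertainty separation claimed in Theorem 7 has a familiar backbone: the law of total variance, under which total predictive variance splits exactly into an aleatoric term E_z[Var(y | z)] and an epistemic term Var_z(E[y | z]). The sketch below verifies that identity on a toy equally weighted mixture of predictive Gaussians; the numbers and setup are illustrative and follow the standard decomposition, not the paper's specific construction.

```python
import numpy as np

rng = np.random.default_rng(1)

# One predictive Gaussian per latent configuration z_k (toy values).
mu = rng.normal(0.0, 1.0, size=50)    # E[y | z_k]
sd = rng.uniform(0.2, 0.5, size=50)   # Std(y | z_k)

aleatoric = np.mean(sd ** 2)          # E_z[ Var(y | z) ]
epistemic = np.var(mu)                # Var_z( E[y | z] )

# Total predictive variance computed directly from the mixture moments:
second_moment = np.mean(mu ** 2 + sd ** 2)
total = second_moment - np.mean(mu) ** 2

# ~0 up to floating point: the decomposition is exact, not approximate.
print(abs(total - (aleatoric + epistemic)))
```

Because the identity is algebraic rather than asymptotic, the residual here is floating-point noise, mirroring the sub-0.002% decomposition error the abstract reports.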

All theorems are empirically validated on controlled datasets spanning up to 4,200 training examples and eight evaluation domains. Companion empirical results demonstrate mean accuracy of 99.3% and ECE of 1.5% across eight diverse domains, with consistent improvements over neural baselines, uncertainty quantification methods, and large language models. Our theoretical framework establishes LPF as a foundation for trustworthy multi-evidence AI in safety-critical applications.

Files (846.3 kB)

main.pdf — 846.3 kB, md5:3fb368607448fc74cc88780fa45734db

Additional details

Related works

Is identical to
Preprint: arXiv:2603.15674 (arXiv)
Is supplement to
Preprint: arXiv:2603.15670 (arXiv)
Is supplemented by
Preprint: 10.5281/zenodo.19183861 (DOI)

Software

Repository URL
https://github.com/aaaEpalea/epalea.git
Programming language
Python
Development Status
Active