Published January 21, 2026 | Version v1
Preprint Open

The .causal Format: Deterministic Inference for AI-Assisted Hypothesis Amplification

Description

Background: Papers I–IV of the Sovereign Discovery Series generated 22 novel Long COVID hypotheses from
5,084 extracted causal triplets across 376 papers. However, traditional relational database storage (SQLite) limits
discovery to explicitly extracted facts, missing convergence patterns hidden below the detection threshold.

Innovation: We present the .causal binary format, a knowledge graph serialized with MessagePack, compressed with
zlib, and shipped with embedded deterministic inference rules. The format achieves a 72% storage reduction while
amplifying fact counts by 1.90x through three-pass transitive inference: exact keyword matching, semantic direction
propagation, and Jaro-Winkler fuzzy entity resolution.
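
As a rough illustration of the container described above, the following sketch round-trips a small graph through MessagePack and zlib. The schema (the "version", "triplets", and "rules" fields), the function names dump_causal and load_causal, and the example triplets are all assumptions made for illustration; they are not taken from the paper or its data. If the third-party msgpack package is not installed, the sketch falls back to JSON so that it still runs, which is a stand-in and not the real encoding.

```python
import json
import zlib

try:
    import msgpack  # third-party package: pip install msgpack

    def _encode(obj):
        return msgpack.packb(obj, use_bin_type=True)

    def _decode(raw):
        return msgpack.unpackb(raw, raw=False)

except ImportError:
    # Fallback so the sketch runs without msgpack. This is NOT MessagePack.
    def _encode(obj):
        return json.dumps(obj, sort_keys=True).encode("utf-8")

    def _decode(raw):
        return json.loads(raw.decode("utf-8"))


def dump_causal(graph: dict) -> bytes:
    """Serialize a graph dict, then compress the bytes with zlib."""
    return zlib.compress(_encode(graph), 9)


def load_causal(blob: bytes) -> dict:
    """Inverse of dump_causal: decompress, then deserialize."""
    return _decode(zlib.decompress(blob))


# Hypothetical schema and made-up triplets, for illustration only.
graph = {
    "version": 1,
    "triplets": [
        ["vagus nerve dysfunction", "causes", "autonomic dysregulation"],
        ["autonomic dysregulation", "causes", "orthostatic tachycardia"],
    ],
    "rules": [{"name": "transitive", "max_depth": 2}],
}

blob = dump_causal(graph)
restored = load_causal(blob)
```

Storing the rules next to the triplets is one plausible reading of "embedded deterministic inference rules": a reader can then re-derive the inferred facts from the file alone.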

Key Finding: Weak signals that are invisible in SQLite (3 triplets) become detectable convergence points in .causal
(21+ triplets). This amplification revealed three new hypothesis candidates that were previously below the pattern
recognition threshold: Vagus Nerve Convergence (7.0x), POTS/Autonomic Axis (7.7x), and Autonomic-Cardiovascular
Link (30x).
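
The amplification factors can be read as the ratio of triplets available after inference to triplets extracted explicitly. The sketch below works through the one pair of counts stated here (3 and 21), which is consistent with the 7.0x figure; the baselines behind 7.7x and 30x are not given in this description. The threshold value and the function names are hypothetical.

```python
def amplification(explicit_count: int, amplified_count: int) -> float:
    """Ratio of triplets after inference to explicitly extracted triplets."""
    if explicit_count <= 0:
        raise ValueError("explicit_count must be positive")
    return amplified_count / explicit_count


def is_detectable(count: int, threshold: int) -> bool:
    """A topic counts as a convergence point once it reaches the threshold."""
    return count >= threshold


# Hypothetical threshold; the description does not state the real value.
THRESHOLD = 10

ratio = amplification(3, 21)                  # counts quoted in the text
before = is_detectable(3, THRESHOLD)          # explicit facts only
after = is_detectable(21, THRESHOLD)          # after inference
```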

Significance: The .causal format does not replace AI-assisted hypothesis generation; it enhances it. Because
transitive chains are precomputed deterministically (so the inference step itself carries zero hallucination risk),
the format provides Claude with amplified signal patterns, enabling discovery of convergence hubs that would
otherwise remain hidden in noise.
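
To make "precomputing transitive chains deterministically" concrete, here is a minimal sketch of one way to do it: a bounded breadth-first walk over (cause, effect) pairs with sorted iteration, so the same input always yields the same output. It covers only exact-match chaining, not the semantic or fuzzy passes, and the function name, the max_depth parameter, and the placeholder entities are assumptions rather than the paper's implementation.

```python
def infer_transitive(edges, max_depth=3):
    """Map each newly inferred (cause, effect) pair to its supporting chain.

    edges: iterable of (cause, effect) pairs extracted explicitly.
    max_depth: maximum number of edges allowed in a chain.
    """
    explicit = set(edges)
    graph = {}
    for cause, effect in sorted(explicit):  # sorted for reproducible order
        graph.setdefault(cause, []).append(effect)

    inferred = {}
    for start in sorted(graph):
        frontier = [(start,)]
        for _ in range(max_depth):
            next_frontier = []
            for path in frontier:
                for nxt in graph.get(path[-1], []):
                    if nxt in path:  # do not walk around cycles
                        continue
                    new_path = path + (nxt,)
                    pair = (start, nxt)
                    # Record only new facts, keeping the shortest chain found.
                    if (len(new_path) > 2 and pair not in explicit
                            and pair not in inferred):
                        inferred[pair] = new_path
                    next_frontier.append(new_path)
            frontier = next_frontier
    return inferred


# Placeholder entities, for illustration only.
edges = [("a", "b"), ("b", "c"), ("c", "d")]
inferred = infer_transitive(edges, max_depth=3)
```

Each inferred pair keeps the chain that produced it, so a downstream reader (human or model) can trace a derived fact back to the explicit triplets behind it.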

Files

Foss_2026_Causal_Format_22_Hypotheses_Amplification_v1.2.pdf

Files (17.1 MB)

md5:2db25196ce75c4a0a457b84bcec32759 (373.7 kB)
md5:a7119101c694a217777146564fcd9da4 (13.2 MB)
md5:a089e1c25b02fb1918f21299a0c38ced (3.5 MB)
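
A downloaded file can be checked against its listed checksum with a few lines of standard-library Python. This is a generic sketch: the helper names are hypothetical, and the local file path is whatever name the file was saved under.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed(path, listed):
    """Compare a file against a listed value such as 'md5:2db2...'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

For example, matches_listed(local_path, "md5:2db25196ce75c4a0a457b84bcec32759") returns True only if the local copy is byte-identical to the listed file.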

Additional details

Related works

Is part of
Preprint: 10.2139/ssrn.6060955 (DOI)

References