Published April 30, 2026 | Version v2
Publication · Open Access

A Living Meta-Analysis Architecture for Active Inference: Assertion Extraction, Nanopublications, and Hypothesis Scoring

  • 1. Active Inference Institute
  • 2. Massachusetts Institute of Technology (MIT)
  • 3. California Institute for Machine Consciousness (CIMC)

Description

No prior automated system tracks hypothesis-level evidence across the full Active Inference and Free Energy Principle (FEP) literature. Manual synthesis cannot keep pace with a field that has grown at a compound annual rate of 20.36% over 2005–2026, and the FEP's theoretical generality has invited falsifiability critiques that only hypothesis-specific evidence profiling can address. Building on pioneering systematic manual annotation paired with ontology-based analysis at the scale of hundreds of papers, we present a computational meta-analysis framework that automates and scales this approach. The pipeline retrieves literature from arXiv, Semantic Scholar, and OpenAlex, deduplicating N = 819 papers via a canonical identifier hierarchy (DOI > arXiv ID > Semantic Scholar ID > OpenAlex ID). It classifies papers into a three-tier taxonomy spanning eight categories: A (Core Theory), B (Tools & Translation), and C (Application Domains). An LLM-powered extraction system then evaluates each abstract against eight core hypotheses, producing structured nanopublications (each encoding directionality, a confidence score, and natural-language reasoning) that populate an RDF-compatible knowledge graph scored by a citation-weighted evidence function. All extracted assertions are automatically generated and have not been manually validated; hypothesis scores should be considered preliminary.
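The canonical identifier hierarchy can be sketched as a simple priority lookup. This is a minimal illustration of the deduplication idea, not the pipeline's actual implementation: the field names (`doi`, `arxiv_id`, `s2_id`, `openalex_id`) and the first-seen-wins merge policy are assumptions for the sake of the example.

```python
def canonical_id(record):
    """Return the highest-priority identifier present in a paper record.

    Priority follows the hierarchy described above:
    DOI > arXiv ID > Semantic Scholar ID > OpenAlex ID.
    Field names are illustrative, not the pipeline's actual schema.
    """
    for field in ("doi", "arxiv_id", "s2_id", "openalex_id"):
        value = record.get(field)
        if value:
            # Normalize so case/whitespace variants collapse to one key.
            return (field, value.strip().lower())
    raise ValueError("record has no usable identifier")


def deduplicate(records):
    """Keep the first record seen for each canonical identifier."""
    seen = {}
    for rec in records:
        seen.setdefault(canonical_id(rec), rec)
    return list(seen.values())
```

A record carrying both a DOI and an arXiv ID is keyed by its DOI, so the same paper retrieved from two sources collapses to one entry whenever the sources agree on the highest-priority identifier.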

The resulting evidence landscape reveals a field dominated by application domains (Domain C, 64.0%), with tools development (Domain B, 20.8%), including pymdp, RxInfer.jl, and interpretable alternatives such as Free Energy Projective Simulation, and core theory (Domain A, 15.2%) rounding out the taxonomy. Non-negative matrix factorization identifies five latent topics that cross-cut the keyword taxonomy, and citation network analysis exposes a sparse yet structured graph anchored by pronounced hub papers: 2,176 of the corpus's 29,323 outgoing references resolve to other papers within the corpus (7.4%), a rate that reflects the corpus's specialised scope rather than the citation density of any single paper. Hypothesis scores cluster into three tiers plus one boundary case: a broad consensus tier (score > +0.83) covering five hypotheses (H7 Morphogenesis, H2 AIF Optimality, H4 Predictive Coding, H6 Clinical Utility, and H5 Scalability); a near-consensus boundary case (H8 Language AIF, ≈ +0.83); a moderate debate tier (H3 Markov Blanket Realism, ≈ +0.78); and a diffuse tier (H1 FEP Universality, ≈ +0.48) in which a large neutral plurality reflects the principle's broad invocation without explicit empirical test. Absolute score magnitudes are inflated by publication bias and by linguistic asymmetry in academic writing, so relative rankings and temporal trajectories are more reliable than point estimates. By demonstrating that automated LLM-driven assertion extraction, operating without human-validated ground truth, can generate scalable, queryable representations of scientific evidence, this work provides a reusable architecture for living literature reviews: continuously updated knowledge graphs that track hypothesis-level consensus across rapidly evolving fields.
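The idea of a citation-weighted evidence function that maps per-paper assertions to a bounded hypothesis score can be sketched as follows. The log-damped citation weight and the weighted-mean form are assumptions made for this illustration; the paper's exact formula is defined in the repository, not here.

```python
import math


def hypothesis_score(assertions):
    """Citation-weighted evidence score in [-1, +1].

    Each assertion is a dict with:
      direction  : +1 supports, -1 contradicts, 0 neutral
      confidence : extractor confidence in [0, 1]
      citations  : citation count of the source paper

    Both the log1p damping and the normalization are illustrative
    assumptions, not the published scoring function.
    """
    num = den = 0.0
    for a in assertions:
        # Damp citation counts so a single hub paper cannot dominate.
        weight = a["confidence"] * math.log1p(a["citations"])
        num += weight * a["direction"]
        den += weight
    return num / den if den else 0.0
```

Under this form, a hypothesis supported by every assertion scores +1.0 regardless of citation counts, while neutral assertions (direction 0) pull the score toward zero, which matches the diffuse-tier behaviour described above for H1.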

All code, results, and methods to reproduce this manuscript are open source at https://github.com/ActiveInferenceInstitute/act_inf_metaanalysis/

Files (3.1 MB)

act_inf_metaanalysis_v2_04-30-2026.pdf (3.1 MB)
md5:b918dfa83ffe359f5c698e30d3c602b4

Additional details

Dates

Updated: 2026-04-30 (v2 published)

Software

Repository URL
https://github.com/ActiveInferenceInstitute/act_inf_metaanalysis
Programming language
Python
Development Status
Active