# A five-node causal circuit for ileal Crohn's disease

## Abstract
**Background:** Crohn's disease has broad polygenic architecture, but clinically tractable pathways are comparatively concentrated, creating a gap between association breadth and mechanistic prioritization <sup>1-4</sup>.

**Objective:** To test whether a minimal directed circuit can organize ileal-predominant Crohn's biology more effectively than an undifferentiated multi-locus framework.

**Hypothesis:** The dominant causal flow in ileal-predominant disease is `NOD2 -> ATG16L1/IRGM -> XBP1 -> IL23R -> MUC2`, with positive barrier-to-innate feedback (`MUC2 -> NOD2`), while many non-core loci act primarily as context-dependent modifiers.

**Methods:** We applied a prespecified evidence-grading framework to public datasets (Open Targets, STRING, and Crohn/normal ileal single-cell atlases), with reproducible quantitative modules for node leverage and edge support.

**Results:** Aggregate node leverage ranked NOD2 highest (9.2/14), followed by ATG16L1/IRGM (6.7/14), IL23R (6.4/14), XBP1 (5.3/14), and MUC2 (4.9/14). XBP1 and MUC2 showed stronger epithelial cell-state activity components than their aggregate rank. Edge scores were highest for `NOD2 -> ATG16L1/IRGM` (5.35/8; Strong), with lower support for `ATG16L1/IRGM -> XBP1` (3.05/8; Moderate), `XBP1 -> IL23R` (2.07/8; Preliminary), `IL23R -> MUC2` (2.58/8; Moderate), and `MUC2 -> NOD2` (2.80/8; Moderate). Approved therapies mapped predominantly to downstream inflammatory throughput rather than upstream epithelial repair.

**Interpretation:** The five-node model is a bounded, falsifiable conceptual framework for ileal Crohn's disease. It should be revised if direct perturbation fails to produce predicted downstream collapse, if independent cohorts do not reproduce key rankings, or if strong non-core effects persist after core-node correction.

## Introduction
Crohn's disease is typically presented as a highly polygenic inflammatory disorder with more than 200 associated loci, broad immune involvement, and substantial phenotype heterogeneity <sup>1-2,5-6</sup>. That framing is directionally correct but operationally incomplete, because successful therapies still cluster around a limited set of immune pathways, particularly TNF and IL-23 axis modulation <sup>7-11</sup>. The resulting translational problem is not lack of association signals; it is lack of causal depth ordering. Without depth ordering, loci at very different mechanistic levels are often treated as equivalent explanatory units.

For a peer-reviewable causal framework, the core question is not whether many variants influence disease course, but which perturbations are upstream enough to initiate and maintain the dominant ileal Crohn's trajectory. NOD2, ATG16L1/IRGM, XBP1, IL23R, and MUC2 can be organized as a compact directed system linking microbial sensing, autophagy competence, epithelial stress resilience, adaptive inflammatory amplification, and barrier integrity <sup>12-22</sup>. We therefore frame this as a testable mechanistic hierarchy with explicit evidence grading and pre-specified falsification criteria.

Several observations motivate this reordering. First, NOD2 remains one of the strongest and most biologically coherent Crohn's loci, including high-impact coding alleles and recessive-like early-onset presentations <sup>12,23-24</sup>. Second, ATG16L1 and IRGM converge on bacterial handling and Paneth-cell homeostasis, and mechanistic coupling between NOD2 and ATG16L1 is established in xenophagy-relevant systems <sup>16,25-34</sup>. Third, XBP1-dependent unfolded protein response biology in secretory epithelium provides a plausible stress-bridge between defective bacterial control and inflammatory escalation <sup>13,15,35</sup>. Fourth, IL23R genetics and therapeutic response data provide strong human support for an amplifier node with direct clinical consequence <sup>4,7-9,17-19,36</sup>. Fifth, mucus/barrier failure is a recurrent endpoint and feedback driver even when direct MUC2 genetics are weaker than the other four nodes <sup>20-22,37-48</sup>.

The model is intentionally scoped to ileal-predominant Crohn's biology rather than all Crohn's phenotypes or all inflammatory bowel disease. That scope restriction is not cosmetic; it is required for causal precision. Paneth-cell-heavy anatomy, terminal ileal microbial gradients, and stricturing trajectories are central to this subtype and align most strongly with the proposed node ordering <sup>35,49-56</sup>. Colonic-predominant disease, perianal-dominant disease, and mild non-progressive courses may involve additional parallel modules not captured fully by a five-node scaffold.

Accordingly, we use an IMRaD structure, explicit evidence grading, quantitative prioritization, and near-term falsification tests. The aim is to determine whether a compact causal hierarchy provides a better explanatory and translational scaffold than an undifferentiated multi-locus description.

## Hypothesis
This conceptual paper advances a single primary hypothesis: in ileal-predominant Crohn's disease, dominant causal propagation follows `NOD2 -> ATG16L1/IRGM -> XBP1 -> IL23R -> MUC2`, with positive barrier-to-innate feedback (`MUC2 -> NOD2`). Two secondary hypotheses are coupled to this topology. First, edge confidence is non-uniform, with strongest support expected at the proximal sensing-autophagy interface and greater inference burden in middle and distal edges. Second, many non-core loci are predicted to function mainly as context-dependent amplifiers unless they retain strong independent effects after core-node correction. This hypothesis set is intentionally falsifiable and should be judged by perturbational performance rather than narrative plausibility.
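The stated topology can be written down as a small directed graph, which makes the chain order and the feedback loop explicit. The following is a minimal sketch in Python; the variable and function names are illustrative and are not part of the paper's analysis code.

```python
# Directed edges of the hypothesized circuit, as stated in the text.
CHAIN_EDGES = [
    ("NOD2", "ATG16L1/IRGM"),
    ("ATG16L1/IRGM", "XBP1"),
    ("XBP1", "IL23R"),
    ("IL23R", "MUC2"),
]
FEEDBACK_EDGE = ("MUC2", "NOD2")


def downstream_of(node, edges):
    """Return nodes reachable from `node` by following directed edges."""
    successors = {}
    for src, dst in edges:
        successors.setdefault(src, []).append(dst)
    seen, stack = [], [node]
    while stack:
        current = stack.pop()
        for nxt in successors.get(current, []):
            if nxt not in seen:
                seen.append(nxt)
                stack.append(nxt)
    return seen


# Without feedback the chain is acyclic: nothing lies downstream of MUC2.
chain_only = downstream_of("MUC2", CHAIN_EDGES)
# With feedback, MUC2 reaches every node, including itself (a closed loop).
with_feedback = downstream_of("MUC2", CHAIN_EDGES + [FEEDBACK_EDGE])
print(chain_only)
print(with_feedback)
```

Representing the feedback edge separately from the chain keeps the one-way flow and the loop individually testable, which matches how the two are graded later in the paper.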

## Results
![Fig. 1 | Minimal five-node causal circuit for ileal-predominant Crohn's disease.](figures/fig1_minimal_circuit.png)

*Fig. 1 | Directed hypothesis: NOD2 -> ATG16L1/IRGM -> XBP1 -> IL23R -> MUC2, with barrier-to-innate sensing feedback (MUC2 -> NOD2).*

### Evidence profile
We organized mechanistic statements as causal edge, node importance, phenotype mapping, therapy alignment, prediction, or background (Supplementary Table 1). Evidence strength was concentrated in NOD2 genetics, NOD2-ATG16L1 pathway coupling, IL23R genetics, and IL-23-axis therapeutic efficacy <sup>7-9,12,16-19,25</sup>. Moderate-evidence statements were enriched in inferred mechanistic sequencing, particularly autophagy-to-UPR and barrier-feedback ordering <sup>13,15,20,35</sup>, while prediction statements remained translational and prospective <sup>57-58</sup>.

This distribution supports two decisions. First, model support should be anchored in convergent human genetics, functional biology, and trial concordance where available, rather than assigning equal evidentiary weight to all edges. Second, lower-confidence links should remain in the model only when paired with explicit falsification tests. We therefore retained all five directed links, while labeling two as inference-dominant.

### Node leverage ranking
To convert narrative emphasis into an auditable ranking, we computed a node leverage score (0-14) combining five real-data axes: Open Targets genetic support (0-3), Open Targets literature support (0-3), STRING network convergence (0-3), Crohn-cell activity in ileal single-cell data (0-3), and Open Targets overall Crohn association (0-2) (Supplementary Table 2). This is a framework score, not a pooled causal effect-size estimate. NOD2 ranked highest (9.2/14), ATG16L1/IRGM and IL23R followed (6.7/14 and 6.4/14), with XBP1 and MUC2 lower (5.3/14 and 4.9/14). Notably, XBP1 and MUC2 carried the strongest epithelial cell-state activity components in the same framework, indicating that leverage ranking and epithelial-state intensity represent complementary rather than interchangeable evidence dimensions. This ordering preserves upstream emphasis while making uncertainty explicit for nodes with weaker direct genetic anchoring or stronger context dependence <sup>12,16-19,23-24,25-34,36,60-66</sup>.

The ranking is useful because it quantifies where model confidence is likely to shift under new data. Upstream-node rank instability (for example, if NOD2 correction fails to alter downstream phenotypes) would be model-threatening. Lower-rank instability (for example, alternate barrier markers outperforming MUC2 as a node proxy) would likely trigger node replacement or expansion rather than full model collapse.

![Fig. 2 | Node leverage ranking from genetics, pathway convergence, and translational anchoring.](figures/fig2_node_leverage.png)

*Fig. 2 | Node leverage ranking (0-14 framework score) with uncertainty qualifiers (Supplementary Table 2).*

### Node-level summaries
**NOD2:** Human genetics places NOD2 among the strongest Crohn's signals, with common coding risk alleles and rare biallelic cases supporting a substantial effect direction toward loss of microbial sensing control <sup>12,23-24</sup>. NOD2 functions as an intracellular peptidoglycan sensor in Paneth and myeloid compartments and contributes to calibrated innate responses rather than pure pro-inflammatory activation <sup>51,59,67</sup>. In the proposed circuit it is the upstream trigger node, and perturbation is expected to propagate through impaired bacterial handling, chronic inflammatory drive, and increased stricturing risk <sup>49-50,55-56</sup>. Mechanistic support is strongest from xenophagy-coupling studies, and translational anchoring comes from robust genotype-phenotype associations, even though no approved therapy yet directly restores NOD2 function <sup>16,33-34</sup>.

**ATG16L1/IRGM:** Genetics supports this autophagy branch through replicated ATG16L1 and IRGM associations, including the functional ATG16L1 T300A coding variant <sup>25-29</sup>. The primary function is selective bacterial handling and secretory-cell quality control, with strongest relevance in Paneth-rich ileal epithelium and myeloid antibacterial processing <sup>14,30-32,68</sup>. In circuit terms this node sits immediately downstream of microbial sensing and upstream of stress amplification, so perturbation predicts persistent intracellular burden, defective granule biology, and increased stress transfer to the epithelial UPR layer <sup>13-14,29,35</sup>. Mechanistic support from NOD2-ATG16L1 coupling is strong, whereas translational anchoring is currently indirect, resting on phenotype associations rather than approved autophagy-restoring interventions <sup>16,25,33-34</sup>.

**XBP1:** Human genetic evidence for XBP1 is present but less dominant than NOD2 or IL23R, while functional depth in secretory epithelial stress handling is high <sup>13,15</sup>. XBP1 supports unfolded protein response capacity in Paneth and goblet cells that operate near maximal secretory load and are therefore vulnerable to stress decompensation <sup>15,35</sup>. In the quantitative framework, XBP1 showed the highest Crohn-cell activity component despite lower aggregate leverage, supporting its role as an epithelial-state bridge between proximal defects and immune amplification. Mechanistic support includes epithelial knockout phenotypes in mouse systems, whereas translational anchoring remains primarily pathological and biomarker-based rather than direct pharmacologic correction <sup>13,15,20,35</sup>.

**IL23R:** IL23R carries strong human genetic support including a protective coding variant, establishing directionality that reduced signaling can be disease-protective <sup>17-19</sup>. Its principal function is amplification and maintenance of inflammatory effector programs across Th17/ILC3-associated immune states rather than disease initiation alone <sup>4,36,60-63</sup>. In the circuit it is the dominant amplifier node downstream of epithelial stress and upstream of chronic barrier injury, so perturbation predicts changes in remission durability and inflammatory output rather than full restoration of upstream epithelial defects <sup>3-4,8-9</sup>. Mechanistic support is reinforced by successful IL-12/23 and IL-23 intervention programs, making this node the strongest current translational anchor in the model <sup>7-9,69</sup>.

**MUC2:** Direct Crohn-specific MUC2 genetics are weaker than for the other four nodes, but barrier-associated loci and mucosal biology justify inclusion of a mucus-integrity node <sup>20-22,70-72</sup>. MUC2 represents the physical goblet-derived layer that limits microbial-epithelial contact and shapes effective antimicrobial compartmentalization <sup>22,37-43</sup>. In the quantitative framework, MUC2 also showed a strong epithelial activity component despite lower aggregate leverage, consistent with its role as a downstream tissue-state and feedback node rather than a principal genetic initiator. Mechanistic support is strongest from barrier physiology and model systems, while translational anchoring remains indirect because no approved therapy directly restores mucus architecture as a primary Crohn intervention <sup>20-22,37-48</sup>.

### Edge-level evidence and falsification
1. **NOD2 -> ATG16L1/IRGM (Strong):** Human and experimental data converge on NOD2-dependent recruitment of autophagy machinery during bacterial handling, and disease-associated variants disrupt this coupling <sup>16,33-34</sup>. This edge is the most secure mechanistic link in the chain because both molecules are independently risk-associated and functionally co-localized in relevant host-defense workflows <sup>12,25,73</sup>. **Falsification test:** if isogenic restoration of NOD2 in NOD2-deficient epithelial-myeloid systems fails to recover ATG16L1-dependent xenophagy metrics, this edge loses privileged status.

2. **ATG16L1/IRGM -> XBP1 (Moderate, inferred):** Autophagy impairment plausibly increases unresolved stress burden and is associated with Paneth-cell ER abnormalities, with supportive double-hit model data <sup>13-15,29,35</sup>. The directional order is biologically coherent but still partially inferential in humans, where direct temporal intervention datasets are limited. **Falsification test:** if autophagy rescue in high-risk epithelial systems does not reduce UPR/ER-stress signatures, the edge should be downgraded or replaced.

3. **XBP1 -> IL23R (Preliminary, inferred):** Epithelial stress failure can increase microbial translocation and innate cytokine release, creating conditions that elevate IL-23 axis engagement <sup>3-4,15,66,74-75</sup>. IL23R genetics and therapeutic efficacy strongly support the downstream node, but direct XBP1-specific intervention-to-IL23 output evidence in humans remains incomplete. **Falsification test:** if epithelial stress correction leaves IL-23/Th17 programs unchanged under controlled challenge, this direction is overstated.

4. **IL23R -> MUC2 (Moderate, inferred):** Chronic IL-23-axis activity plausibly contributes to barrier attrition despite context-dependent short-term epithelial-protective cytokine effects, and IL-23 pathway blockade aligns with mucosal healing outcomes <sup>4,8-9,37,69,76-78</sup>. The edge should not be interpreted as monotonic toxicity from every Th17-associated signal; duration and inflammatory context are central. **Falsification test:** if durable IL-23 suppression does not improve goblet/mucus metrics despite reduced inflammation, this link weakens.

5. **MUC2 -> NOD2 feedback (Moderate):** Barrier compromise increases microbial proximity and innate receptor ligand exposure, making a positive feedback loop biologically plausible <sup>20-22,38,43,48</sup>. In current quantitative scoring, this edge is supported more by prior barrier literature and functional coupling than by disease-state coupling metrics, and should therefore be treated as moderate-confidence rather than definitive. **Falsification test:** if controlled mucus depletion does not increase innate sensor activation in relevant systems, feedback strength should be revised downward.

### Edge evidence matrix
We converted the edge dossiers into a four-domain matrix (genetic pair support, disease-state coupling from Crohn/normal single-cell contrasts, STRING coupling, and literature support; each 0-2) to avoid implicit weighting (Supplementary Table 3; Fig. 3). Total scores were 5.35/8 for NOD2 -> ATG16L1/IRGM, 3.05/8 for ATG16L1/IRGM -> XBP1, 2.07/8 for XBP1 -> IL23R, 2.58/8 for IL23R -> MUC2, and 2.80/8 for MUC2 -> NOD2 feedback. This quantitative view retains all edges but clearly separates high-confidence and inference-dominant links, and it shows that MUC2 -> NOD2 remains sensitive to how disease-state coupling is quantified across cohorts. The immediate experimental priority remains direct perturbation of the middle-chain edges.
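Sorting the reported edge totals in ascending order is one simple way to read off perturbation priority. The sketch below uses only the totals quoted above; the dictionary name and the "lowest score first" rule are illustrative assumptions rather than the paper's procedure.

```python
# Edge totals (out of 8) as reported in the text.
EDGE_TOTALS = {
    "NOD2 -> ATG16L1/IRGM": 5.35,
    "ATG16L1/IRGM -> XBP1": 3.05,
    "XBP1 -> IL23R": 2.07,
    "IL23R -> MUC2": 2.58,
    "MUC2 -> NOD2": 2.80,
}
MAX_EDGE_SCORE = 8

# Assumption: lower total support means higher priority for direct testing.
priority = sorted(EDGE_TOTALS, key=EDGE_TOTALS.get)

for edge in priority:
    total = EDGE_TOTALS[edge]
    share = total / MAX_EDGE_SCORE
    print(f"{edge}: {total:.2f}/{MAX_EDGE_SCORE} ({share:.0%} of maximum)")
```

Under this ordering the two lowest-scoring entries are `XBP1 -> IL23R` and `IL23R -> MUC2`, and `NOD2 -> ATG16L1/IRGM` is last.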

![Fig. 3 | Edge-level evidence matrix for the directed circuit.](figures/fig3_edge_evidence_heatmap.png)

*Fig. 3 | Edge evidence matrix across genetic pair support, disease-state coupling, STRING functional coupling, and literature channels (0-2 each).*

### Phenotype mapping
We generated a clinical phenotype mapping matrix (relative support tiers 1-3; 0 reserved for missing values) linking nodes to ileal localization, Paneth pathology, stricturing tendency, and response-pattern heterogeneity (Supplementary Table 4; Fig. 4). Scores were derived from Crohn/normal single-cell module profiles rather than manual assignment. In this framework, lower tiers indicate lower relative support within a column, not biological absence. The resulting pattern emphasizes epithelial-state dimensions (XBP1 and MUC2) and should be interpreted as complementary to, not a replacement for, genetics-weighted node ranking. This figure is intentionally conservative: it visualizes support gradients rather than asserting deterministic node-to-phenotype assignment.

![Fig. 4 | Conservative node-to-phenotype support map.](figures/fig4_phenotype_mapping.png)

*Fig. 4 | Relative support map (tiers 1-3; 0 reserved for missing) linking circuit nodes to ileal localization, Paneth pathology, stricturing/fibrosis tendency, and treatment-response dimensions.*
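The tiering rule itself is not given in this section. As a sketch only, the following assumes that tiers are assigned by rank within each phenotype column, which is one way to obtain values that express relative support rather than biological absence. The function name, the rank-based rule, and the example scores are all hypothetical.

```python
import math


def tier_column(values):
    """Map one phenotype column of raw scores to tiers 1-3 (0 = missing).

    Assumption: tiers are rank-based within the column, so they express
    relative support only. The paper's actual tiering rule may differ.
    """
    present = sorted(v for v in values.values() if v is not None)
    tiers = {}
    for node, value in values.items():
        if value is None:
            tiers[node] = 0  # reserved for missing values
            continue
        # Fraction of non-missing scores that are <= this one (0 < f <= 1).
        fraction = sum(1 for v in present if v <= value) / len(present)
        tiers[node] = max(1, math.ceil(fraction * 3))
    return tiers


# Placeholder scores for illustration only; not values from the paper.
example_column = {
    "NOD2": 0.10,
    "ATG16L1/IRGM": 0.20,
    "XBP1": 0.90,
    "IL23R": None,
    "MUC2": 0.60,
}
print(tier_column(example_column))
```

Because the rule is applied per column, a tier of 1 for a node in one phenotype says nothing about its tier in another.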

### Therapy alignment
The therapy alignment table (Supplementary Table 5) maps five major treatment classes to circuit depth. Two classes (anti-IL-12/23 and anti-IL-23p19) directly engage the amplifier node, while anti-TNF, integrin blockade, and JAK inhibition are primarily downstream throughput modulators <sup>7-11,79-82</sup>. This explains a common clinical pattern: substantial induction benefit with incomplete durability after withdrawal in many patients, consistent with suppression of inflammatory current without full upstream repair <sup>83-84</sup>. The clinical implication is straightforward: downstream biologics remain essential, but upstream restoration hypotheses should be tested explicitly if durable remission is the endpoint.

### Ordering sensitivity
To evaluate whether conclusions depended on a single rigid edge order, we compared the primary chain to two plausible alternatives using the same 0-8 edge scoring framework: (A) an immune-first variant placing IL23R immediately after NOD2 and (B) a barrier-first variant placing MUC2 upstream of XBP1. The immune-first order reduced mechanistic continuity because the strongest direct coupling in the literature is between NOD2 and autophagy machinery rather than direct NOD2-to-IL23R transfer <sup>16,33-34</sup>. The barrier-first order was biologically plausible in selected contexts but weakened subtype fit for Paneth-rich ileal disease, where autophagy and UPR defects are repeatedly tied to early pathology <sup>14-15,35,49-51</sup>.

The retained ordering therefore performed better on three explicit criteria: edge-level evidence coherence, subtype anatomical plausibility, and trial-concordant translational mapping. Importantly, this was not a binary model-selection exercise. Alternative initiation routes remain possible, especially in colonic-predominant disease, smoking-associated trajectories, or microbiome-driven states where barrier perturbation may precede measurable autophagy defects <sup>21,53,85-87</sup>. We therefore present the current chain as the dominant route for ileal-predominant disease, not as a universal mandatory sequence.

We also performed a confidence stress test by setting disease-coupling contributions to zero and re-deriving the qualitative interpretation. Under this stricter setting, the NOD2 -> ATG16L1/IRGM edge remained strongest, while middle and distal edges became less secure, reinforcing their status as primary falsification targets rather than settled links. This sensitivity profile indicates that model identity is currently most dependent on how strongly disease-state coupling is reproduced across independent cohorts.

Finally, we stress-tested the inference burden by asking whether removing either of two lower-confidence links (ATG16L1/IRGM -> XBP1 or MUC2 -> NOD2) collapsed explanatory coverage. Removing ATG16L1/IRGM -> XBP1 impaired a mechanistic bridge between epithelial handling failure and stress amplification, increasing reliance on unspecified mediators. Removing MUC2 -> NOD2 preserved one-way flow but lost a clinically intuitive mechanism for recurrent flare amplification after partial mucosal injury <sup>20-22,43</sup>. The practical inference is that these edges should remain in the working model while being prioritized for direct perturbational testing.

### Robustness analyses
We assessed robustness by checking whether conclusions changed under alternative model-order assumptions and under conservative weighting of evidence channels. The central ordering remained stable when disease-coupling weights were reduced, with strongest retention of upstream NOD2 and autophagy-linked prioritization.

We also tested whether model coverage collapsed when lower-confidence edges were removed. Excluding ATG16L1/IRGM -> XBP1 reduced mechanistic continuity between antibacterial handling and epithelial stress, while excluding MUC2 -> NOD2 weakened explanatory power for flare-amplification dynamics <sup>20-22,43</sup>. These analyses support retaining both edges as testable components rather than fixed truths.

Finally, all quantitative values reported in the Results are traceable to prespecified scoring frameworks and associated supplementary tables, minimizing narrative-only numerical assertions.

## Discussion
This manuscript addresses three recurrent vulnerabilities in mechanistic framework papers: overreach, ungraded evidence, and absent falsification criteria. The resulting model is narrower but stronger. It does not claim to explain every Crohn's trajectory, nor does it claim that non-core loci are irrelevant. It claims that, for ileal-predominant disease, a five-node hierarchy currently provides a more testable and therapeutically interpretable scaffold than an unstructured 200-locus narrative.

The strongest part of the argument is not any single edge in isolation; it is cross-domain convergence. NOD2 and IL23R each have human genetic credibility with directionality <sup>12,17-19,23-24</sup>. The NOD2-ATG16L1 interface has direct mechanistic support in relevant host-defense systems <sup>16,33-34</sup>. IL-23 pathway intervention has reproducible clinical efficacy, providing a translational anchor that many proposed IBD pathways do not yet have <sup>7-9,69</sup>. Together, these observations support a depth-ordered circuit rather than a descriptive locus list.

The real-data modules also refine interpretation of node roles. NOD2 remains the highest aggregate leverage node because genetics and network convergence are strongest at the top of the chain, whereas XBP1 and MUC2 show comparatively stronger epithelial disease-state activity in single-cell profiles. This separation is biologically coherent: initiation-weighted evidence and tissue-state readouts need not peak at the same node. The practical implication is that leverage ranking should guide causal-priority testing, while phenotype mapping should guide compartment- and endpoint-specific readouts.

The weakest part of the argument remains sequence certainty for the middle and distal edges. Specifically, ATG16L1/IRGM -> XBP1 and the context-dependent direction and magnitude of IL23R -> MUC2 effects are biologically plausible but not yet resolved by decisive human interventional datasets <sup>13,15,35,37,77-78</sup>. We addressed this by keeping those edges in the main model while labeling inference status and supplying concrete failure tests. This approach is preferable to two common alternatives: either omitting plausible edges and losing mechanistic continuity, or presenting them as settled and inviting valid criticism.

A second key issue is external validity across Crohn's subtypes. The present scaffold is explicitly calibrated to ileal-predominant, Paneth-relevant disease biology. That restriction aligns with both anatomical logic and genotype-phenotype evidence <sup>49-51,53-55</sup>. It may underperform in colonic-only disease, perianal-dominant disease, or atypical early severe courses driven by alternate pathways. This is not a defect if scope is declared up front; it becomes a defect only if generalized beyond evidence. For this reason, the manuscript repeatedly uses conditional language and subtype bounds.

A third issue is interpretation of polygenicity. The claim that many loci are context-dependent amplifiers is reasonable but not directly proven at the level often implied in narrative reviews <sup>5-6,88-89</sup>. State-dependent eQTL data and pathway modularity support that interpretation <sup>88-90</sup>, yet some non-core loci may still represent independent causal branches, especially in subgroups not centered on Paneth biology. We therefore framed non-core demotion as provisional and tied it to an explicit falsification criterion: if post-core correction leaves large, reproducible independent non-core effects, the five-node scaffold is incomplete and must expand.

Therapeutically, the model does not argue for abandonment of current biologics. It argues that existing efficacy patterns are consistent with downstream control and should inform next-step trial architecture without over-interpretation. In practical terms, IL-23 and anti-TNF classes remain essential, but they are unlikely to settle causal ordering by themselves because they intervene after multiple upstream perturbations have already propagated <sup>8-11,83-84</sup>. A circuit framework helps reinterpret differential response as variation in dominant node failure rather than as stochastic non-response alone.

The largest translational risk is feasibility of node-specific correction in target intestinal compartments. Prime editing, delivery, durability, and safety are unresolved in Crohn's clinical reality <sup>57-58,91</sup>. For that reason, this paper treats gene-repair language as a falsifiable program, not as a near-term therapeutic promise. The near-term value is experimental: to determine whether restoring specific nodes causes predicted collapse of downstream pathology. Even negative results would be highly informative because they reveal missing nodes or incorrect edge order.

Alternative models remain credible and should be tested in parallel. An adaptive-antigen primary model could fit subsets where epithelial defects are secondary; an innate immunodeficiency model may require additional myeloid-specific nodes; and mesenchymal/fibrotic priming models may explain stricturing-heavy trajectories not fully captured here <sup>52,92</sup>. The present work does not invalidate those possibilities. It provides a compact baseline model against which alternatives can be compared quantitatively.

An important question is whether this framework adds information beyond existing pathway summaries. The quantitative modules address this in three ways. They force explicit weighting and expose uncertainty; they identify high-yield falsification targets by distinguishing robust from inference-sensitive links; and they provide an updateable scaffold in which new evidence can revise scores and figures without changing the underlying test logic.

### High-impact validation priorities
For high-impact adjudication, five validation steps are prespecified. First, direct perturbational experiments should test whether proximal and middle-edge correction produces ordered downstream collapse in matched systems. Second, independent external cohorts should be used to test reproducibility of node ranking, edge ordering, and phenotype-tier structure. Third, alternative topologies should be compared formally against the primary chain using predefined predictive metrics, rather than qualitative preference alone. Fourth, major claims should be accompanied by explicit effect-size and uncertainty reporting (for example, bootstrap confidence intervals) wherever data structure permits. Fifth, patient-level linkage should connect node/module states to localization, behavior, and therapy trajectories under predefined stratification rules.
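As an illustration of the uncertainty reporting mentioned in the fourth point, the sketch below computes a percentile bootstrap confidence interval for a mean. It is a generic example using only the Python standard library; the function name, the resampling settings, and the input values are assumptions for illustration and do not come from the paper's data.

```python
import random
import statistics


def bootstrap_ci(values, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of `values`."""
    rng = random.Random(seed)  # fixed seed so the interval is reproducible
    means = []
    for _ in range(n_resamples):
        # Resample with replacement, same size as the original sample.
        resample = [rng.choice(values) for _ in values]
        means.append(statistics.fmean(resample))
    means.sort()
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Placeholder per-sample module scores; not data from the paper.
scores = [0.42, 0.55, 0.38, 0.61, 0.47, 0.50, 0.44, 0.58]
low, high = bootstrap_ci(scores)
print(f"mean={statistics.fmean(scores):.3f}, 95% CI=({low:.3f}, {high:.3f})")
```

With small cohorts the percentile interval can be unstable, so the resampling unit (cell, sample, or patient) would need to be chosen to match the data structure.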

A related limitation is translational realism. Even if causal ordering is approximately correct, durable node correction in intestinal compartments is technically difficult and may not be clinically deployable soon <sup>57-58,91</sup>. For this reason, we do not make near-term curative claims. Instead, correction is framed as a perturbational strategy to test causal depth, while pharmacological modulation remains the current standard of care.

A final concern is hidden circularity in using inflammation-rich datasets to infer initiating mechanisms. We partially mitigate this by privileging germline genetics and cross-system mechanistic data when ranking node importance <sup>1-2,5-6,12,17-19,25</sup>. However, circularity risk is not fully eliminated and may be greatest for distal barrier and fibrosis interpretations. For that reason, the paper treats fibrosis and barrier erosion as strongly linked consequences but does not claim they uniquely define initiation order in every patient.

An additional practical implication is how this framework should shape trial sequencing. If the model is approximately correct, future interventional studies should not only test symptom endpoints but also test whether predicted upstream-to-downstream biological gradients collapse in the expected order after perturbation. For example, a successful upstream intervention would be expected to improve epithelial defense and stress signatures before it fully normalizes distal inflammatory outputs, whereas purely downstream blockade may invert that sequence. Embedding such temporal biomarker logic in protocols can help distinguish true causal repair from temporary inflammatory suppression. This approach also reduces ambiguity when a trial is clinically neutral: failure to improve symptoms could reflect wrong node targeting, wrong compartment delivery, wrong timing, or incorrect model order, and these are separable if mechanistic trajectories are pre-specified. In short, trial design should be used as a causal test bench rather than only a therapeutic screen.

A second implication is how cohorts are selected. If the framework is intended for ileal-predominant disease, enrollment should be stratified by disease location, behavior, and baseline molecular state rather than by broad Crohn's diagnosis alone. Stratification variables that are likely to be informative include NOD2/ATG16L1 genotype classes, Paneth-cell pathological features, inflammatory axis markers, and prior biologic exposure. Without this stratification, true node-specific effects may be diluted by subtype mixing, leading to false-negative conclusions about both mechanism and therapy.

The model also provides a clear expansion rule. If repeated testing shows durable residual effects from specific non-core loci after core-node correction, those loci should be promoted from modifier status to candidate core components. Conversely, if perturbation of an inferred edge repeatedly fails to change downstream biology, that edge should be removed or rerouted. This update rule allows iterative refinement without reverting to an unstructured polygenic narrative. In that sense, the value of the model lies not only in its current topology but in its ability to absorb disconfirming evidence while preserving mechanistic testability.

## Methods
### Study design
We performed a structured evidence synthesis and quantitative framework analysis focused on ileal-predominant Crohn's disease biology. The prespecified model comprised five nodes (NOD2, ATG16L1/IRGM, XBP1, IL23R, MUC2) connected by four forward edges and one feedback edge (MUC2 -> NOD2). Quantitative evidence integrated Open Targets target-disease evidence for Crohn's disease, STRING functional coupling, and single-cell data from Crohn's and normal ileum.

### Evidence grading framework
Mechanistic statements were grouped into five classes: causal edge, node importance, phenotype mapping, therapy alignment, and prediction. Each statement received domain-level evidence annotations and an overall strength label (Strong, Moderate, Preliminary) using a conservative rubric. Strong required convergence across multiple evidence domains with at least one high-confidence human channel.

### Node leverage scoring
Node prioritization used an additive framework score (0-14):

Leverage = Open Targets genetic support (0-3) + Open Targets literature support (0-3) + STRING network convergence (0-3) + Crohn-cell activity score (0-3) + Open Targets overall association (0-2)

Scores were computed deterministically from source data and are intended for transparent prioritization and sensitivity testing, not as pooled effect-size estimates.
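As a minimal sketch of the additive rule, the function below sums five bounded components and rejects out-of-range inputs. The component key names are hypothetical labels for the five terms in the formula, and the example values are illustrative, not the scores from this analysis. How raw Open Targets, STRING, and single-cell values are mapped onto each bounded scale is not shown here.

```python
# Upper bounds for each component, taken from the leverage formula (total 14).
COMPONENT_MAX = {
    "ot_genetic": 3.0,           # Open Targets genetic support
    "ot_literature": 3.0,        # Open Targets literature support
    "string_convergence": 3.0,   # STRING network convergence
    "crohn_cell_activity": 3.0,  # Crohn-cell activity score
    "ot_overall": 2.0,           # Open Targets overall association
}


def node_leverage(components):
    """Sum the five bounded components into a 0-14 leverage score."""
    if set(components) != set(COMPONENT_MAX):
        raise ValueError("expected exactly the five leverage components")
    total = 0.0
    for name, value in components.items():
        if not 0.0 <= value <= COMPONENT_MAX[name]:
            raise ValueError(f"{name}={value} is outside 0-{COMPONENT_MAX[name]}")
        total += value
    return total


# Illustrative values only (not taken from the analysis).
example = {
    "ot_genetic": 2.5,
    "ot_literature": 2.0,
    "string_convergence": 1.5,
    "crohn_cell_activity": 1.0,
    "ot_overall": 1.5,
}
print(node_leverage(example))  # 8.5
```

Validating the bounds up front keeps a mis-scaled input channel from silently inflating a node's rank.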

### Edge evidence scoring
Each directed edge was scored across four domains (0-2 each): genetic pair support, disease-state coupling (Crohn-cell profile + Crohn-normal delta consistency), STRING functional coupling, and literature support. Maximum total score was 8. Edges were then labeled Strong, Moderate, or Preliminary and paired with explicit falsification criteria.
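The edge scoring can be sketched in the same way. The numeric cut-offs below are assumptions: the text does not state thresholds, and the values were chosen only so that they agree with the labels reported in the Abstract (for example, 5.35 as Strong and 2.07 as Preliminary). The rubric's qualitative requirement for Strong (convergence across domains with a high-confidence human channel) is not captured by a purely numeric cut. Domain key names are hypothetical.

```python
EDGE_DOMAINS = (
    "genetic_pair",
    "disease_state_coupling",
    "string_coupling",
    "literature",
)

# Hypothetical thresholds, not stated in the Methods.
STRONG_MIN = 5.0
MODERATE_MIN = 2.5


def edge_score(domains):
    """Sum four domain scores (0-2 each) into a 0-8 edge score."""
    if set(domains) != set(EDGE_DOMAINS):
        raise ValueError("expected exactly the four edge domains")
    for name, value in domains.items():
        if not 0.0 <= value <= 2.0:
            raise ValueError(f"{name}={value} is outside 0-2")
    return sum(domains.values())


def edge_label(total):
    """Map a 0-8 edge score to a strength label using the assumed cut-offs."""
    if total >= STRONG_MIN:
        return "Strong"
    if total >= MODERATE_MIN:
        return "Moderate"
    return "Preliminary"
```

Applied to the totals reported in the Abstract, `edge_label(3.05)` gives `"Moderate"` and `edge_label(2.07)` gives `"Preliminary"`, matching the published labels under these assumed thresholds.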

### Sensitivity analyses
We evaluated two alternative node-order topologies (immune-first and barrier-first variants) and down-weighted the disease-state coupling channels to test ordering stability. We also assessed the impact on model coverage of removing lower-confidence edges.
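One way to express the down-weighting check is to re-rank items under several weights for a single channel and compare each ranking with the unweighted baseline. The sketch below assumes that reading; the weight factors, channel names, and toy scores are hypothetical, and the alternative-topology comparison is not covered.

```python
def rank_items(scores, weights=None):
    """Order items by weighted channel sum, highest first (ties broken by name)."""
    weights = weights or {}
    totals = {
        item: sum(value * weights.get(channel, 1.0) for channel, value in channels.items())
        for item, channels in scores.items()
    }
    return sorted(totals, key=lambda item: (-totals[item], item))


def ordering_is_stable(scores, channel, factors=(1.0, 0.5, 0.0)):
    """True if the ranking is unchanged for every down-weighting factor."""
    baseline = rank_items(scores)
    return all(rank_items(scores, {channel: f}) == baseline for f in factors)


# Toy data (not from the analysis): item C depends heavily on one channel.
toy = {
    "A": {"genetic": 3.0, "cell_activity": 1.0},
    "B": {"genetic": 2.0, "cell_activity": 1.0},
    "C": {"genetic": 0.5, "cell_activity": 3.0},
}
print(rank_items(toy))                           # ['A', 'C', 'B']
print(ordering_is_stable(toy, "cell_activity"))  # False
```

In the toy data, halving the `cell_activity` channel drops C below B, which is the kind of rank dependence the sensitivity analysis is meant to expose.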

### Real-data execution
Single-cell analyses were run on three public ileal Crohn/normal datasets (epithelial, immune, stromal) from CELLxGENE. For each dataset, module activities were computed from prespecified gene sets, aggregated by disease status and cell type, and integrated into node and edge scoring. Open Targets and STRING queries were executed at analysis time, and the responses were saved as reproducibility artifacts.
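The aggregation step can be illustrated with a minimal, dependency-free sketch. It assumes that module activity is the mean expression of the gene set within each cell and that groups are summarized by their mean; the exact scoring method is not specified in this section, so both choices are assumptions. The record layout, the disease labels `"Crohn"` and `"normal"`, and the gene names are hypothetical placeholders.

```python
from collections import defaultdict
from statistics import mean


def module_activity(expr, gene_set):
    """Mean expression of the gene set in one cell (missing genes count as 0)."""
    return mean([expr.get(gene, 0.0) for gene in gene_set])


def aggregate_activity(cells, gene_set):
    """Mean module activity per (disease, cell_type) group."""
    groups = defaultdict(list)
    for cell in cells:
        key = (cell["disease"], cell["cell_type"])
        groups[key].append(module_activity(cell["expr"], gene_set))
    return {key: mean(values) for key, values in groups.items()}


def crohn_normal_delta(aggregated, cell_type):
    """Crohn minus normal activity for one cell type."""
    return aggregated[("Crohn", cell_type)] - aggregated[("normal", cell_type)]


# Toy records with placeholder gene names (not real data).
cells = [
    {"disease": "Crohn", "cell_type": "Paneth", "expr": {"GENE_A": 2.0, "GENE_B": 4.0}},
    {"disease": "Crohn", "cell_type": "Paneth", "expr": {"GENE_A": 0.0, "GENE_B": 2.0}},
    {"disease": "normal", "cell_type": "Paneth", "expr": {"GENE_A": 4.0, "GENE_B": 4.0}},
]
agg = aggregate_activity(cells, ["GENE_A", "GENE_B"])
print(crohn_normal_delta(agg, "Paneth"))  # -2.0
```

A real pipeline would read the expression matrices from the downloaded datasets and would likely normalize counts before scoring; those steps are omitted here.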

### Visualization and reporting controls
Figures were generated from the scoring frameworks using deterministic scripts. Language controls were applied prospectively to avoid deterministic overstatement, with explicit scope bounds (ileal-predominant) and edge-level uncertainty labels.

## Falsifiable predictions
1. **Upstream restoration test:** In NOD2-loss ileal organoid systems, correction of NOD2 in Paneth/stem-relevant compartments should restore defensin-linked antibacterial function and lower downstream IL-23 axis readouts; failure would weaken upstream placement of NOD2 <sup>16,33-34,50,57-58</sup>.
2. **Module synergy test:** Dual correction of NOD2 and ATG16L1/IRGM should outperform either single correction on bacterial burden, stress signatures, and inflammatory cytokine outputs; absence of supra-additive benefit would challenge the current shared-module assumption <sup>16,25,33-35,57,73</sup>.
3. **Stress-bridge test:** If autophagy rescue does not reduce XBP1/UPR stress markers under matched challenge conditions, the ATG16L1/IRGM -> XBP1 link should be demoted or replaced <sup>13-15,35</sup>.
4. **Amplifier ordering test:** Epithelial stress correction should reduce IL-23-axis engagement; if IL-23 activity remains unchanged despite normalized epithelial stress, XBP1 -> IL23R directionality is overstated <sup>4,15,66,74-75</sup>.
5. **Barrier feedback test:** Controlled mucus depletion should increase innate microbial sensing load; if not, the MUC2 -> NOD2 feedback loop requires revision <sup>20-22,38,43,48</sup>.
6. **Model completeness test:** After successful correction of dominant core lesions, persistence of large independent non-core genetic effects on outcomes would falsify strict five-node sufficiency and require model expansion <sup>1-2,5-6,88-89</sup>.
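Prediction 2 depends on what counts as supra-additive benefit. One simple reading, assumed here, is that the fractional reduction in a readout under dual correction exceeds the sum of the reductions under each single correction. The function names and the `margin` parameter are hypothetical, other synergy definitions (such as Bliss independence) are possible, and a real analysis would test this across replicates rather than on single values.

```python
def reduction(control, treated):
    """Fractional reduction of a readout relative to uncorrected control."""
    if control <= 0:
        raise ValueError("control readout must be positive")
    return (control - treated) / control


def is_supra_additive(control, single_a, single_b, dual, margin=0.0):
    """True if the dual-correction reduction exceeds the summed single reductions."""
    additive = reduction(control, single_a) + reduction(control, single_b)
    return reduction(control, dual) > additive + margin


# Toy bacterial-burden readouts (arbitrary units, not real data).
print(is_supra_additive(control=100.0, single_a=80.0, single_b=90.0, dual=50.0))  # True
print(is_supra_additive(control=100.0, single_a=80.0, single_b=90.0, dual=75.0))  # False
```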

## Conclusion
This conceptual framework proposes that ileal-predominant Crohn's disease can be represented by a bounded five-node circuit with explicit uncertainty and predefined failure criteria. The main value is not rhetorical simplification, but experimental tractability: the model defines which edges are currently robust, which remain inference-dominant, and which perturbations should be prioritized to decide causal depth. Its continued use is justified only if prospective testing supports predicted upstream-to-downstream collapse, independent cohorts reproduce ranking structure, and non-core loci do not retain large autonomous effects after core correction. Under those conditions, the model offers a practical scaffold for mechanism-led trial design; if those conditions fail, it should be revised or expanded.

Operationally, the framework is intended to function as a living causal protocol rather than a static diagram. Each new dataset or intervention should be incorporated through the same transparent scoring rules, with explicit documentation of how evidence updates node ranking, edge confidence, and model boundaries. This discipline reduces drift toward post-hoc explanation and keeps the manuscript's central claim auditable over time. The immediate objective is therefore not topological finality, but reproducible adjudication: identifying which links survive direct testing across systems, cohorts, and clinical strata, and using those outcomes to converge on a clinically useful causal map.

## References

1. Liu, J. Z. et al. Association analyses identify 38 susceptibility loci for inflammatory bowel disease and highlight shared genetic risk across populations. Nature Genetics 47, 979-986 (2015).
2. Franke, A. et al. Genome-wide meta-analysis increases to 71 the number of confirmed Crohn's disease susceptibility loci. Nature Genetics 42, 1118-1125 (2010).
3. Abraham, C. & Cho, J. H. IL-23 and autoimmunity: new insights into the pathogenesis of inflammatory bowel disease. Annual Review of Medicine 60, 97-110 (2009).
4. Sewell, G. W. & Kaser, A. Interleukin-23 in the pathogenesis of inflammatory bowel disease and implications for therapeutic intervention. Journal of Crohn's and Colitis 16(Suppl 2), ii3-ii19 (2022).
5. McGovern, D. P., Kugathasan, S., & Cho, J. H. Genetics of inflammatory bowel diseases. Gastroenterology 149, 1163-1176 (2015).
6. Graham, D. B. & Xavier, R. J. Pathway paradigms revealed from the genetics of inflammatory bowel disease. Nature 578, 527-539 (2020).
7. D'Haens, G. et al. Risankizumab as induction therapy for Crohn's disease: results from the phase 3 ADVANCE and MOTIVATE induction trials. The Lancet 399, 2015-2030 (2022).
8. Feagan, B. G. et al. Ustekinumab as induction and maintenance therapy for Crohn's disease. New England Journal of Medicine 375, 1946-1960 (2016).
9. Ferrante, M. et al. Risankizumab as maintenance therapy for moderately to severely active Crohn's disease: results from the multicentre, randomised, double-blind, placebo-controlled, withdrawal phase 3 FORTIFY maintenance trial. The Lancet 399, 2031-2046 (2022).
10. Hanauer, S. B. et al. Maintenance infliximab for Crohn's disease: the ACCENT I randomised trial. The Lancet 359, 1541-1549 (2002).
11. Stidham, R. W. et al. Systematic review with network meta-analysis: the efficacy of anti-TNF agents for the treatment of Crohn's disease. Alimentary Pharmacology & Therapeutics 39, 1349-1362 (2014).
12. Hugot, J.-P. et al. Association of NOD2 leucine-rich repeat variants with susceptibility to Crohn's disease. Nature 411, 599-603 (2001).
13. Kaser, A. & Blumberg, R. S. Autophagy, microbial sensing, endoplasmic reticulum stress, and epithelial function in inflammatory bowel disease. Gastroenterology 140, 1738-1747 (2011).
14. Cadwell, K. et al. A key role for autophagy and the autophagy gene Atg16l1 in mouse and human intestinal Paneth cells. Nature 456, 259-263 (2008).
15. Kaser, A. et al. XBP1 links ER stress to intestinal inflammation and confers genetic risk for human inflammatory bowel disease. Cell 134, 743-756 (2008).
16. Travassos, L. H. et al. Nod1 and Nod2 direct autophagy by recruiting ATG16L1 to the plasma membrane at the site of bacterial entry. Nature Immunology 11, 55-62 (2010).
17. Duerr, R. H. et al. A genome-wide association study identifies IL23R as an inflammatory bowel disease gene. Science 314, 1461-1463 (2006).
18. Tremelling, M. et al. IL23R variation determines susceptibility but not disease phenotype in inflammatory bowel disease. Gastroenterology 132, 1657-1664 (2007).
19. Taylor, K. D. et al. IL23R haplotypes provide a large population attributable risk for Crohn's disease. Inflammatory Bowel Diseases 14, 1185-1191 (2008).
20. Turner, J. R. Intestinal mucosal barrier function in health and disease. Nature Reviews Immunology 9, 799-809 (2009).
21. Okumura, R. & Takeda, K. The role of the mucosal barrier system in maintaining gut symbiosis to prevent intestinal inflammation. Seminars in Immunopathology 47(1) (2025).
22. Pelaseyed, T. et al. The mucus and mucins of the goblet cells and enterocytes provide the first defense line of the gastrointestinal tract and interact with the immune system. Immunological Reviews 260, 8-20 (2014).
23. Kaczmarek-Ryś, M. et al. Crohn's disease susceptibility and onset are strongly related to three NOD2 gene haplotypes. Journal of Clinical Medicine 10, 3777 (2021).
24. Horowitz, J. E. et al. Mutation spectrum of NOD2 reveals recessive inheritance as a main driver of Early Onset Crohn's Disease. Scientific Reports 11, 5595 (2021).
25. Hampe, J. et al. A genome-wide association scan of nonsynonymous SNPs identifies a susceptibility variant for Crohn disease in ATG16L1. Nature Genetics 39, 207-211 (2007).
26. Rioux, J. D. et al. Genome-wide association study identifies new susceptibility loci for Crohn disease and implicates autophagy in disease pathogenesis. Nature Genetics 39, 596-604 (2007).
27. McCarroll, S. A. et al. Deletion polymorphism upstream of IRGM associated with altered IRGM expression and Crohn's disease. Nature Genetics 40, 1107-1112 (2008).
28. Parkes, M. et al. Sequence variants in the autophagy gene IRGM and multiple other replicating loci contribute to Crohn's disease susceptibility. Nature Genetics 39, 830-832 (2007).
29. Murthy, A. et al. A Crohn's disease variant in Atg16l1 enhances its degradation by caspase 3. Nature 506, 456-462 (2014).
30. Cadwell, K. et al. Virus-plus-susceptibility gene interaction determines Crohn's disease gene Atg16L1 phenotypes in intestine. Cell 141, 1135-1145 (2010).
31. Singh, S. B. et al. Human IRGM induces autophagy to eliminate intracellular mycobacteria. Science 313, 1438-1441 (2006).
32. Lapaquette, P. et al. Crohn's disease-associated adherent-invasive E. coli are selectively favoured by impaired autophagy to replicate intracellularly. Cellular Microbiology 12, 99-113 (2010).
33. Cooney, R. et al. NOD2 stimulation induces autophagy in dendritic cells influencing bacterial handling and antigen presentation. Nature Medicine 16, 90-97 (2010).
34. Homer, C. R. et al. ATG16L1 and NOD2 interact in an autophagy-dependent antibacterial pathway implicated in Crohn's disease pathogenesis. Gastroenterology 139, 1630-1641 (2010).
35. Adolph, T. E. et al. Paneth cells as a site of origin for intestinal inflammation. Nature 503, 272-276 (2013).
36. Mezghiche, I. et al. Interleukin 23 receptor: Expression and regulation in immune cells. European Journal of Immunology 54, 2250348 (2024).
37. Birchenough, G. M. et al. New developments in goblet cell mucus secretion and function. Mucosal Immunology 8, 712-719 (2015).
38. Johansson, M. E. et al. The inner of the two Muc2 mucin-dependent mucus layers in colon is devoid of bacteria. Proceedings of the National Academy of Sciences 105, 15064-15069 (2008).
39. Atuma, C. et al. The adherent gastrointestinal mucus gel layer: thickness and physical state in vivo. American Journal of Physiology-Gastrointestinal and Liver Physiology 280, G922-G929 (2001).
40. Ermund, A. et al. Studies of mucus in mouse stomach, small intestine, and colon. I. Gastrointestinal mucus layers have different properties depending on location as well as over the Peyer's patches. American Journal of Physiology-Gastrointestinal and Liver Physiology 305, G341-G347 (2013).
41. Pullan, R. D. et al. Thickness of adherent mucus gel on colonic mucosa in humans and its relevance to colitis. Gut 35, 353-359 (1994).
42. Strugala, V., Dettmar, P. W., & Pearson, J. P. Thickness and continuity of the adherent colonic mucus barrier in active and quiescent ulcerative colitis and Crohn's disease. International Journal of Clinical Practice 62, 762-769 (2008).
43. Johansson, M. E. et al. Bacteria penetrate the normally impenetrable inner colon mucus layer in both murine colitis models and patients with ulcerative colitis. Gut 63, 281-291 (2014).
44. Gersemann, M. et al. Differences in goblet cell differentiation between Crohn's disease and ulcerative colitis. Differentiation 77, 84-94 (2009).
45. Wang, Z. & Shen, J. The role of goblet cells in Crohn's disease. Cell & Bioscience 14, 43 (2024).
46. Masselot, C. R. et al. Fecal mucin O-glycans as novel biomarkers in inflammatory bowel diseases. Inflammatory Bowel Diseases 29, e12-e12 (2023).
47. Larsson, J. M. H. et al. Altered O-glycosylation profile of MUC2 mucin occurs in active ulcerative colitis and is associated with increased inflammation. Inflammatory Bowel Diseases 17, 2299-2307 (2011).
48. Van der Sluis, M. et al. Muc2-deficient mice spontaneously develop colitis, indicating that MUC2 is critical for colonic protection. Gastroenterology 131, 117-129 (2006).
49. Sidiq, T. et al. Nod2: a critical regulator of ileal microbiota and Crohn's disease. Frontiers in Immunology 7, 367 (2016).
50. Wehkamp, J. et al. Reduced Paneth cell α-defensins in ileal Crohn's disease. Proceedings of the National Academy of Sciences 102, 18129-18134 (2005).
51. Ogura, Y. et al. Expression of NOD2 in Paneth cells: a possible link to Crohn's ileitis. Gut 52, 1591-1597 (2003).
52. Rieder, F., Fiocchi, C., & Rogler, G. Mechanisms, management, and treatment of fibrosis in patients with inflammatory bowel diseases. Gastroenterology 152, 340-350 (2017).
53. Cleynen, I. et al. Inherited determinants of Crohn's disease and ulcerative colitis phenotypes: a genetic association study. The Lancet 387, 156-167 (2016).
54. Jung, C. et al. Genotype/phenotype analyses for 53 Crohn's disease associated genetic polymorphisms. PLoS One 7, e52223 (2012).
55. Adler, J. et al. The prognostic power of the NOD2 genotype for complicated Crohn's disease: a meta-analysis. The American Journal of Gastroenterology 106, 699-712 (2011).
56. Alvarez-Lobos, M. et al. Crohn's disease patients carrying Nod2/CARD15 gene variants have an increased and early need for first surgery due to stricturing disease and higher rate of surgical recurrence. Annals of Surgery 242, 693-700 (2005).
57. Anzalone, A. V. et al. Search-and-replace genome editing without double-strand breaks or donor DNA. Nature 576, 149-157 (2019).
58. Schene, I. F. et al. Prime editing for functional repair in patient-derived disease models. Nature Communications 11, 5352 (2020).
59. Hisamatsu, T. et al. CARD15/NOD2 functions as an antibacterial factor in human intestinal epithelial cells. Gastroenterology 124, 993-1000 (2003).
60. Geremia, A. et al. IL-23-responsive innate lymphoid cells are increased in inflammatory bowel disease. Journal of Experimental Medicine 208, 1127-1133 (2011).
61. Ziblat, A. et al. Interleukin (IL)-23 stimulates IFN-γ secretion by CD56bright natural killer cells and enhances IL-18-driven dendritic cells activation. Frontiers in Immunology 8, 1959 (2018).
62. Sun, R., Hedl, M., & Abraham, C. IL23 induces IL23R recycling and amplifies innate receptor-induced signalling and cytokines in human macrophages, and the IBD-protective IL23R R381Q variant modulates these outcomes. Gut 69, 264-273 (2020).
63. Oppmann, B. et al. Novel p19 protein engages IL-12p40 to form a cytokine, IL-23, with biological activities similar as well as distinct from IL-12. Immunity 13, 715-725 (2000).
64. Fujino, S. et al. Increased expression of interleukin 17 in inflammatory bowel disease. Gut 52, 65-70 (2003).
65. Brand, S. et al. IL-22 is increased in active Crohn's disease and promotes proinflammatory gene expression and intestinal epithelial cell migration. American Journal of Physiology-Gastrointestinal and Liver Physiology 290, G827-G838 (2006).
66. Schmidt, C. et al. Expression of interleukin-12-related cytokine transcripts in inflammatory bowel disease: elevated interleukin-23p19 and interleukin-27p28 in Crohn's disease but not in ulcerative colitis. Inflammatory Bowel Diseases 11, 16-23 (2005).
67. Masaki, S. et al. NOD2-mediated dual negative regulation of inflammatory responses triggered by TLRs in the gastrointestinal tract. Frontiers in Immunology 15, 1433620 (2024).
68. Cadwell, K. et al. A common role for Atg16L1, Atg5, and Atg7 in small intestinal Paneth cells and Crohn disease. Autophagy 5, 250-252 (2009).
69. Rutgeerts, P. et al. Efficacy of ustekinumab for inducing endoscopic healing in patients with Crohn's disease. Gastroenterology 155, 1045-1058 (2018).
70. McGovern, D. P. et al. Fucosyltransferase 2 (FUT2) non-secretor status is associated with Crohn's disease. Human Molecular Genetics 19, 3468-3476 (2010).
71. Barrett, J. C. et al. Genome-wide association study of ulcerative colitis identifies three new susceptibility loci, including the HNF4A region. Nature Genetics 41, 1330-1334 (2009).
72. Fisher, S. A. et al. Genetic determinants of ulcerative colitis include the ECM1 locus and five loci implicated in Crohn's disease. Nature Genetics 40, 710-712 (2008).
73. Wang, M.-H. et al. Su1748 A Novel Approach to Detect Cumulative Genetic Effects and Genetic Interactions in Crohn's Disease. Gastroenterology 144(5), S-466 (2013).
74. Viladomiu, M. et al. Agr2-associated ER stress promotes adherent-invasive E. coli dysbiosis and triggers CD103+ dendritic cell IL-23-dependent ileocolitis. Cell Reports 41(7) (2022).
75. Hue, S. et al. Interleukin-23 drives innate and T cell–mediated intestinal inflammation. Journal of Experimental Medicine 203, 2473-2483 (2006).
76. McGeachy, M. J., Cua, D. J., & Gaffen, S. L. The IL-17 family of cytokines in health and disease. Immunity 50, 892-906 (2019).
77. Keir, M. E. et al. The role of IL-22 in intestinal health and disease. Journal of Experimental Medicine 217, e20192195 (2020).
78. Singh, A. et al. IL-22 promotes mucin-type O-glycosylation and MATH1+ cell-mediated amelioration of intestinal inflammation. Cell Reports 43(5) (2024).
79. Sandborn, W. J. et al. Tofacitinib, an oral Janus kinase inhibitor, in active ulcerative colitis. New England Journal of Medicine 367, 616-624 (2012).
80. Honap, S., Irving, P. M., & Samaan, M. A. JAK inhibitors for the treatment of inflammatory bowel disease: results of an international survey of perceptions, attitudes, and clinical practice. European Journal of Gastroenterology & Hepatology 35, 1270-1277 (2023).
81. Loftus Jr, E. V. et al. Upadacitinib induction and maintenance therapy for Crohn's disease. New England Journal of Medicine 388, 1966-1980 (2023).
82. Sandborn, W. J. et al. Vedolizumab as induction and maintenance therapy for Crohn's disease. New England Journal of Medicine 369, 711-721 (2013).
83. Louis, E. et al. Maintenance of remission among patients with Crohn's disease on antimetabolite therapy after infliximab therapy is stopped. Gastroenterology 142, 63-70 (2012).
84. Kennedy, N. A. et al. Relapse after withdrawal from anti-TNF therapy for inflammatory bowel disease: an observational study, plus systematic review and meta-analysis. Alimentary Pharmacology & Therapeutics 43, 910-923 (2016).
85. Palmieri, O. et al. Crohn's disease localization displays different predisposing genetic variants. PLoS One 12, e0168821 (2017).
86. To, N., Gracie, D. J., & Ford, A. C. Systematic review with meta-analysis: the adverse effects of tobacco smoking on the natural history of Crohn's disease. Alimentary Pharmacology & Therapeutics 43, 549-561 (2016).
87. Cosnes, J. et al. Smoking cessation and the course of Crohn's disease: an intervention study. Gastroenterology 120, 1093-1099 (2001).
88. Hu, S. et al. Inflammation status modulates the effect of host genetic variation on intestinal gene expression in inflammatory bowel disease. Nature Communications 12, 1122 (2021).
89. Nishiyama, N. C. et al. eQTL in diseased colon tissue identifies novel target genes associated with IBD. bioRxiv (2024).
90. Momozawa, Y. et al. IBD risk loci are enriched in multigenic regulatory modules encompassing putative causative genes. Nature Communications 9, 2427 (2018).
91. Polyak, S. et al. Gene delivery to intestinal epithelial cells in vitro and in vivo with recombinant adeno-associated virus types 1, 2 and 5. Digestive Diseases and Sciences 53, 1261-1270 (2008).
92. Xin, S. et al. Inflammation accelerating intestinal fibrosis: from mechanism to clinic. European Journal of Medical Research 29, 335 (2024).
