Neural Path Machines (NPM). A Unified Framework for Trajectory-Based Interpretability, Internal-State Debugging, and Causal What-If Interventions
Description
Neural networks achieve remarkable performance across domains, yet their internal computation remains largely opaque. During inference, activations evolve as a sequence of hidden states whose dynamics ultimately determine the model’s output. Traditional interpretability techniques focus on input–output relationships or gradient-based attributions and provide limited insight into the internal computational process itself.
This report introduces the Neural Path Machine (NPM), a framework for making neural computation observable at the level of internal trajectories. NPM records activation paths, identifies unstable or influential transitions, and enables causal what-if interventions by modifying activations during execution. These capabilities transform a neural network from a black box into a transparent discrete dynamical system whose internal states can be inspected, manipulated, and systematically debugged.
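As a rough illustration of the two operations described above (recording an activation path and applying a what-if intervention during execution), the following minimal sketch uses a toy tanh network in plain Python. The network, its weights, and the names `run`, `step`, and `clamp_unit` are assumptions made for this example; they are not taken from the report and do not represent its implementation.

```python
import math

# Hypothetical toy network: each layer is (weights, biases) followed by tanh.
# The numbers are illustrative only.
LAYERS = [
    ([[0.5, -0.2], [0.1, 0.8]], [0.0, 0.1]),
    ([[1.0, 0.3], [-0.4, 0.6]], [0.05, 0.0]),
    ([[0.7, -0.5]], [0.0]),
]

def step(h, layer):
    """Apply one transition h_t -> h_{t+1}."""
    weights, biases = layer
    return [math.tanh(sum(w * x for w, x in zip(row, h)) + b)
            for row, b in zip(weights, biases)]

def run(x, intervene=None):
    """Return the full activation path [h_0, ..., h_T].

    `intervene` is an optional function (t, h) -> h that may replace the
    state reached after transition t before it feeds the next layer.
    """
    h = list(x)
    path = [list(h)]
    for t, layer in enumerate(LAYERS, start=1):
        h = step(h, layer)
        if intervene is not None:
            h = intervene(t, h)
        path.append(list(h))
    return path

def clamp_unit(layer_index, unit, value):
    """Build an intervention that overwrites one unit of one hidden state."""
    def _intervene(t, h):
        if t == layer_index:
            h = list(h)
            h[unit] = value
        return h
    return _intervene

baseline = run([1.0, -1.0])
what_if = run([1.0, -1.0], intervene=clamp_unit(1, 0, 0.0))

# Causal effect of the intervention on the single output unit.
effect = what_if[-1][0] - baseline[-1][0]
```

Comparing `baseline` and `what_if` state by state shows where the two trajectories diverge, and `effect` summarises the downstream consequence of the edit. In a real framework the same pattern would typically be realised through layer hooks rather than a hand-written loop.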
By exposing the structure of computational paths, NPM provides a principled foundation for tracing model failures, analysing sensitivity and robustness, and performing targeted model corrections. The trajectory-based perspective also suggests new training possibilities that operate on internal transitions rather than solely on output errors; these extensions are developed in a separate companion report. Overall, NPM offers a coherent and practical methodology for studying and controlling the internal behaviour of neural networks, bridging interpretability, diagnostics, and dynamical analysis within a unified framework.
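One possible way to make the sensitivity analysis mentioned above concrete is a finite-difference probe: nudge each unit of each intermediate state, continue the computation from that point, and measure how far the final state moves. The sketch below is an assumption-laden illustration on a toy tanh chain; the weights, the `eps` value, and the names `trajectory`, `run_from`, and `state_sensitivity` are hypothetical and are not described in the report.

```python
import math

# Hypothetical toy chain of tanh layers; the weights are illustrative only.
WEIGHTS = [
    [[0.5, -0.2], [0.1, 0.8]],
    [[2.5, 0.3], [-0.4, 0.6]],
    [[0.7, -0.5], [0.2, 0.9]],
]

def transition(h, w):
    """One step of the discrete dynamics: h_{t+1} = tanh(W h_t)."""
    return [math.tanh(sum(a * x for a, x in zip(row, h))) for row in w]

def trajectory(x):
    """Record every state visited from the input to the final layer."""
    path = [list(x)]
    for w in WEIGHTS:
        path.append(transition(path[-1], w))
    return path

def run_from(h, start):
    """Continue the computation from state h at depth `start`."""
    for w in WEIGHTS[start:]:
        h = transition(h, w)
    return h

def state_sensitivity(path, eps=1e-4):
    """Score each state h_t by the largest finite-difference response of the
    final state to a small nudge of any single unit of h_t."""
    final = path[-1]
    scores = []
    for t, h in enumerate(path[:-1]):
        worst = 0.0
        for i in range(len(h)):
            nudged = list(h)
            nudged[i] += eps
            out = run_from(nudged, t)
            change = max(abs(a - b) for a, b in zip(out, final))
            worst = max(worst, change / eps)
        scores.append(worst)
    return scores

path = trajectory([1.0, -1.0])
scores = state_sensitivity(path)

# Depth whose state the final output reacts to most strongly.
most_influential = max(range(len(scores)), key=scores.__getitem__)
```

States with high scores are candidates for closer inspection or targeted correction, while low scores suggest that errors introduced at that depth are damped by later transitions. Gradient-based Jacobians would give the same information more efficiently on real models; finite differences are used here only to keep the example dependency-free.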
Files
Neural Path Machines (NPM).pdf (192.3 kB, md5:5310991d290e0c25a320c92b26bc6832)