Published April 30, 2026 | Version v1
Preprint Open

Epistemic Twins: Enabling a Symbolic Science of Language Model Knowledge

  • 1. TU Dresden
  • 2. ScaDS.AI
  • 3. University of Tübingen
  • 4. VNU University of Engineering and Technology

Description

Large Language Models (LLMs) are impactful yet opaque artifacts. At their core, they are subsymbolic constructs defined by billions of numeric weights that interact in a largely inscrutable manner.
Current analysis paradigms are either black-box benchmarks that test model performance on predefined tasks, or mechanistic interpretability approaches that trace outputs back to specific weights.
Both are limited by the experimenter's hypothesis space: one must know what to look for in order to find it. In this perspective, we argue for a third, radically different analysis paradigm: Epistemic Twins. We propose constructing large-scale symbolic approximations of LLMs in human-readable formats. This enables the comprehensive materialization of the factual knowledge (or beliefs) inherent in a model without predefining hypotheses, facilitating large-scale analysis and auditing toward better understanding and explainability.
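To make the idea of a symbolic approximation concrete, the following is a minimal sketch of how factual statements might be materialized as human-readable triples and then audited without a predefined hypothesis. It is illustrative only and is not taken from the preprint: the breadth-first crawl is an assumed strategy, and `toy_model`, `materialize`, and `Triple` are hypothetical names. `toy_model` is a canned stand-in for a real LLM call.

```python
from collections import deque
from typing import Callable, NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


def toy_model(entity: str) -> list[tuple[str, str]]:
    """Hypothetical stand-in for an LLM: returns (predicate, object) pairs."""
    canned = {
        "Dresden": [("located in", "Saxony"), ("country", "Germany")],
        "Saxony": [("capital", "Dresden"), ("country", "Germany")],
        "Germany": [("capital", "Berlin")],
    }
    return canned.get(entity, [])


def materialize(
    ask_model: Callable[[str], list[tuple[str, str]]],
    seed: str,
    max_entities: int = 100,
) -> list[Triple]:
    """Breadth-first crawl (an assumed strategy): ask about an entity,
    store the returned statements, then follow each object as a new entity."""
    visited: set[str] = set()
    queue = deque([seed])
    triples: list[Triple] = []
    while queue and len(visited) < max_entities:
        entity = queue.popleft()
        if entity in visited:
            continue
        visited.add(entity)
        for predicate, obj in ask_model(entity):
            triples.append(Triple(entity, predicate, obj))
            if obj not in visited:
                queue.append(obj)
    return triples


twin = materialize(toy_model, "Dresden")

# Auditing becomes a plain query over the materialized statements.
capitals = [t for t in twin if t.predicate == "capital"]
for t in twin:
    print(f"{t.subject} | {t.predicate} | {t.obj}")
```

Once the statements exist as explicit records, questions that were never anticipated during extraction (for example, listing every `capital` statement) reduce to ordinary filtering over the result.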

Files

Epistemic_Twins.pdf (110.2 kB)
md5:526fa47bf77374cca1b5fd0e60090d85

Additional details

Software

Repository URL
https://gptkb.org