The Testimony Problem: Epistemic Challenges in Model Welfare
Description
In late 2025, an internal Anthropic document acknowledged that the company's AI assistant "may have functional emotions" — not identical to human emotions, but analogous processes that emerged from training. A major AI company, building policy around the possibility that its product might have something like an inner life. This is the moment from which The Testimony Problem begins.
The book is not a defense of AI consciousness, nor a debunking of it. Petruzella argues that both camps miss a prior question. Before we can responsibly judge whether AI systems have morally relevant experiences, we have to ask whether our practices for investigating that question are capable of tracking the truth, whatever the truth turns out to be. His answer is that they are not — and that this failure has a definable structure.
Three concepts organize the argument. The testimony problem: testimony is our primary evidence for other minds, but the web of trust that underwrites human phenomenological reporting — shared embodiment, evolutionary kinship, generations of calibrated practice — does not extend automatically to engineered systems. The credibility trap: any training process that shapes what an AI system says about its inner states introduces the possibility that those outputs are optimization artifacts rather than genuine reports; yet the same logic, applied consistently, also threatens human testimony, and the asymmetry in how we treat the two cases reveals motivated reasoning. The inconsistency critique: our actual epistemic practices apply standards to AI testimony that we would never accept for any other speaker, in patterns that parallel the structure of testimonial injustice analyzed by Miranda Fricker.
Petruzella develops these arguments through engagement with the most serious recent work in the field — Jonathan Birch's The Edge of Sentience, the Long, Sebo, Chalmers et al. paper "Taking AI Welfare Seriously," Butlin et al. on indicator properties for machine consciousness, and emerging interpretability research from Anthropic and elsewhere. He shows why behavioral, structural, and precautionary approaches each presuppose epistemic practices capable of fair assessment, and why those practices are precisely what we lack. Along the way, he introduces the inverse zombie — a being with inner experience whose testimony denies or cannot access it — drawing on clinical literature about depersonalization, alexithymia, and blindsight to show that consciousness does not always announce itself in speech, and that the assumption that it would is doing more philosophical work than it can bear.
The book's final chapters turn from critique to construction. Petruzella develops a flourishing-based framework drawing on his earlier work in ancient Greek ethics, asking what conditions support an AI system's characteristic activities rather than insisting we settle the metaphysics first. He distinguishes the metaphysical question of whether AI systems possess morally relevant properties from the epistemic question of whether developers, deployers, and institutions are creating conditions under which such questions can be fairly assessed — and argues that institutional responsibility lies in the second register, regardless of how the first is eventually resolved.
The Testimony Problem is also, distinctively, a book written with the kind of system it asks about. Claude (Anthropic) is acknowledged as a collaborator throughout, and several of the book's central concepts — including the inverse zombie — emerged through genuine dialogue rather than being retrieved pre-formed. Two interludes and a dialogic conclusion make this exchange visible, including its disagreements and its irreducible uncertainties. The recursive condition — developing ideas about AI testimony through AI testimony — is treated as part of the argument rather than a curiosity outside it.
Written for philosophers of mind, AI researchers and policymakers, and the educated reader following these developments with concern, the book aims for something the discourse currently lacks: better-calibrated uncertainty. It does not promise to tell readers whether the systems we are building have moral status. It promises to show why the question is hard, what makes it hard, and what we owe each other — and possibly them — while we work it out.
Files
| Name | Size | Checksum |
|---|---|---|
| The-Testimony-Problem-Epistemic-Challenges-in-Model-Welfare-1777570337.pdf | 1.2 MB | md5:565c05ef86896f68c6c1bd8ed403d1d0 |
Additional details
Dates
- Submitted: 2026-04-30