Provenance Erasure Rate: A Compression-Survival Metric for Attribution Loss in AI-Composed Search Outputs
Description
Research note and metric proposal. AI retrieval systems increasingly compose answers from human-authored sources. This paper introduces Provenance Erasure Rate (PER) as a metric measuring the proportion of source-dependent claims in an AI-composed output that are presented without explicit attribution. PER does not ask whether an output is true; it asks whether the sources that made the output possible remain visible inside the composition.
A motivating case study documents a Google AI Overview that constructed a false biography of a living author from real fragments of the author's published poetry: every fragment survived compression, but its provenance and meaning did not. PER for this output = 1.0 (total provenance erasure).
PER is formalized with claim-grain weighting, distinguished from citation precision/recall and AIS-style support metrics (Rashkin et al. 2023; Gao et al. 2023; Liu et al. 2023), and interpreted as an economic signal: a rate at which compositional authority migrates from named sources to system-level synthesis. The paper proposes PER as a candidate indicator for attribution-layer governance, labor accounting, and retrieval transparency.
PER is orthogonal to content-preservation metrics (ROUGE, BERTScore) and complementary to existing citation evaluation frameworks. It measures the attribution gap — the space between what the system uses and what it credits.
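The core computation can be sketched as a weighted fraction. This is an illustrative reading of the definition above, not the paper's formalization: the claim-grain weights and the binary attribution judgment are assumptions for the sake of the example.

```python
def provenance_erasure_rate(claims):
    """Hedged sketch of PER with claim-grain weighting.

    claims: iterable of (weight, attributed) pairs, one per
    source-dependent claim in the AI-composed output, where
    weight is an assumed claim-grain weight and attributed is
    True when the claim carries an explicit source citation.

    Returns the weighted proportion of source-dependent claims
    presented without explicit attribution (0.0 to 1.0).
    """
    total = sum(w for w, _ in claims)
    if total == 0:
        # No source-dependent claims: nothing to erase.
        return 0.0
    erased = sum(w for w, attributed in claims if not attributed)
    return erased / total

# The case-study output: every source-dependent claim unattributed.
print(provenance_erasure_rate([(1.0, False), (2.0, False), (0.5, False)]))  # -> 1.0
```

Under this reading, the motivating case study scores 1.0 because no claim retains a citation, regardless of how faithfully the fragments themselves were preserved.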
The metric emerges from the Semantic Economy framework (DOI: 10.5281/zenodo.18320411) but can be used independently of that framework. A validation agenda is outlined.
Files
| Name | Size |
|---|---|
| Provenance_Erasure_Rate_v1.0.md (md5:d851e83a60d7ddb62613e929d3a7ab9e) | 24.4 kB |
Additional details
Subjects
- Artificial intelligence
- http://id.loc.gov/authorities/subjects/sh85008180
- Information retrieval
- http://id.loc.gov/authorities/subjects/sh85066148