When Metrics Lie: A Longitudinal Audit Framework for Detecting Pluralism Erosion in Human–AI Sociotechnical Systems

Lizzio, Andrew Gerard

doi:10.5281/zenodo.19964833

Published April 2026 | Version v7

Preprint Open

When Metrics Lie: A Longitudinal Audit Framework for Detecting Pluralism Erosion in Human–AI Sociotechnical Systems

Lizzio, Andrew Gerard (Researcher)

Contemporary AI evaluation measures model-level properties, safety, helpfulness, bias, at discrete points in time. These approaches share a structural blind spot: they cannot detect the longitudinal erosion of the most valuable asset humanity has ever produced.

Human diversity is not a demographic statistic. It is the accumulated creative capital of thousands of years of conflict, friction, migration, and exchange. It is the fuel of civilisational creativity, and it was built at enormous cost. Minority populations and communities of the Global South carry disproportionate shares of this diversity, precisely because they have been least absorbed into dominant knowledge systems. AI systems, trained predominantly on majority-population data and optimised for aggregate engagement, are eroding this catalog at pace. We cannot value what we cannot measure. We cannot protect what we cannot value.

This paper proposes a remedy. The Humane Intelligence Quotient (HIQ) is a longitudinal audit framework designed to serve as the measurement engine for an AI alignment and ethics leaderboard: a public, replicable, empirically grounded mechanism for evaluating AI systems not on capability alone, but on their capacity to preserve and enrich human diversity over time. The framework is organised around three measurable dimensions: diversity of behavioural repertoires, connectivity of value circulation across populations, and temporal agility in adaptation.

The case rests on empirical ground. Analysing 529,428 real-world human–AI conversations across eight months, conventional engagement indicators suggested improvement. HIQ structural indicators told a different story: a statistically significant decline in behavioural diversity (slope = −0.012, p = 0.020) and misleading connectivity gains driven by the differential collapse of minority-population input diversity. When you measure what actually matters, you find things that standard metrics are structurally incapable of seeing. The full analysis pipeline and code are publicly available for independent replication.

Files

Humane Intelligence - v8.4 - Main.pdf

Files (524.9 kB)

Name	Size	Download all
Humane Intelligence - v8.4 - Main.pdf md5:311ba972fb78eae05e51ef216610bcfc	266.1 kB	Preview Download
Humane Intelligence - v8.4 - Supplementary.pdf md5:fdacb4c1aeed53672400d9f7ed311112	258.9 kB	Preview Download

Additional details

Submitted: 2025-06-06

Initial
Updated: 2025-07-13

Significant updates
Updated: 2026-04

Significant updates

	All versions	This version
Views	218	36
Downloads	151	36
Data volume	65.8 MB	10.6 MB

When Metrics Lie: A Longitudinal Audit Framework for Detecting Pluralism Erosion in Human–AI Sociotechnical Systems

Authors/Creators

Description

Files

Humane Intelligence - v8.4 - Main.pdf

Files (524.9 kB)

Additional details

Dates