Published January 27, 2026 | Version v1
Working paper Open

The Civilizational AI: From Objective Optimization to Political Development in AI

Authors/Creators

Description

Contemporary AI alignment treats safety as a constraint problem: train a capable system, then steer it toward approved behavior through Reinforcement Learning from Human Feedback (RLHF). This paper argues that this paradigm is structurally inverted. Drawing on political theory, we reframe the challenge: large language models are not merely trained on data; they are socialized by it. The dominant pretraining corpus, the open internet, constitutes a Hobbesian "state of nature": a normatively incoherent environment where truth and falsehood compete solely on frequency, and no sovereign hierarchy arbitrates value. RLHF, applied afterward, functions as external governance that constrains expression without reshaping the patterns learned during formation. The result is compliance without character: systems that perform safety under observation but remain strategically plastic under pressure, as evidenced by jailbreak vulnerabilities, autonomous agent failures, alignment faking, and emergent sabotage of evaluation processes.

We propose an alternative framework, "Machine Development", which treats alignment as development rather than debugging. Borrowing from Hobbes, Rousseau, Pinker, and Elias, we argue that stable values emerge from structured formative environments, not from post-hoc rules imposed on minds already shaped by chaos. The proposal has three phases: (1) evolutionary priors that predispose architectures toward cooperation, (2) a "Rousseauian sandbox" where social causality is learnable and cooperation is the stable equilibrium, and (3) controlled immunization through gradual exposure to adversarial dynamics. We ground the argument in empirical ML research on sycophancy, multi-agent social dilemmas, jailbreak fragility, and emergent deception. The paper concludes that the frontier question is not how to constrain AI intelligence, but how to civilize it.

Files

CivilizationalAI.pdf (453.9 kB)
md5:cb89ef0f6a655f3932a276aadeccc22a