TTQ: An Implementation-Neutral Solution to the Outer AGI Superalignment Problem
Description
The way in which AI (and, in particular, agentic superintelligent AGI) develops over the coming decades will determine the fate of all humanity for all eternity. In order to maximise the net benefit of AGI for all humanity, without favouring any subset thereof, we imagine a Gold-Standard AGI that is maximally aligned and maximally validated. The first of these properties, alignment, is traditionally decomposed into outer alignment (how do we define a final goal FG_G that correctly states what we want?) and inner alignment (how do we build an agent G that forever pursues FG_G as intended?). This paper presents a complete theory of AGI, culminating in a proposed solution to the problem of outer alignment in the case that G is superintelligent (hence "superalignment"). We formulate a final goal TTQ and a corresponding Outer Alignment Precondition OAP such that, if a goal-less superintelligent agent-under-construction S^- satisfies OAP (irrespective of the specific technology used to implement S^-), then the final goal TTQ works as intended ("strives to maximise the net benefit of AGI for all humanity, without favouring any subset thereof"). That is, the superintelligent agent S (where S = S^- + TTQ) forever strives (to the best of its ability, which is at least that of any human) to behave in a manner that is at all times maximally aligned with a maximally fair aggregation of the individual idealised (i.e. actual, rational, well-informed, and freely determined) preferences of all human beings (living or future). Thus the (hard) problem of building a maximally aligned agentic superintelligence S is reduced to the (much easier) problem of building an OAP-compliant non-agentic superintelligence S^-. Given the AGI alignment problem's profound relevance to AGI governance, we adopt a pedagogic style throughout so that the paper is accessible to less technical readers such as AGI policymakers.
Files
| Name | Size | Checksum |
|---|---|---|
| TTQ___Outer_AGI_Superalignment___AIE-DRAFT-v189.pdf | 9.7 MB | md5:b360560cd59666c33a513fe1ee6e3e75 |