The Non-Delegable Core: Designing Legitimate Oversight for Agentic AI
Description
As artificial intelligence systems become increasingly autonomous and assume oversight roles over other AI systems, traditional models of governance are rapidly eroding. This paper introduces the concept of the non-delegable core: governance functions that must remain under human authority not because AI lacks the technical capability, but because democratic legitimacy requires it. We identify an Accountability-Capability Paradox, in which the very success of AI systems in surpassing human capacity undermines our ability to oversee them meaningfully, and propose the Human-AI Governance (HAIG) framework, a dimensional model that reconceives oversight along three axes: decision authority, process autonomy, and accountability configuration. Rather than defaulting to recursive AI-monitoring-AI hierarchies that obscure responsibility and invite failure, HAIG establishes adaptive trust thresholds that maintain human comprehensibility and control where it matters most. We illustrate how HAIG enables anticipatory, flexible, and stakeholder-responsive governance in critical domains such as medical triage, autonomous vehicles, and content moderation. The paper concludes with policy recommendations and institutional innovations, including AI audit courts and algorithmic juries, that support hybrid governance systems capable of sustaining democratic legitimacy in the age of agentic AI.
Files
| Name | Size |
|---|---|
| The Non-Delegable Core_2025 06 10_Final_Pre-print_Zenodo.pdf (md5:051de241a22ea140f2c94acd5c9397d6) | 225.9 kB |
Additional details
Related works
- Is variant form of:
  - Preprint: arXiv:2505.01651 (arXiv)
  - Preprint: arXiv:2505.11579 (arXiv)
Dates
- Submitted: 2025-06-10
References
- Bengio, Y., Hinton, G., Yao, A., Song, D., Abbeel, P., Darrell, T., ... & Mindermann, S. (2024). Managing extreme AI risks amid rapid progress. Science, 384(6698), 842-845. https://doi.org/10.1126/science.adn0117
- Clark, A. (2025). Extending Minds with Generative AI. Nature Communications, 16, 4627. https://doi.org/10.1038/s41467-025-59906-9
- Cohen, M. K., Kolt, N., Bengio, Y., Hadfield, G. K., & Russell, S. (2024). Regulating advanced artificial agents. Science, 384(6691), 36-38. https://doi.org/10.1126/science.adl0625
- Danaher, J. (2016). The threat of algocracy: Reality, resistance and accommodation. Philosophy & Technology, 29(3), 245-268. https://doi.org/10.1007/s13347-015-0211-1
- Engin, Z. (2025). Human-AI Governance (HAIG): A Trust-Utility Approach. arXiv preprint arXiv:2505.01651. https://doi.org/10.48550/arXiv.2505.01651
- Engin, Z., & Hand, D. (2025). Toward Adaptive Categories: Dimensional Governance for Agentic AI. arXiv preprint arXiv:2505.11579.
- Floridi, L. (2025). AI as Agency without Intelligence: On Artificial Intelligence as a New Form of Artificial Agency and the Multiple Realisability of Agency Thesis. Philosophy & Technology, 38(1), 30. https://doi.org/10.1007/s13347-025-00858-9
- Kolt, N. (2025). Governing AI agents. arXiv preprint arXiv:2501.07913. https://doi.org/10.48550/arXiv.2501.07913
- Qiu, S., Liu, Q., Zhou, S., & Wu, C. (2019). Review of artificial intelligence adversarial attack and defense technologies. Applied Sciences, 9(5), 909. https://doi.org/10.3390/app9050909
- Rahwan, I., Cebrian, M., Obradovich, N., Bongard, J., Bonnefon, J. F., Breazeal, C., ... & Wellman, M. (2019). Machine behaviour. Nature, 568(7753), 477-486. https://doi.org/10.1038/s41586-019-1138-y
- Vallor, S. (2016). Technology and the virtues: A philosophical guide to a future worth wanting. Oxford University Press. https://doi.org/10.1093/acprof:oso/9780190498511.001.0001
- Winner, L. (2017). Do artifacts have politics? In Computer ethics (pp. 177-192). Routledge. https://www.taylorfrancis.com/chapters/edit/10.4324/9781315259697-21/artifacts-politics-langdon-winner