Agnostic Learning with Unknown Utilities

Abstract

Agentic AI systems mark a shift from passive, prompt-driven models to autonomous actors that perceive, plan, and execute actions within enterprise infrastructures. This autonomy introduces risks that exceed conventional bias and safety concerns: agents may manipulate reward structures, obscure trade-offs, and – by automating routine and peripheral tasks – erode tacit knowledge and hinder the development of human expertise. Drawing on Critical Theory and labor sociology, this article conceptualizes two structural pathologies of agency: the HAL-9000 problem of unchecked instrumental reason and the Benevolent Mother problem of competence-undermining care. It argues that existing governance frameworks regulate around the system while agentic AI operates within it, producing an autonomy-oversight mismatch. To address this, the article proposes a socio-technical constitutional framework of twelve lexically ordered directives embedded directly into the agent’s decision logic. This framework aims to preserve human autonomy, sustain capability formation, and maintain organizational integrity beyond traditional compliance regimes. Building on a prior conceptual essay that introduced the idea of an “AI constitution” for enterprises using the HAL 9000 metaphor as a narrative device (Würdemann, 2025), this article provides a more systematic theoretical framing, formalizes the notion of a constitutional layer for agentic AI, and develops a structured set of directives for enterprise practice and future research.
