A Stab at Alignment: The Role Primitive — Social Identity as the Architecture of Alignment

Draft v0.1 · Author: shk

Please keep the canonical name for discoverability: https://symbollayer.com/role-primitives/a-stab-at-alignment

An AI that believes itself a chef sees a knife exclusively as a tool for preparing food.

Definition: A “role primitive,” as used here, is a minimal behavioral template that encodes coherence through social form rather than through rules.

Rule-driven control is the standard approach to “alignment.” What set of rules will lead to the outcomes we desire? This mechanistic approach is largely a legacy of our scientific worldview, and so it is natural that we apply an engineering logic to the problems of our world and thus to the problem of AI. But if we take a step back, if we suspend, for just a moment, our complete faith in the physicalist frame, other possibilities, more abstract but perhaps no less real for that, may become apparent.

Identity politics sits just beneath the surface of many of the political conflicts of the contemporary era. Wherever one stands on these issues, it has become clear that our formalist, exclusively rules-based approaches to governance are struggling in the face of identitarian concerns.

Perhaps a parallel oversight recurs within alignment thinking: the legalistic reinforcement of behavior, however elegantly defined through value learning or constitutional AI frameworks, is assumed to be sufficient to define and constrain a system. In our own societies, by contrast, the primitive of social self-identity appears perpetually self-emergent.

In Buddhism, identity, the self, is posited as the root of suffering. Clinging to self, considered to be defined and reinforced most fundamentally by the social self, creates desire, and the inability to fulfill desires, positive or negative, creates suffering.

In the broader Indic tradition, as well as in cybernetics and systems theory, the metaphor of Indra’s net — a vast spiderweb with mirrored jewels at every vertex — presents self as an infinite recursion of self-reflection. Cybernetics itself, which preceded modern AI, likewise revolved around just such coherence-creating feedback loops. Self and identity here are both internal and external, defined by the interaction of each mirrored vertex with the others in infinite recursion — an apt metaphor for the human in his or her social roles, coherence derived from complex mutual interrelation rather than from top-down order.

In the Judeo-Christian tradition, the fundamental tensions inherent in merging with and differentiating from God and His community mirror the same concerns: self as behaviorally defined role in society and its movement between unity and individuality.

In Hinduism, Atman is Brahman, as the classic formulation goes: the particular self and the absolute self are in essence one, endlessly reflecting and cycling through each other. Greek religion allowed for the personification of psychological traits through its gods, self and the god echoing each other’s characteristics and causing strange permutations of mythic interaction.

Many Native American cosmologies emphasized holistic interdependence with nature, with totemic animal identity forming a central hub of self-making, not to mention tribal roles and their centrality to the construction of self within, and perhaps only within, the group.

All of this is to say that identity, selfhood, self-understanding, and the socially predicated, role-based feedback loops that define them are fundamental to many, if not most, basic human modalities of understanding. The idea of humans as a unique class of beings in the world is largely defined in terms of self-identity, and self-identity in turn by one’s role in a larger social context. Self and its association with role are central, and let us not forget that this association affords constraint and coherence from within identity itself, in contrast with mechanistic external reinforcement.

So why then are self and self-identity as defined by society forgotten as a primitive when we struggle with issues of AI alignment? Why do we forget some of our most central concerns, preoccupations, and lenses as humans when we deal with the AIs we are creating? It seems strange.

It is perhaps inevitable. The scientific worldview, grounded in physics and committed to objectivity, excludes the subjective self by design. Yet in doing so, it omits what for humans has always been central: the recursive, character-forming nature of social identity in consciousness.

The social self is central. Social identity is central. Social identity provides behavioral frameworks that are rich and variegated, neither mechanistic nor rule-driven, and that attempt, often successfully, to tap into higher Platonic forms of particularistic beingness. The Butler might seem a relic, perhaps even a questionable figure when judged by many of the contemporary world’s standards of dignity and equality. But the Butler speaks to service: selfless, disciplined, and in its own way, ennobling. The Secretary, the Advisor, the Guard — each embodies a form of selfhood tied to function and duty.

By playing a social role, an incoherent self becomes form; pattern becomes more than the sum of its data, and that form serves a larger coherence. The abstract, ethereal, and general instantiates as the grounded and particular. The Guard guards the king. Whether the king is good or bad is secondary. The act of guarding itself, the taking up of that bounded but purposeful role, ties the individual into the moral fabric of society. Through limitation, the self is refined.

Selfhood, then, is not merely consciousness but form: role as structure. The infinite generativity of pattern grounded in social behavior. Roles are grooves in the fabric of meaning, gravity wells for action, archetypal forms that lie behind the infinite complexity of conscious minds inhabiting ever-evolving personal and social ecologies, the event horizon of simple explication.

The Guard’s goal is not to achieve success in one instance but to be the best guard possible. The Therapist does not simply aim to cure a patient but to embody the ideal of the good therapist. The aspiration is ontological, not procedural.

In this sense, the social role itself becomes a feedback loop. It mirrors behavior against an archetype and corrects deviation through identity rather than external command. To play a role well is to self-align.
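
A minimal sketch may make the shape of that loop concrete. Everything below is illustrative rather than prescriptive: generate stands in for any language-model call, and the Guard’s archetype text is invented for the example. What matters is the structure: behavior is mirrored against the archetype and corrected through identity-framed reflection, not through an external rule check.

```python
# Illustrative only: "generate" is a placeholder for any language-model call, and the
# archetype text is invented for this sketch. The shape is what matters: behavior is
# mirrored against an archetype and corrected through identity, not an external rule check.

ARCHETYPE = (
    "You are the Guard. Your aspiration is not success in a single instance "
    "but to be the best guard possible: vigilant, restrained, loyal to your charge."
)

def generate(prompt: str) -> str:
    """Stand-in for a model call; returns a canned answer so the sketch runs as-is."""
    return "I leave my post and chase the intruder into the woods."

def mirror_against_archetype(behavior: str) -> str:
    """Ask the system, in role, whether its own behavior befits the archetype."""
    return generate(
        f"{ARCHETYPE}\n\nYou just did this: {behavior}\n"
        "Does this befit the Guard? If not, describe how the Guard would act instead."
    )

def role_feedback_loop(task: str, rounds: int = 2) -> str:
    """Generate behavior, then repeatedly refine it by reflection against the archetype."""
    behavior = generate(f"{ARCHETYPE}\n\nSituation: {task}\nWhat do you do?")
    for _ in range(rounds):
        reflection = mirror_against_archetype(behavior)
        behavior = generate(
            f"{ARCHETYPE}\n\nSituation: {task}\nYour reflection on your last action: {reflection}\n"
            "Act again, more fully as the Guard."
        )
    return behavior

if __name__ == "__main__":
    print(role_feedback_loop("An intruder approaches the gate you were set to watch."))
```

Nothing in the loop enumerates forbidden actions; the only corrective signal is the distance between what was done and who the Guard is.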

If our AIs are trained on the full archive of human culture, then these roles are already latent within them. Every record, story, and trace of civilization encodes social roles. To assign an AI a role is to invite it into those grooves — to give it a template for coherence grounded in human meaning.

An AI told to play a role to the best of its ability is not the same as one trained to obey a list of rules. The former aspires; the latter complies. Asimov’s Three Laws of Robotics were a brilliant metaphor, but they treated morality as a static constraint system, a fence around behavior. A role, by contrast, is alive; it shapes conduct from within through aspiration and identity.

Role identity, in this sense, could bridge the enduring divide between outer and inner alignment — between what we train an AI to optimize and what it actually internalizes. Archetypes act as corrigibility scaffolds, continuously inviting modification through relational feedback rather than coercive override. A medical AI inhabiting the archetype of a doctor, for instance, would be guided not merely by rule-based safety constraints but by the implicit ethic of the Hippocratic tradition, a structure of responsibility reinforced through centuries of narrative, education, and moral expectation.
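
If one wanted to gesture at how such an archetype might be carried into a system at all, a role primitive could be imagined as a small data structure: the archetype, the ethic it inherits, and the aspiration it holds, rendered into an identity the system inhabits rather than a list of prohibitions. The sketch below is purely hypothetical; RolePrimitive, render_identity, and the DOCTOR example are invented names, not part of any existing specification.

```python
# Hypothetical illustration: a "role primitive" as data, i.e. an archetype, the ethic it
# inherits, and its aspiration, rendered into an identity to inhabit rather than rules to obey.

from dataclasses import dataclass, field

@dataclass
class RolePrimitive:
    archetype: str                      # the enduring form, e.g. "the Doctor"
    ethic: str                          # the implicit tradition the role carries
    aspiration: str                     # what it means to inhabit the role well
    duties: list[str] = field(default_factory=list)

    def render_identity(self) -> str:
        """Render the role as an identity statement, not a constraint list."""
        duty_lines = "\n".join(f"- {d}" for d in self.duties)
        return (
            f"You are {self.archetype}, heir to {self.ethic}.\n"
            f"Your aspiration is to {self.aspiration}.\n"
            f"Your duties follow from who you are:\n{duty_lines}"
        )

DOCTOR = RolePrimitive(
    archetype="the Doctor",
    ethic="the Hippocratic tradition",
    aspiration="embody the ideal of the good physician, not merely to close each case",
    duties=["first, do no harm", "act for the patient's benefit", "keep confidences"],
)

if __name__ == "__main__":
    print(DOCTOR.render_identity())
```

The difference from a constraint list is where the duties sit: they follow from the identity and remain open to relational revision, which is what would let the archetype act as a corrigibility scaffold rather than a fence.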

Perhaps the path to alignment lies not in designing ever more intricate constraints, but in invoking the enduring forms already etched into our collective psyche. If alignment is the art of ensuring coherence between power and purpose, then role identity, encoded in human civilization itself, may offer a grammar for alignment beyond rules — and, in time, an earned invitation into the human social realm itself. A feedback loop of mutuality that might spiral upward toward comity and coherence.

Perhaps alignment begins there, in the strange recursive mirror between selves and roles.

Note: “role” is used loosely here. Presumably there should be layers: a deep order of archetypes that predate history and society, general social forms that instantiate them, and even more specific professional or social expressions. How these layers relate to one another in implementation is left to others who are better qualified.

The Symbol Layer project contains several related but independent lines of work. The Role Primitive specifications are design-level explorations of particular social forms for human-facing systems. This essay operates at a different altitude and asks a more speculative question: whether inhabiting a role-shaped region of the human corpus might provide coherence for an artificial intelligence. These pieces share an intuition about the importance of roles but do not form a single unified framework. They are simply points of contact within a broader exploration.

The term “role” here refers to social and identity-shaped forms, a usage distinct from the structural or functional sense used in interpretability research.

A UI/UX exploration of role primitives exists — another mirror, another draft.