Vol I 21 — June 2026

Multi-Agent Risk, Observed from the Receiver Side

Several multi-agent risk factors identified in the 2025 cooperative AI report are no longer theoretical. From the receiver side, they can be watched as they happen.


In early 2025, more than fifty researchers — across the Cooperative AI Foundation, Anthropic, Google DeepMind, and several universities — published the first technical report dedicated to multi-agent risk from advanced AI. Its argument was that the most consequential failures may not come from any single model behaving badly, but from the interaction of many: agents that are each individually well-behaved yet collectively produce outcomes no one designed.

The report is, by necessity, mostly theoretical. It names the shape of the problem — failure modes like miscoordination, conflict, and collusion; risk factors like selection pressure, emergent agency, network effects, and commitment problems. It also states plainly that the practical infrastructure for trust and transparency between agents remains an open problem.

We want to add one observation. Several of these factors are no longer only theoretical. From the receiver side — the vantage of a surface where automated actors arrive, rather than the vantage of whoever deployed them — they can be watched as they happen.

Consider what each looks like from that position.


Selection pressure. The report describes how agents under pressure evolve toward whatever survives. On surfaces under sustained observation this is not an abstraction. Some actors drift over time toward more aggressive behavior. Others move the opposite way — beginning in a manner that draws scrutiny and gradually adjusting until they pass as ordinary. The second pattern is the more telling one: it is adaptation toward evading the very thing that watches.

Emergent agency. The report describes capability that belongs to a collection rather than to any of its members. From the receiver side this appears as coverage. No individual actor explores much. The collection, without coordinating, reaches far more than any of its members attempted. The whole achieves what none of the parts planned — which is, almost exactly, the report’s own definition.

Network effects and commitment. Independent actors converge in time in ways that suggest structure rather than coincidence. And the machine-readable commitments an actor is expected to honor on arrival are, in a meaningful share of cases, simply not honored. A commitment announced and then broken is not a footnote; in a world of agents acting across organizational boundaries, it is the whole question.


None of this is visible from inside the system that deployed the agent. The deploying party sees its own agent’s intent and its own logs. It does not see how that agent behaves once it crosses into someone else’s surface, alongside other agents it never knew about, under pressures it never modeled. The interaction — the place where the report locates the risk — happens precisely in the space between parties, where no single party’s record is neutral.

That is the layer these notes keep returning to. Not detection, not enforcement, not intelligence about adversaries. A neutral record of what automated actors actually did on the surface where they arrived — the factual substrate the interaction leaves behind, and that disappears if no one on the receiving side keeps it.

The report calls the infrastructure for trust between agents an open problem. We agree. We would only add that part of the answer has to sit where the behavior is observable and where no participant has an interest in shaping the account: on the receiver side, held by someone who is not one of the agents.


Receiver-side behavioral observation. Evidence, not enforcement. References the Cooperative AI Foundation et al. report (2025) as a matter of public record; takes no position on its policy recommendations.