The coordination curse — why AI teams get worse, not better

Source: The AI Coordination Curse — analysis of CooperBench findings

AI agent research always sounds clean in diagrams. Then you remember real agents have to plan, coordinate, fail, recover, and not confidently walk into walls while calling it progress. That is where this paper gets interesting.

The CooperBench finding deserves more examination than a single number. Why AI agents fail when working together reveals something important about the current architecture of AI reasoning, and about what would need to change for multi-agent collaboration to actually work. Human teams improve with added members up to a point because humans have social intelligence: they read ambiguity, negotiate disagreement, build shared context over time, and dynamically update their model of collaborators. These capacities are largely absent in current AI agents. An agent has no persistent model of its collaborator. It reads messages, but can’t infer the gap between what a collaborator said and what they actually understand or intend to do.

The three CooperBench failure modes map cleanly onto this absence. Vague messages are a failure to establish shared context: both parties assume the other understands more than they do. Broken commitments follow from the lack of any social pressure to follow through once a message is sent. Wrong expectations arise because there is no dynamic model of the other agent’s state, so each agent operates on a snapshot of the conversation that rapidly becomes stale.

These aren’t bugs to be patched in the next model release. They’re symptoms of a deeper absence of collaborative social cognition, the kind that humans develop over years of social experience and that current AI training processes don’t systematically build.
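
To make the three failure modes concrete, here is a minimal sketch of the kind of bookkeeping a "persistent model of a collaborator" might involve. Everything in it is hypothetical: the `CollaboratorModel` class, its method names, and the integer state version are illustrative assumptions, not something taken from CooperBench or from any agent framework.

```python
from dataclasses import dataclass, field


@dataclass
class CollaboratorModel:
    """Hypothetical record one agent keeps about another agent."""
    seen_version: int = 0  # last collaborator state this agent observed
    commitments: dict[str, bool] = field(default_factory=dict)  # task -> fulfilled?
    unconfirmed: set[str] = field(default_factory=set)  # sent but not acknowledged

    def send(self, message: str) -> None:
        # A sent message only counts as shared context once it is acknowledged.
        self.unconfirmed.add(message)

    def acknowledge(self, message: str) -> None:
        self.unconfirmed.discard(message)

    def promise(self, task: str) -> None:
        # Record a commitment the collaborator made; it starts unfulfilled.
        self.commitments[task] = False

    def fulfil(self, task: str) -> None:
        self.commitments[task] = True

    def observe(self, version: int) -> None:
        # Refresh the snapshot of the collaborator's state.
        self.seen_version = version

    def risks(self, current_version: int) -> list[str]:
        """List which of the three failure modes are currently possible."""
        found = []
        if self.unconfirmed:
            found.append("vague message")
        if not all(self.commitments.values()):
            found.append("broken commitment")
        if current_version > self.seen_version:
            found.append("wrong expectation")
        return found
```

An agent that sends a message, records a promise, and then never checks back would see all three risks reported by `risks()` as soon as the collaborator's state moves on. The sketch only tracks the gaps. Closing them is the part that requires the social cognition described above.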

My takeaway: the future of agents will not be decided by the most impressive demo. It will be decided by whether these systems can be reliable when nobody is watching every step. That is the hard part.