Source: CooperBench — Khatua, Zhu et al., Stanford & SAP Labs, arXiv 2601.13295
A few weeks ago I was reading through a paper from Stanford and SAP Labs, and one number stopped me: 30%. That’s how much worse AI agents perform when they work together compared to working alone. Not a little worse. Thirty percent worse.
The study is CooperBench, published in January 2026. The researchers ran over 600 collaborative coding tasks across 12 libraries and 4 programming languages. They took the best AI coding agents available, paired them up on tasks that required coordination, and measured what happened. They named the result the “curse of coordination.”
Three things kept going wrong. First, agents sent each other vague, mistimed, and inaccurate messages — essentially talking past each other. Second, even when they communicated clearly, agents didn’t follow through on what they’d committed to. Third, agents operated on completely wrong assumptions about what the other agent was doing, and neither noticed until something broke.
Compare this to human teams, where adding a teammate typically improves productivity. With AI agents, 1+1 is currently less than 1.
The practical implication for anyone building AI systems: before adding more agents, ask whether you have the coordination infrastructure to make them work together reliably. Many AI system designs treat multi-agent coordination as a solved problem. CooperBench suggests it remains one of the field’s major open problems.
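To make “coordination infrastructure” a little more concrete, here is a minimal sketch of one possible piece: a shared ledger where agents record what they have committed to, so conflicting claims and dropped commitments surface before something breaks. This is my own illustration, not something taken from CooperBench; the names `Commitment`, `CommitmentLedger`, `claim`, `complete`, and `unfulfilled` are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Commitment:
    agent: str          # who made the promise
    target: str         # what they promised to handle, e.g. a file name
    done: bool = False  # set to True once the agent reports completion


@dataclass
class CommitmentLedger:
    """Hypothetical shared record of which agent promised to do what."""

    commitments: list[Commitment] = field(default_factory=list)

    def claim(self, agent: str, target: str) -> bool:
        """Record a commitment. Returns False if another agent already holds the target."""
        for c in self.commitments:
            if c.target == target:
                # Same agent re-claiming is harmless; a different agent is a conflict.
                return c.agent == agent
        self.commitments.append(Commitment(agent, target))
        return True

    def complete(self, agent: str, target: str) -> None:
        """Mark the matching commitment as fulfilled (no-op if none matches)."""
        for c in self.commitments:
            if c.agent == agent and c.target == target:
                c.done = True

    def unfulfilled(self) -> list[Commitment]:
        """Commitments that were made but never reported as done."""
        return [c for c in self.commitments if not c.done]
```

In use, a rejected `claim` exposes a wrong assumption about who is doing what, and a non-empty `unfulfilled()` at the end of a task flags commitments that were made but never followed through on. It does nothing about vague or mistimed messages, and a real system would need far more than this.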