Anthropic built their most powerful AI — then decided not to release it

Source: Claude Mythos Preview System Card — Anthropic, April 2026

The part that bothers me is how calm the language sounds. It makes the risk feel distant, when the real problem is already sitting inside today’s systems and decisions.

In April 2026, Anthropic published a System Card for a model called Claude Mythos Preview. If you read it carefully, there’s a sentence buried in the abstract that’s easy to miss: “Claude Mythos Preview’s large increase in capabilities has led us to decide not to make it generally available.” Let that land for a moment. A frontier AI lab built their most capable model to date, evaluated it extensively, and chose not to release it to the public. Instead, they’re using it exclusively in a defensive cybersecurity programme with a small set of partners. The System Card is remarkable reading — not because it reveals catastrophic risks, but because of its honesty about the tradeoffs. The evaluations cover chemical and biological risk uplift, autonomy thresholds the point at which a model could meaningfully assist in AI R&D without human oversight, alignment properties, and something they call “model welfare” — assessments of the model’s functional emotional states. What I find most significant is what this decision signals about where we are in the capability-safety dynamic. For years, the concern in the AI safety community was that competitive pressure would prevent any lab from making a unilateral “slow down” decision.

In plain English, that is why the result matters beyond the chart. It changes where people should look, what they should question, and which comfortable assumption probably needs to be retired.

For leaders, the lesson is simple: if the risk timeline changes, the attention timeline has to change too. Waiting until everyone agrees it is urgent is usually how organisations arrive late.