Research Journey

GPT-4.5 was judged more human than actual humans. And most researchers still said AGI wasn't close.

June 5, 2026

Source: General Intelligence Verified — based on Nature Commentary, Feb 2026

In March 2025, blinded tests showed GPT-4.5 being judged as human 73% of the time — a rate higher than actual humans achieved in the same trials. Alan Turing asked in 1950 whether machines could display flexible cognitive competence sufficient to pass as human. The answer, as of 2025, is yes.

And yet, in surveys from the same year, 76% of AI researchers said that scaling up current approaches was unlikely to produce AGI. That gap between what the systems demonstrably do and what the field believes about them is one of the most interesting phenomena I’ve encountered in this research.

Three explanations seem credible. First, definitions keep shifting: as soon as AI masters the current criterion for “general intelligence,” we invent a harder one. Second, there’s a legitimate fear of displacement that makes researchers reluctant to concede the point, not out of dishonesty but through motivated reasoning. Third, there’s a real distinction between “passing as human” and “thinking like a human,” so succeeding at the first need not settle the second.

I think all three are probably true at once. This matters in practice because governance and safety frameworks are calibrated to some mental model of where AI currently is. If that mental model systematically lags behind reality, if we’re underestimating capability even as it’s demonstrated in plain sight, then we’re designing guardrails for the wrong vehicle.