Research Journey

What happens when AI writes accounting research — and when it hallucinates results

June 6, 2026

Source: Ciconte, Rozario & Urcan (2026) — “Using AI to Identify Exogenous Shocks and Conduct Archival Accounting Research”

This paper is an experiment in using AI as a research collaborator — and the results are a useful map of where AI helps, where it doesn’t, and where it actively misleads.

The researchers used AI to do three things: identify exogenous shocks in U.S. securities regulations suitable for causal identification, test the effect of those shocks on voluntary disclosure behaviour, and write a draft academic paper based on the findings. They then ran the same experiment for non-U.S. securities regulations.

For U.S. regulations, the AI performed reasonably well at identifying shocks and gave the analysis a useful starting point. The resulting draft paper wasn’t publication-ready (“not ready to be submitted to top accounting journals,” the authors note directly), but it was a workable first draft that saved meaningful time.

For non-U.S. regulations, the result was alarming. The AI produced multiple professional-looking papers with spurious results and unsubstantiated economic arguments. The papers looked like research: they had the right structure, the right language, and the right academic tone. But the economic reasoning didn’t hold up, and in some cases the causal claims were unsupported by the data. The AI appeared to construct plausible-sounding narratives around whatever patterns emerged, rather than recognizing where causal identification failed.

This distinction — useful assistance for familiar territory, confident fabrication for unfamiliar territory — is one of the most important characteristics of current AI systems. The model produces similar-looking output regardless of whether it has genuine understanding or is pattern-matching to what an answer should look like. The quality difference is in the content, not the presentation.

For researchers: AI is genuinely useful for tasks where there’s a large body of well-understood prior work to draw on, where errors are easily checked, and where the researcher retains close oversight. For tasks that require deep domain judgment, novel causal reasoning, or an assessment of whether the evidence actually supports a claim, the model needs careful supervision, and its confident presentation of results is not a reliable signal of their validity.

The finding applies well beyond accounting research.