
A Visual Primer

Confident. Fluent.
Wrong.

Why language models invent things — and why the inventions look so real. Seven stages.


It is always predicting.

A language model writes by predicting the next likely word, over and over. There is no lookup step, no database of true facts, no flag that distinguishes remembering from inventing.

WHAT IT LOOKS LIKE

looks up the answer

There is no filing cabinet. Nothing is retrieved, checked or cited from a record.

WHAT IT DOES

the most likely word

One roll of the dice after another — each word chosen because it fits, not because it is true.
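To make the prediction loop concrete, here is a minimal sketch in Python. It is a toy, not how any real model is built: the table `NEXT_WORD_PROBS`, its words and its probabilities are all made up for illustration. The point to notice is what the loop lacks: it never consults a record of which cases exist.

```python
import random

# Hypothetical toy "model": for each word, the probabilities of the word
# that follows it. All names and numbers are invented for illustration;
# a real model learns far larger distributions from text.
NEXT_WORD_PROBS = {
    "<start>": {"Whitmore": 0.5, "Smith": 0.5},
    "Whitmore": {"v": 1.0},
    "Smith": {"v": 1.0},
    "v": {"Sefton": 0.6, "Jones": 0.4},
    "Sefton": {"MBC": 1.0},
    "MBC": {"<end>": 1.0},
    "Jones": {"<end>": 1.0},
}


def generate(rng, max_words=10):
    """Write text by repeatedly sampling a likely next word."""
    words = []
    current = "<start>"
    for _ in range(max_words):
        options = NEXT_WORD_PROBS[current]
        # Pick the next word in proportion to how well it "fits".
        # Nothing here asks whether the resulting text is true.
        current = rng.choices(
            list(options), weights=list(options.values()), k=1
        )[0]
        if current == "<end>":
            break
        words.append(current)
    return " ".join(words)


if __name__ == "__main__":
    print(generate(random.Random()))
```

Every run produces a well-formed pair of party names. Whether those parties ever met in court is a question the loop has no way to ask.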

Fluency is cheap. Truth is not.

The shape of a legal citation — party names, brackets, year, court, number — is an easy pattern to learn from millions of examples. Whether a specific case exists is not a pattern; it’s a fact. So the model can produce perfectly formatted citations for cases that were never decided.

Parties: Whitmore v Sefton MBC

Year: [2018]

Court: EWCA Civ

Number: 1104

Every part of this is plausible. The case does not exist — we invented it for this page.
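The gap between form and fact can be shown with a format check. The sketch below is an assumption-laden illustration: the pattern `NEUTRAL_CITATION` is a simplified regular expression covering only a handful of court abbreviations, not a complete grammar of neutral citations. It accepts the invented citation above just as readily as a real one, because the format is all it can see.

```python
import re

# Simplified, illustrative pattern for a neutral citation:
# "[year] court number", optionally followed by a division in brackets.
# It covers only a few courts and is not an official specification.
NEUTRAL_CITATION = re.compile(
    r"\[(?P<year>\d{4})\]\s+"
    r"(?P<court>UKSC|UKHL|EWCA Civ|EWCA Crim|EWHC)\s+"
    r"(?P<number>\d+)"
    r"(?:\s+\((?P<division>[A-Za-z]+)\))?"
)


def looks_like_citation(text):
    """Return True if the text contains something shaped like a neutral citation.

    This checks shape only. It says nothing about whether the case exists.
    """
    return NEUTRAL_CITATION.search(text) is not None


if __name__ == "__main__":
    invented = "Whitmore v Sefton MBC [2018] EWCA Civ 1104"
    print(looks_like_citation(invented))
```

A check like this can catch a malformed citation. It cannot catch a fabricated one.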

Real or invented?

Six citations. Three are real, verified authorities; three we invented for this page. Can you tell which is which?


Donoghue v Stevenson [1932] AC 562

It cannot mark its own homework.

When you ask “are you sure that case is real?”, the model answers with the same pattern-matching that produced the citation. A fabricated case that looks like real cases will look real to the model too. Verification must come from outside: BAILII, the National Archives, Westlaw.

Diagram: prompt → model → citation; “check it” → same model → “yes, looks right”.

A closed loop cannot detect its own inventions.
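The difference between asking the model and asking an outside source can be sketched in a few lines. Everything here is a stand-in: `plausible` imitates the model's surface-level judgement, and `VERIFIED` is a hypothetical hand-built set representing citations a person has already confirmed against an outside source such as BAILII. Neither is a real API.

```python
# Hypothetical stand-in for citations a person has confirmed externally.
VERIFIED = {
    "Donoghue v Stevenson [1932] AC 562",
    "Ayinde v London Borough of Haringey [2025] EWHC 1383 (Admin)",
}


def plausible(citation):
    """Imitates the closed loop: judges the citation by its shape alone."""
    return " v " in citation and "[" in citation and "]" in citation


def externally_verified(citation, verified_index):
    """Imitates the outside check: is the citation in a trusted record?"""
    return citation in verified_index


if __name__ == "__main__":
    invented = "Whitmore v Sefton MBC [2018] EWCA Civ 1104"
    print(plausible(invented))                      # shape-based answer
    print(externally_verified(invented, VERIFIED))  # record-based answer
```

The two functions disagree about the invented citation, and only the second one is consulting anything beyond the text it was handed.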

The case law on fake case law.

This has stopped being hypothetical.

In Ayinde v London Borough of Haringey [2025] EWHC 1383 (Admin), the Divisional Court dealt with fabricated citations put before English courts and made clear that lawyers bear personal professional responsibility for what they file — AI use is no excuse. Courts use the Hamid jurisdiction to require lawyers to explain themselves.

The consequences are professional — embarrassment, referral to regulators, costs — not technical.

Newer models are better. None are safe.

Newer models hallucinate less and increasingly say “I’m not sure” rather than inventing; abstention has been improving with each generation. But the failure mode is structural (stage 01, “It is always predicting”): as long as the model writes by prediction, invention remains possible. Treat improvement as a lower error rate, not a guarantee.

Chart: bars for older models, current models, and zero. The last bar doesn’t exist.

Ground, then verify.

Ground

Give the model the authorities and make it quote the passage it relies on verbatim. If it cannot quote it, it may not cite it.

Verify

Check every citation against the National Archives or BAILII before it goes in a document. Then check the quotation itself: open the judgment and search the text for the exact quoted words (Ctrl+F — or grep, if you speak Unix). If the words aren’t there, the quote is invented — however real the case may be.

The two steps interlock: a verbatim quote is a searchable quote. That is why the instruction below demands one.

For every authority you cite, quote the passage you rely on verbatim and list each neutral citation at the end so I can verify it before filing. If you cannot quote it, do not cite it.
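For anyone who would rather script the quotation check than press Ctrl+F, here is a minimal sketch. It assumes you have already obtained the judgment text yourself from a trusted source and saved it as plain text; the function names `normalize` and `quote_appears` are illustrative, and the sample text in the usage example is placeholder wording, not taken from any real judgment.

```python
import re


def normalize(text):
    """Smooth over differences that are typographic rather than substantive."""
    # Treat curly and straight quotation marks as the same character.
    text = text.replace("\u2018", "'").replace("\u2019", "'")
    text = text.replace("\u201c", '"').replace("\u201d", '"')
    # Collapse line breaks and runs of spaces into single spaces.
    return re.sub(r"\s+", " ", text).strip()


def quote_appears(quote, judgment_text):
    """Return True only if the quoted words occur verbatim in the judgment."""
    needle = normalize(quote)
    if not needle:
        # An empty quote would trivially "match" any text, so reject it.
        return False
    return needle in normalize(judgment_text)


if __name__ == "__main__":
    # Placeholder text standing in for a judgment you have downloaded.
    judgment = "PLACEHOLDER TEXT. The claimant's appeal\nis dismissed."
    print(quote_appears("appeal is dismissed", judgment))
    print(quote_appears("appeal is allowed", judgment))
```

The match is deliberately strict about wording and lenient only about whitespace and quotation-mark style. A paraphrase fails the check, which is the intended behaviour: a passage that cannot be found word for word should be treated as unverified.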

Invention is a feature. Verification is yours.

The model cannot reliably warn you when it is inventing, because it cannot tell invention from recall. The safeguard is a workflow, not a setting. Next: how giving the model your documents changes everything.

