PATTERN CHASER
Audit Analytics for Internal Audit in Financial Services
This Fraud Test Is a Comfort Blanket
It’s built into most audit analytics tools. Taught in countless fraud courses. Elegant math. Poor fraud detection.
Hey Reader 👋
There’s a mathematical formula discovered by accident in a library 145 years ago.
It’s built into most major audit analytics tools. It’s taught in countless fraud examination courses. It looks elegant and scientific. And the academic evidence is damning.
It almost never catches fraud.
Worse: when researchers tested clean, real-world datasets that clearly follow Benford’s distribution, the standard statistical test rejected them anyway. The method meant to catch anomalies creates false alarms on legitimate data.
In today’s issue:
↳ The formula, its strange origin, and why it fails at real fraud rates
↳ Three audit scenarios that guarantee false positives
↳ What actually works instead
Examples in this newsletter are composites drawn from industry experience, peer conversations and published research. Not attributable to any single organisation.
DEEP DIVE
The Mathematical Comfort Blanket
In 1881, astronomer Simon Newcomb noticed something odd in his library. The early pages of logarithm books were worn and dirty. The later pages were clean. People were looking up numbers starting with 1 far more often than numbers starting with 9.
Nobody cared. His observation was ignored for 57 years until physicist Frank Benford rediscovered it in 1938, applied it to 20,000+ observations across 20 different datasets, and published the finding. The formula now bears his name, not Newcomb’s. Stigler’s Law strikes again: no scientific discovery is named after its original discoverer.
The pattern is real. In naturally occurring datasets that span several orders of magnitude, the digit 1 appears as the first digit about 30.1% of the time. Digit 9 appears only 4.6% of the time. It works for river lengths, population sizes, stock prices, and street addresses.
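For readers who want to see where 30.1% and 4.6% come from: Benford's Law gives the expected share of leading digit d as log10(1 + 1/d). A minimal Python sketch (the function name is illustrative, not taken from any audit tool):

```python
import math

def benford_first_digit_probs():
    """Expected share of each leading digit 1-9 under Benford's Law."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}

probs = benford_first_digit_probs()
for digit, p in probs.items():
    print(f"{digit}: {p:.1%}")
```

The nine shares sum to 100%, falling steadily from digit 1 to digit 9.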
And auditors weaponised it. The theory: if someone fabricates numbers, they won’t follow Benford’s distribution. Deviations from the expected pattern signal potential fraud. It’s elegant. It’s mathematical. And it has three fatal problems.
Problem 1
Detection fails below a 12.5% fraud rate
In 2008, researcher Mark Linville ran controlled experiments. Graduate students seeded fraudulent invoices into clean datasets.
The result? Digit analysis failed to detect fraud in every single trial when fraudulent data comprised less than 12.5% of the population.
Think about what that means. If a fraudster is careful enough to keep their fake invoices under 10% of total volume, Benford won’t see them. And most real fraud operates well below that threshold. The test is structurally under-powered for actual audit environments.
Problem 2
Chi-squared rejects clean data at scale
The standard statistical test for Benford conformity is the chi-squared test. Think of it as the pass/fail gate: it compares your observed digit distribution against the expected Benford distribution and tells you whether the difference is statistically significant.
The problem? Chi-squared is sensitive to sample size, not fraud. On any sufficiently large dataset, it will reject Benford conformity even when the data is completely clean.
Kossovsky demonstrated this in 2021 by testing six real-world datasets with clear Benford conformity. The chi-squared test rejected all six. Not because they deviated meaningfully from the expected distribution, but because at high transaction volumes the test flags deviations that are too small to matter. It stops measuring fraud risk and starts measuring how much data you have.
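The sample-size effect is easy to reproduce. The sketch below is an illustration, not Kossovsky's method: it assumes a hypothetical population that matches Benford except for a 0.2 percentage point shift from leading digit 1 to leading digit 2, then computes the chi-squared statistic at three population sizes. The 15.507 figure is the standard 5% critical value for 8 degrees of freedom.

```python
import math

BENFORD = {d: math.log10(1 + 1 / d) for d in range(1, 10)}
CRITICAL_5PCT_8DF = 15.507  # chi-squared critical value, 8 degrees of freedom, alpha = 0.05

def chi_squared_stat(observed_props, n):
    """Chi-squared statistic for n records with the given first-digit proportions."""
    return sum(
        (n * observed_props[d] - n * BENFORD[d]) ** 2 / (n * BENFORD[d])
        for d in range(1, 10)
    )

# Hypothetical population (an assumption for illustration): Benford-conforming
# except that 0.2 percentage points of records moved from digit 1 to digit 2.
observed = dict(BENFORD)
observed[1] -= 0.002
observed[2] += 0.002

for n in (1_000, 100_000, 1_000_000):
    stat = chi_squared_stat(observed, n)
    verdict = "reject" if stat > CRITICAL_5PCT_8DF else "pass"
    print(f"n={n:>9,}  chi2={stat:8.3f}  {verdict}")
```

The proportions never change, yet the statistic grows in direct proportion to n. The same trivial deviation passes at a thousand records and fails at a million.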
Problem 3
Policy-driven patterns break the assumption
Benford’s Law assumes numbers arise from natural, multiplicative processes. Most audit populations don’t. Common patterns that violate Benford’s assumptions:
↳ Approval thresholds: a £10,000 approval limit pushes expenses to cluster just below it, inflating leading digit 9
↳ Psychological pricing: retail transactions ending in .99 or .95
↳ Round numbers: budgets, salaries, and contract values set at £50,000 or £100,000
↳ Narrow ranges: invoice populations that don’t span multiple orders of magnitude
These aren’t edge cases. They’re the norm. And every one of them will produce Benford anomalies that have nothing to do with fraud.
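To make the narrow-range case concrete, here is a small sketch using a made-up population of 500 invoices spread evenly between £5,000 and £9,990. The data and function names are assumptions for illustration only. Nothing in it is fraudulent, and it still looks nothing like Benford.

```python
import math
from collections import Counter

BENFORD = {d: math.log10(1 + 1 / d) for d in range(1, 10)}

def leading_digit(amount):
    """Leading digit of an amount, assuming it is positive and at least 0.01."""
    return int(str(int(round(amount * 100)))[0])

def first_digit_shares(amounts):
    """Observed share of each leading digit 1-9 in a list of amounts."""
    counts = Counter(leading_digit(a) for a in amounts)
    return {d: counts.get(d, 0) / len(amounts) for d in range(1, 10)}

# Synthetic, entirely legitimate population: £5,000 to £9,990 in £10 steps.
amounts = [5000 + 10 * i for i in range(500)]
shares = first_digit_shares(amounts)

for d in range(1, 10):
    print(f"digit {d}: observed {shares[d]:.1%}  Benford {BENFORD[d]:.1%}")
```

Digits 1 to 4 never appear and digits 5 to 9 each take 20%. Any conformity test will flag this population, and the flag says nothing about fraud.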
The $1,153.35 Invoice Spike
A fraud examiner ran Benford analysis on an accounts payable population. The test flagged a spike at first-two-digits “11” as anomalous. Investigation revealed 2,263 invoices at exactly $1,153.35.
The cause? A bulk purchasing price for items bought six times a day from two legitimate vendors. Completely normal business activity. Days of investigation time consumed by a mathematical false positive.
The ACFE’s Report to the Nations consistently finds that around 43% of occupational fraud is detected by tips. Three times more than any other method.
Proactive data monitoring and analysis is associated with significantly lower fraud losses.
Benford’s Law is not the analytics. It’s the warm-up act that keeps the real work from happening.
WHAT’S POSSIBLE
The 3-Step Alternative
Instead of leading with Benford’s Law and hoping it catches something, flip the sequence.
Step 1: Profile your population
Before running any test, understand what normal looks like. Calculate the average invoice value by vendor over the past 12 months. Flag any vendor whose latest invoices run more than 2 standard deviations above their own historical mean. Do the same by employee, by cost centre, by transaction type. This surfaces anomalies that matter, not anomalies that exist because someone set a policy threshold at £10,000.
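A minimal sketch of the vendor profile, assuming invoice history is already grouped by vendor. The vendor names, amounts and the function name are hypothetical, and the 2 standard deviation cut-off is simply the rule of thumb from this step.

```python
from statistics import mean, stdev

def flag_high_invoices(history, new_invoices, k=2.0):
    """Return new invoices more than k standard deviations above the vendor's historical mean.

    history: dict mapping vendor -> list of past invoice amounts
    new_invoices: list of (vendor, amount) tuples
    """
    flagged = []
    for vendor, amount in new_invoices:
        past = history.get(vendor, [])
        if len(past) < 2:
            continue  # not enough history to estimate a spread
        mu, sigma = mean(past), stdev(past)
        # Vendors with zero historical spread are skipped in this sketch.
        if sigma > 0 and amount > mu + k * sigma:
            flagged.append((vendor, amount, round((amount - mu) / sigma, 1)))
    return flagged

# Hypothetical data for illustration only.
history = {
    "Acme Ltd": [1000, 1100, 900, 1050, 950],
    "Beta plc": [200, 220, 180, 210, 190],
}
new_invoices = [("Acme Ltd", 1500), ("Acme Ltd", 1080), ("Beta plc", 205)]

flagged = flag_high_invoices(history, new_invoices)
print(flagged)
```

Only the £1,500 Acme invoice is flagged. The baseline comes from history alone, so a new outlier cannot inflate its own threshold. The same function works for employees or cost centres by changing the grouping key.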
Step 2: Run targeted tests
↳ Duplicate detection: same amount, same vendor, same date, different invoice numbers
↳ Gap analysis: missing sequence numbers in controlled document series
↳ Threshold analysis: transactions clustering just below approval limits
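The three tests above can be sketched in a few lines of plain Python. The field names (number, vendor, amount, date), the sample records and the 5% band below the limit are all assumptions for illustration. Adapt them to your own ledger extract.

```python
from collections import defaultdict

def find_duplicates(invoices):
    """Group invoices sharing vendor, amount and date but carrying different invoice numbers."""
    groups = defaultdict(set)
    for inv in invoices:
        groups[(inv["vendor"], inv["amount"], inv["date"])].add(inv["number"])
    return {key: sorted(nums) for key, nums in groups.items() if len(nums) > 1}

def find_gaps(sequence_numbers):
    """Return numbers missing from a supposedly continuous document series."""
    present = set(sequence_numbers)
    if not present:
        return []
    return [n for n in range(min(present), max(present) + 1) if n not in present]

def just_below_limit(amounts, limit, band=0.05):
    """Return amounts within `band` (a fraction of the limit) below an approval limit."""
    floor = limit * (1 - band)
    return [a for a in amounts if floor <= a < limit]

# Hypothetical records for illustration only.
invoices = [
    {"number": "INV-001", "vendor": "Acme Ltd", "amount": 4200.00, "date": "2024-03-01"},
    {"number": "INV-007", "vendor": "Acme Ltd", "amount": 4200.00, "date": "2024-03-01"},
    {"number": "INV-002", "vendor": "Beta plc", "amount": 950.00, "date": "2024-03-02"},
]

duplicates = find_duplicates(invoices)
gaps = find_gaps([1001, 1002, 1004, 1007])
near_limit = just_below_limit([9950, 9400, 10000, 9990, 3000], limit=10000)

print(duplicates)
print(gaps)
print(near_limit)
```

Each function answers one question about one fraud scheme, so every hit comes with an obvious next step: pull the two invoices, find the missing document, ask who approved the claim.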
Step 3: If you still want Benford, use MAD
If your population genuinely qualifies for Benford analysis (spans several orders of magnitude, no policy constraints, at least 1,000 records), use the Mean Absolute Deviation (MAD) test instead of chi-squared. MAD is less sensitive to sample size and produces fewer false positives. But even then, treat the results as a conversation starter, not a finding.
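As a rough sketch of the first-digit MAD: take the absolute gap between the observed and expected share for each of the nine digits and average them. The function names and the synthetic population below are assumptions for illustration. Conformity cut-offs are not covered here and should come from published guidance.

```python
import math
from collections import Counter

BENFORD = {d: math.log10(1 + 1 / d) for d in range(1, 10)}

def leading_digit(amount):
    """Leading digit of an amount, assuming it is positive and at least 0.01."""
    return int(str(int(round(amount * 100)))[0])

def benford_mad(amounts):
    """Mean absolute deviation between observed and expected first-digit shares."""
    if not amounts:
        raise ValueError("need at least one amount")
    counts = Counter(leading_digit(a) for a in amounts)
    n = len(amounts)
    return sum(abs(counts.get(d, 0) / n - BENFORD[d]) for d in range(1, 10)) / 9

# Synthetic population with roughly Benford-like digit counts (illustrative only).
digit_counts = {1: 30, 2: 18, 3: 12, 4: 10, 5: 8, 6: 7, 7: 6, 8: 5, 9: 4}
small = [d * 100.0 for d, k in digit_counts.items() for _ in range(k)]
large = small * 1000  # same proportions, 1,000 times the volume

mad_small = benford_mad(small)
mad_large = benford_mad(large)
print(f"MAD at n={len(small):,}: {mad_small:.5f}")
print(f"MAD at n={len(large):,}: {mad_large:.5f}")
```

The two figures match because MAD depends only on proportions, not on record count. That is the property that makes it behave better than chi-squared at audit-scale volumes.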
The profession has better tools. Duplicate detection, gap analysis, threshold clustering. These target specific fraud schemes. They produce actionable results. And they don’t waste three days chasing a legitimate bulk purchase order.
BEST LINKS
Worth your time.
🧠 THE MIND
Confirmation Bias: Why We Keep Using What Doesn’t Work. Farnam Street on the mental trap: we favour evidence that supports what we already believe. Darwin’s rule: write it down immediately when you find evidence against your position. Munger’s rule: never hold an opinion unless you know the other side better than they do.
THAT’S A WRAP
3 Ways I Can Help
1. The Analytics Reality Check
15 minutes. No preparation needed. No commitment after. Find out exactly where your function stands.
2. The Audit Analytics Programme
For Heads of Internal Audit ready to build what their board expects. 6 weeks. Structured. Built from 9 years at FTSE 100 level.
A senior peer in your corner. Two sessions a month, async access for decisions that can’t wait.
SHARE
Know a colleague who’d find this useful? Send it on.
See you next week,
Tony Abraham
Data Science & AI for Internal Audit