1. Fraud Detection Using a Database Platform
Mike Blakley
Central Carolina Chapter of the Association of Certified Fraud Examiners
February 23, 2009
Fraud detection using a database platform EZ-R Stats, LLC
2. Session objectives
1. Understand why and how
2. Understand statistical basis for quantifying differences
3. Identify ten general tools and techniques
4. Understand how pattern detection fits in
3. Session agenda and timings
Managing the business risk of fraud (30 minutes)
Overview of statistical approach (10 min)
Discussion of databases (10 min)
Break (10 min)
Details of the approach (40 min)
Brief demo (5 min)
Open discussion and question and answer (15 min)
4. Handout (CD)
CD with articles and software
PowerPoint presentation
More info at www.ezrstats.com
5. Optional quiz
Test your understanding
Entirely optional
On home page under “events” – quiz
Results can be e-mailed
6. “Cockroach” theory of auditing
If you spot one roach….
7. “Cockroach” theory of auditing
There are probably 30
more that you don’t
see…
8. Statistics based “roach” hunting
Many frauds coulda/woulda/shoulda been detected with analytics
9. Overview
Fraud patterns detectable with
digital analysis
Basis for digital analysis
approach
Usage examples
Continuous monitoring
Business analytics
10. Objective 1
The Why and How
Three brief examples
ACFE/IIA/AICPA Guidance Paper
Practice Advisory 2320-1
Auditors “Top 10”
Process Overview
Who, What, Why, When & Where
11. Objective 1a
Example 1
Wake County Transportation Fraud
Supplier Kickback – School Bus
parts
$5 million
Jail sentences
Period of years
12. Objective 1a
Too little too late
Understaffed internal audit
Software not used
Data on multiple platforms
Transaction volumes large
13. Objective 1a
Preventable
Need structured, objective
approach
Let the data “talk to you”
Need efficient and effective
approach
14. Objective 1
Regression Analysis
Stepwise to find relationships
– Forwards
– Backwards
Intervals
– Confidence
– Prediction
15. Objective 1
Data outliers
Sometimes an “out
and out Liar”
But how do you
detect it?
16. Objective 1
Data Outliers
Plot transportation costs vs. number of buses
“Drill down” on costs
– Preventive maintenance
– Fuel
– Inspection
17. Scatter plot with prediction and
confidence intervals
18. Objective 1a
Example 2
Cost of six types of AIDS drugs
[Chart: Total Cost of AIDS Drugs. Dollar Amount by Drug Type, NDC1 through NDC6]
19. Objective 1
Medicare HIV Infusion Costs
CMS Report for 2005
South Florida - $2.2 Billion
Rest of the country combined -
$.1 Billion
20. Objective 1
Pareto Chart
[Chart: Medicare HIV Infusion Costs - 2005 ($Billions), data source: HHS CMS. Annual Medicare Costs by County, showing Pct and Cum Pct]
21. Objective 1a
Example 2
Typical Prescription Patterns
[Chart: AIDS Drugs Prescription Patterns. Dollar Value by Prescriber (Prov 1 through Prov 6), series NDC1 through NDC6]
22. Objective 1a
Example 2
Prescriptions by Dr. X
[Chart: Dr. X compared with Total Population. Dollar Amount by Drug Type (NDC1 through NDC6), series Population and Dr. X]
23. Objective 1a
Example 2
Off-label use
Serostim
– Treat wasting syndrome, side effect of AIDS, OR
– Used by body builders for recreational purposes
– One physician prescribed $11.5 million worth (12% of the entire state)
24. Objective 1a
Example 3
Revenue trends
[Chart: Overall Revenue Trend. Annual Billings by Calendar Year, 2001 to 2003, series Overall with a linear trend line]
25. Objective 1a
Example 3
Dental Billings
[Chart: Rapid Increase in Revenues. Annual Billings ($millions) by Calendar Year, 2001 to 2003, series Billings A and Billings B with a linear trend line for Billings A]
26. Objective 1b
Guidance Paper
A proposed implementation approach
“Managing the Business Risk of Fraud: A
Practical Guide” http://tinyurl.com/3ldfza
Five Principles
Fraud Detection
Coordinated Investigation Approach
27. Objective 1b
Managing the Business Risk of
Fraud: A Practical Guide
ACFE, IIA and AICPA
Exposure draft issued
11/2007, final 5/2008
Section 4 – Fraud
Detection
28. Guidance Paper
Five Sections
– Fraud Risk Governance
– Fraud Risk Assessment
– Fraud Prevention
– Fraud Detection
– Fraud Investigation and corrective action
29. Risk Governance
Fraud risk management program
Written policy – management’s expectations
regarding managing fraud risk
30. Risk Assessment
Periodic review and assessment of potential
schemes and events
Need to mitigate risk
31. Fraud Prevention
Establish prevention techniques
Mitigate possible impact on the organization
32. Fraud Detection
Establish detection techniques for fraud
“Back stop” where preventive measures fail,
or
Unmitigated risks are realized
33. Fraud Investigation and Corrective
Action
Reporting process to solicit input on fraud
Coordinated approach to investigation
Use of corrective action
34. “60 Minutes” – “World of Trouble”
2/15/09 – Scott Pelley
– Fraud Risk Governance – “one grand wink-wink, nod-nod”
– Fraud Risk Assessment - categorically false
– Fraud Prevention – “my husband passed away”
– Fraud Detection - We didn't know? Never saw one.
– Fraud Investigation and corrective action - Pick-A-Payment losses $36 billion
36. Objective 1b
Proactive Fraud Detection
Data Analysis to identify:
– Anomalies
– Trends
– Risk indicators
37. Fraud Detective Controls
Operate in the background
Not evident in everyday business
environment
These techniques usually –
– Occur in ordinary course of business
– Corroboration using external information
– Automatically communicate deficiencies
– Use results to enhance other controls
38. Examples of detective controls
Whistleblower hot-lines (DHHS and OSA
have them)
Process controls (Medicaid audits and edits)
Proactive fraud detection procedures
– Data analysis
– Continuous monitoring
– Benford’s Law
39. Objective 1b
Specific Examples Cited
Journal entries – suspicious
transactions
Identification of relationships
Benford’s Law
Continuous monitoring
40. Objective 1b
Data Analysis enhances ability to
detect fraud
Identify hidden relationships
Identify suspicious transactions
Assess effectiveness of internal
controls
Monitor fraud threats
Analyze millions of transactions
41. Continuous Monitoring of Fraud
Detection
Organization should develop ongoing
monitoring and measurements
Establish measurement criteria (and
communicate to Board)
Measurable criteria include:
42. Measurable Criteria – number of
fraud allegations
fraud investigations resolved
Employees attending annual ethics course
Whistle blower allegations
Messages supporting ethical behavior
delivered by executives
Vendors signing ethical behavior standards
43. Management ownership of each
technique implemented
Each process owner should:
– Evaluate effectiveness of technique regularly
– Adjust technique as required
– Document adjustments
– Report modifications needed for techniques which become less effective
44. Practice Advisory 2320-1
Analysis and Evaluation
International standards for the professional
practice of Internal Auditing
Analytical audit procedures
– Efficient and effective
– Useful in detecting:
Differences that are not expected
Potential errors
Potential irregularities
45. Analytical Audit Procedures
May include
– Study of relationships
– Comparison of amounts with
similar information in the
organization
– Comparison of amounts with
similar information in the
industry
46. Analytical audit procedures
Performed using monetary amounts, physical
quantities, ratios or percentages
Ratio, trend and regression analysis
Period to period comparisons
Auditors should use analytical audit
procedures in planning the engagement
47. Factors to consider
Significance of the area being audited
Assessment of risk
Adequacy of system of internal control
Availability and reliability of information
Extent to which procedures provide support
for engagement results
48. Objective 1c
Peeling the Onion
Fraud Items
Possible Error Conditions
Population as Whole
49. Objective 1d
Fraud Pattern Detection
Round Numbers
Benford’s Law
Market Basket
Stratification
Gaps
Target Group
Trend Line
Univariate
Duplicates
Holiday
Day of Week
50. Objective 1e
Digital Analysis (5W)
Who
What
Why
Where
When
51. Objective 1e
Who Uses Digital Analysis
Traditionally, IT specialists
With appropriate tools, audit
generalists (CAATs)
Growing trend of business
analytics
Essential component of
continuous monitoring
52. Objective 1e
What - Digital Analysis
Using software to:
– Classify
– Quantify
– Compare
Both numeric and non-numeric data
53. Objective 1e
How - Assessing fraud risk
Basis is quantification
Software can do the “leg work”
Statistical measures of difference
– Chi square
– Kolmogorov-Smirnov
– D-statistic
Specific approaches
54. Objective 1e
Why - Advantages
Automated process
Handle large data populations
Objective, quantifiable metrics
Can be part of continuous monitoring
Can produce useful business analytics
100% testing is possible
Quantify risk
Repeatable process
55. Objective 1e
Why - Disadvantages
Costly (time and software costs)
Learning curve
Requires specialized knowledge
56. Objective 1e
When to Use Digital Analysis
Traditional – intermittent (one off)
Trend is to use it as often as possible
Continuous monitoring
Scheduled processing
57. Objective 1e
Where Is It Applicable?
Any organization with data in digital
format, and especially if:
– Volumes are large
– Data structures are complex
– Potential for fraud exists
58. Objective 1
Objective 1 Summarized
Three brief examples
CFE Guidance Paper
“Top 10” Metrics
Process Overview
Who, What, Why, When & Where
59. Objective 1 - Summarized
1. Understand why and how
2. Understand statistical basis for quantifying differences
3. Identify ten general tools and techniques
4. Understand use of Excel
5. How pattern detection fits in
Next is the basis …
60. Objective 2
Basis for Pattern Detection
Analytical review
Isolate the “significant few”
Detection of errors
Quantified approach
61. Objective 2
Understanding the Basis
Quantified Approach
Population vs. Groups
Measuring the Difference
Stat 101 – Counts, Totals, Chi
Square and K-S
The metrics used
62. Objective 2a
Quantified Approach
Based on measurable differences
Population vs. Group
“Shotgun” technique
63. Objective 2a
Detection of Fraud Characteristics
Something is different than expected
64. Objective 2b
Fraud patterns
Common theme – “something is
different”
Groups
Group pattern is different than
overall population
65. Objective 2c
Measurement Basis
Transaction
counts
Transaction
amounts
66. Objective 2d
A few words about statistics
(the “s” word)
Detailed knowledge of statistics not
necessary
Software packages do the “number-
crunching”
Statistics used only to highlight
potential errors/frauds
Not used for quantification
67. Objective 2d
How is digital analysis done?
Comparison of group with population as a
whole
Can be based on either counts or amounts
Difference is measured
Groups can then be ranked using a selected
measure
High difference = possible error/fraud
68. Demo in Excel of the process
Based roughly on the Wake County
Transportation fraud
Illustrates how the process works, using
Excel
69. Objective 2d
Histograms
Attributes tallied and categorized into “bins”
Counts or sums of amounts
72. Objective 2d
Compute Cumulative Amount for each
[Chart: Count by Month, with Cum Pct. Count and cumulative percentage by Month for 2007]
73. Objective 2d
Are the histograms different?
Two statistical measures of
difference
Chi Squared (counts)
K-S (distribution)
Both yield a difference metric
74. Objective 2d
Chi Squared
Classic test on data in a table
Answers the question – are the
rows/columns different
Some limitations on when it can be
applied
75. Objective 2d
Chi Squared
Table of Counts
Degrees of Freedom
Chi Squared Value
P-statistic
Computationally intensive
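A minimal Python sketch of the Chi Squared value and degrees of freedom for one group's table of counts measured against the population. The function name `chi_squared` and the sample counts are illustrative assumptions, not taken from the presentation or from any EZ-R Stats software. The p-value step is left out because it needs a statistics library.

```python
def chi_squared(group_counts, population_counts):
    """Return (chi-squared value, degrees of freedom) for one group.

    Expected counts assume the group follows the population's proportions.
    Bins with no population activity are skipped (an assumption of this sketch).
    """
    if len(group_counts) != len(population_counts):
        raise ValueError("both tables must use the same bins")
    group_total = sum(group_counts)
    population_total = sum(population_counts)
    value = 0.0
    for observed, population in zip(group_counts, population_counts):
        expected = group_total * population / population_total
        if expected > 0:
            value += (observed - expected) ** 2 / expected
    return value, len(group_counts) - 1


# Illustrative counts only: three bins, evenly spread population.
value, dof = chi_squared([10, 30, 20], [100, 100, 100])
print(value, dof)
```

A larger value means the group's counts sit further from what the population proportions would predict.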
76. Objective 2d
Kolmogorov-Smirnov
Two Russian
mathematicians
Comparison of distributions
Metric is the “d-statistic”
77. Objective 2d
How is K-S test done?
Four step process
1. For each cluster element determine percentage
2. Then calculate cumulative percentage
3. Compare the differences in cumulative percentages
4. Identify the largest difference
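The four steps can be sketched in a few lines of Python. This is an illustration under stated assumptions: the function name `d_statistic` and the sample histograms are hypothetical, and both histograms are assumed to use the same bins in the same order.

```python
from itertools import accumulate


def d_statistic(group_counts, population_counts):
    """Largest gap between two cumulative percentage curves (same bins)."""
    if len(group_counts) != len(population_counts):
        raise ValueError("both histograms must use the same bins")
    group_total = sum(group_counts)
    population_total = sum(population_counts)
    # Steps 1 and 2: percentage per bin, then running (cumulative) percentage.
    group_cum = list(accumulate(c / group_total for c in group_counts))
    population_cum = list(accumulate(c / population_total for c in population_counts))
    # Steps 3 and 4: difference at each bin, then keep the largest one.
    return max(abs(g - p) for g, p in zip(group_cum, population_cum))


# Illustrative data only: the group's activity is all in the second bin.
population = [100, 100, 100, 100, 100]
group = [0, 50, 0, 0, 0]
print(d_statistic(group, population))
```

The result ranges from 0 (identical distributions) to 1, which makes it convenient for ranking groups against each other.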
78. Objective 2d - KS
Kolmogorov-Smirnov
79. Objective 2e
Classification by metrics
Stratification
Day of week
Happens on holiday
Round numbers
Variability
Benford’s Law
Trend lines
Relationships (market basket)
Gaps
Duplicates
80. Objective 2e
Auditor’s “Top 10” Metrics
1. Outliers / Variability
2. Stratification
3. Day of Week
4. Round Numbers
5. Made Up Numbers
6. Market basket
7. Trends
8. Gaps
9. Duplicates
10. Dates
81. Objective 2
Understanding the Basis
Quantified Approach
Population vs. Groups
Measuring the Difference
Stat 101 – Counts, Totals, Chi Square
and K-S
The metrics used
82. Objective 2 - Summarized
1. Understand why and how
2. Understand statistical basis for quantifying differences
3. Identify ten general tools and techniques
4. Understand examples done using Excel
5. How pattern detection fits in
Next are the metrics …
83. It’s that time!
Session Break!
84. Objective 3
The “Top 10” Metrics
Overview
Explain Each Metric
Examples of what it can detect
How to assess results
85. Objective 3
Trapping anomalies
86. Objective 3
Fraud Pattern Detection
Round Numbers
Benford’s Law
Market Basket
Stratification
Gaps
Target Group
Trend Line
Univariate
Duplicates
Holiday
Day of Week
87. 1 - Outliers
Outliers / Variability
Outliers are
amounts which
are significantly
different from the
rest of the
population
88. 1 - Outliers
Outliers / Variability
Charting (visual)
Software to analyze “z-scores”
Top and Bottom 10, 20 etc.
High and low variability (coefficient
of variation)
89. 1 - Outliers
Drill down to the group level
Basic statistics
– Minimum, maximum
and average
– Variability
Sort by statistic of interest
– Variability (coefficient
of variation)
– Maximum, etc.
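The drill-down can be sketched in Python as follows. The names `z_scores` and `rank_by_variability` and the sample rows are hypothetical, and the coefficient of variation is computed here as a ratio (standard deviation divided by mean), which may differ from the scaling a given tool reports.

```python
from collections import defaultdict
from statistics import mean, pstdev


def z_scores(amounts):
    """Distance of each amount from the mean, in standard deviations."""
    avg, sd = mean(amounts), pstdev(amounts)
    return [(a - avg) / sd if sd else 0.0 for a in amounts]


def rank_by_variability(rows):
    """rows: (group, amount) pairs. Highest coefficient of variation first."""
    groups = defaultdict(list)
    for group, amount in rows:
        groups[group].append(amount)
    stats = []
    for group, amounts in groups.items():
        avg = mean(amounts)
        coeff_var = pstdev(amounts) / avg if avg else float("inf")
        stats.append((group, len(amounts), min(amounts), max(amounts), avg, coeff_var))
    # Sort by the statistic of interest, here the coefficient of variation.
    return sorted(stats, key=lambda s: s[5], reverse=True)


# Illustrative rows only: group B swings widely, group A never varies.
rows = [("A", 10), ("A", 10), ("A", 10), ("B", 1), ("B", 19)]
ranked = rank_by_variability(rows)
for group, n, low, high, avg, coeff_var in ranked:
    print(group, n, low, high, avg, round(coeff_var, 2))
```

Changing the sort key to the maximum or the count gives the other rankings mentioned on the slide.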
90. 1 - Outliers
Example Results
Provider N Coeff Var
3478421 3,243 342.23
2356721 4,536 87.23
3546789 3,421 23.25
5463122 2,311 18.54
Two providers (3478421 and
2356721) had significantly more
variability in the amounts of their
claims than all the rest.
91. Next Metric
1. Outliers
2. Stratification
3. Day of Week
4. Round Numbers
5. Made Up Numbers
6. Market basket
7. Trends
8. Gaps
9. Duplicates
10. Dates
92. 2 - Stratification
Unusual stratification
patterns
Do you
know how
your data
looks?
93. 2 - Stratification
Stratification - How
Charting (visual)
Chi Squared
Kolmogorov-Smirnov
By groups
94. 2 – Stratification
Purpose / types of errors
Transactions out of the ordinary
“Up-coding” insurance claims
“Skewed” groupings
Based on either count or amount
95. 2 – Stratification
The process?
1. Stratify the entire population into “bins” specified by auditor
2. Same stratification on each group (e.g. vendor)
3. Compare the group tested to the population
4. Obtain measure of difference for each group
5. Sort descending on difference measure
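The five steps might look like this in Python. This is a sketch, not the presenter's implementation: the bin edges, the vendor data, and the function names are all assumptions, and the d-statistic is used as the difference measure (Chi Squared would slot in the same way).

```python
from bisect import bisect_right
from collections import defaultdict
from itertools import accumulate


def histogram(amounts, edges):
    """Count amounts per bin; edges are the auditor-specified bin boundaries."""
    counts = [0] * (len(edges) + 1)
    for amount in amounts:
        counts[bisect_right(edges, amount)] += 1
    return counts


def d_stat(group_counts, population_counts):
    """Largest gap between the two cumulative percentage curves."""
    g_total, p_total = sum(group_counts), sum(population_counts)
    g_cum = accumulate(c / g_total for c in group_counts)
    p_cum = accumulate(c / p_total for c in population_counts)
    return max(abs(g - p) for g, p in zip(g_cum, p_cum))


def rank_groups(rows, edges):
    """rows: (group, amount) pairs. Returns (group, n, d-stat), largest first."""
    by_group = defaultdict(list)
    for group, amount in rows:
        by_group[group].append(amount)
    population_hist = histogram([amount for _, amount in rows], edges)  # step 1
    scores = [
        (group, len(amounts), d_stat(histogram(amounts, edges), population_hist))
        for group, amounts in by_group.items()                           # steps 2-4
    ]
    return sorted(scores, key=lambda s: s[2], reverse=True)              # step 5


# Illustrative data only: vendor V2 bills exclusively in the top bin.
rows = [("V1", a) for a in (5, 15, 25, 35, 60, 70)] + [("V2", a) for a in (95, 96)]
ranked = rank_groups(rows, edges=[10, 50])
print(ranked)
```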
96. 2 – Stratification
Units of Service Stratified -
Example Results
Provider N Chi Sq D-stat
2735211 6,011 7,453 0.8453
4562134 8,913 5,234 0.7453
4321089 3,410 342 0.5231
4237869 2,503 298 0.4632
Two providers (2735211 and
4562134) are shown to be much
different from the overall population
(as measured by Chi Square).
97. Next Metric
1. Outliers
2. Stratification
3. Day of Week
4. Round Numbers
5. Made Up Numbers
6. Market basket
7. Trends
8. Gaps
9. Duplicates
10. Dates
98. 3 – Day of Week
Day of Week
Activity on weekdays
Activity on weekends
Peak activity mid to late week
99. 3 – Day of Week
Purpose / Type of Errors
Identify unusually high/low
activity on one or more days of
week
Dentist who only handled
Medicaid on Tuesday
Office is empty on Friday
100. How is it done?
Programmatically check entire population
Obtain counts and sums by day of week (1-7)
Prepare histogram
For each group do the same procedure
Compare the two histograms
Sort descending by metric (chi square/d-stat)
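A hedged Python sketch of the day-of-week procedure, using counts only and Chi Squared as the metric. The provider labels, dates, and function names are made up for illustration; days are numbered 1 (Monday) to 7 (Sunday) via `date.isoweekday()`, which is one possible reading of the "(1-7)" on the slide.

```python
from collections import Counter, defaultdict
from datetime import date


def weekday_counts(dates):
    """Histogram of seven counts, Monday (1) through Sunday (7)."""
    tally = Counter(d.isoweekday() for d in dates)
    return [tally.get(day, 0) for day in range(1, 8)]


def chi_square(group_counts, population_counts):
    """Group vs. what the population's day-of-week proportions would predict."""
    scale = sum(group_counts) / sum(population_counts)
    return sum(
        (g - p * scale) ** 2 / (p * scale)
        for g, p in zip(group_counts, population_counts)
        if p  # skip days with no population activity
    )


def rank_by_weekday(rows):
    """rows: (group, date) pairs. Returns (group, n, chi square), largest first."""
    by_group = defaultdict(list)
    for group, d in rows:
        by_group[group].append(d)
    population = weekday_counts([d for _, d in rows])
    scored = [
        (group, len(ds), chi_square(weekday_counts(ds), population))
        for group, ds in by_group.items()
    ]
    return sorted(scored, key=lambda s: s[2], reverse=True)


# Illustrative data only: provider T bills on Tuesdays alone,
# provider W bills once each day from Monday to Friday.
rows = [("T", date(2009, 2, day)) for day in (3, 10, 17, 24)]
rows += [("W", date(2009, 2, day)) for day in (2, 3, 4, 5, 6)]
ranked = rank_by_weekday(rows)
print(ranked)
```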
101. 3 – Day of Week
Day of Week - Example Results
Provider N Chi Sq D-stat
2735211 5,404 12,435 0.9802
4562134 5,182 7,746 0.8472
4321089 5,162 87 0.321
4237869 7,905 56 0.2189
Provider 2735211 only provided
service for Medicaid on Tuesdays.
Provider 4562134 was closed on
Thursdays and Fridays.
102. Next Metric
1. Outliers
2. Stratification
3. Day of Week
4. Round Numbers
5. Made Up Numbers
6. Market basket
7. Trends
8. Gaps
9. Duplicates
10. Dates
103. 4 – Round Numbers
Round Numbers
It’s about….
Estimates!
104. 4 – Round Numbers
Purpose / Type of Errors
Isolate estimates
Highlight account numbers in
journal entries with round
numbers
Split purchases (“under the radar”)
Which groups have the most
estimates
105. 4 – Round Numbers
Round numbers
Classify population amounts
– $1,375.23 is not round
– $5,000 is a round number – type 3 (3 zeros)
– $10,200 is a round number – type 2 (2 zeros)
Quantify expected vs. actual (d-statistic)
Generally represents an estimate
Journal entries
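The classification above can be expressed as a count of trailing zeros. The sketch below assumes amounts arrive as plain strings or numbers without thousands separators; `round_type` and `round_share` are hypothetical names, and treating one trailing zero as "type 1" is an extension of the two examples given.

```python
from decimal import Decimal


def round_type(amount):
    """Number of trailing zeros in a whole-dollar amount; 0 means not round."""
    value = Decimal(str(amount))
    if value == 0 or value != value.to_integral_value():
        return 0  # zero, or an amount with cents, is not treated as round
    digits = str(abs(int(value)))
    return len(digits) - len(digits.rstrip("0"))


def round_share(amounts, min_zeros=2):
    """Fraction of amounts that are round numbers of at least the given type."""
    hits = sum(1 for amount in amounts if round_type(amount) >= min_zeros)
    return hits / len(amounts)


print(round_type("1375.23"), round_type("5000"), round_type("10200"))
print(round_share(["5000", "10200", "1375.23", "40"]))
```

Computing `round_share` per account and comparing it with the population's share gives the expected vs. actual comparison mentioned on the slide.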
106. 4 – Round Numbers
Round Numbers in Journal
Entries - Example Results
Account N Chi Sq D-stat
2735211 4,136 54,637 0.9802
4562134 833 35,324 0.97023
4321089 8,318 768 0.321
4237869 9,549 546 0.2189
Two accounts, 2735211 and 4562134
have significantly more round number
postings than any other posting
account in the journal entries.
107. Next Metric
1. Outliers
2. Stratification
3. Day of Week
4. Round Numbers
5. Made Up Numbers
6. Market basket
7. Trends
8. Gaps
9. Duplicates
10. Dates
108. 5 – Made up numbers
Made up Numbers
Curb stoning
Imaginary numbers
Benford’s Law
109. 5 – Made Up Numbers
What can be detected
Made up numbers –
e.g. falsified inventory
counts, tax return
schedules
110. 5 – Made Up Numbers
Benford’s Law using Excel
Basic formula is “=log(1+(1/N))”
Workbook with formulae available at
http://tinyurl.com/4vmcfs
Obtain leading digits using “Left”
function, e.g. left(Cell,1)
111. 5 – Made Up Numbers
Made up numbers
Benford’s Law
Check Chi Square and d-statistic
First 1,2,3 digits
Last 1,2 digits
Second digit
Sources for more info
112. 5 – Made Up Numbers
How is it done?
Decide type of test (first 1-3 digits, last 1-2 digits, etc.)
For each group, count number of observations for each digit pattern
Prepare histogram
Based on total count, compute expected values
For the group, compute Chi Square and d-stat
Sort descending by metric (chi square/d-stat)
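A short Python sketch of the first-digit version of this test. It uses the same expected-frequency formula as the Excel slide, log10(1 + 1/d), and reports only Chi Square; the function names and sample amounts are assumptions made for illustration.

```python
import math
from collections import Counter


def benford_expected(digit):
    """Expected share of amounts whose first digit is `digit` (1-9)."""
    return math.log10(1 + 1 / digit)


def leading_digit(amount):
    """First non-zero digit of an amount, or None for zero."""
    digits = "".join(ch for ch in f"{abs(amount):.2f}" if ch.isdigit()).lstrip("0")
    return int(digits[0]) if digits else None


def benford_chi_square(amounts):
    """Chi Square of observed first-digit counts vs. Benford expected counts."""
    firsts = [d for d in map(leading_digit, amounts) if d is not None]
    counts = Counter(firsts)
    n = len(firsts)
    return sum(
        (counts.get(d, 0) - n * benford_expected(d)) ** 2 / (n * benford_expected(d))
        for d in range(1, 10)
    )


# Illustrative data only: invented amounts that all start with 7,
# compared with powers of two, which follow Benford's Law closely.
made_up = [700 + i for i in range(50)]
natural = [2 ** i for i in range(1, 60)]
print(round(benford_chi_square(made_up), 1), round(benford_chi_square(natural), 1))
```

Running the function once per group (store, vendor, employee) and sorting the results descending mirrors the last two steps above.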
113. 5 – Made Up Numbers
Invoice Amounts tested with
Benford’s law - Example Results
Store Hi Digit Chi Sq D-stat
324 79 5,234 0.9802
563 89 4,735 0.97023
432 23 476 0.321
217 74 312 0.2189
During tests of invoices by store, two
stores, 324 and 563 have significantly
more differences than any other store
as measured by Benford’s Law.
114. Next Metric
1. Outliers
2. Stratification
3. Day of Week
4. Round Numbers
5. Made Up Numbers
6. Market basket
7. Trends
8. Gaps
9. Duplicates
10. Dates
115. 6 – Market Basket
Market Basket
Medical “Ping ponging”
Pattern associations
Apriori program
References at end of slides
Apriori – Latin a (from) priori
(former)
Deduction from the known
116. 6 – Market basket
Purpose / Type of Errors
Unexpected patterns and
associations
Based on “market basket” concept
Unusual combinations of diagnosis
code on medical insurance claim
117. 6 – Market basket
Market Basket
JE Accounts
JE Approvals
Credit card fraud in Japan –
taxi and ATM
118. 6 – Market basket
How is it done?
First, identify groups, e.g. all
medical providers for a patient
Next, for each provider, assign a
unique integer value
Create a text file containing the
values
Run “apriori” analysis
119. 6 – Market basket
Apriori outputs
For each unique value, probability of
other values
If you see Dr. Jones, you will also
see Dr. Smith (80% probability)
If you see a JE to account ABC, there will also be an entry to account XYZ (30%)
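The kind of output described above can be illustrated with a simplified pairwise calculation in Python. This is not the apriori program itself, only a sketch of the "if A, then B with probability p" figure for pairs of items; the function name and the patient/provider baskets are hypothetical.

```python
from collections import Counter
from itertools import permutations


def pair_confidence(baskets):
    """For each ordered pair (a, b): share of baskets containing a that also contain b."""
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        items = set(basket)
        item_counts.update(items)
        pair_counts.update(permutations(sorted(items), 2))
    return {(a, b): n / item_counts[a] for (a, b), n in pair_counts.items()}


# Illustrative data only: each basket is the set of providers seen by one patient.
baskets = [
    {"Jones", "Smith"},
    {"Jones", "Smith"},
    {"Jones", "Smith"},
    {"Jones", "Smith"},
    {"Jones"},
    {"Smith", "Lee"},
]
confidence = pair_confidence(baskets)
print(confidence[("Jones", "Smith")])
```

Here four of the five patients who saw Jones also saw Smith, which reproduces the 80% style of statement on the slide.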
120. Next Metric
1. Outliers
2. Stratification
3. Day of Week
4. Round Numbers
5. Made Up Numbers
6. Market basket
7. Trends
8. Gaps
9. Duplicates
10. Dates
121. 7 - Trends
Trend Busters
Does the pattern make sense?
[Chart: ACME Technology. Amount by Date (2007 to 2008), series Sales and Employee Count]
122. 7 – Trends
Trend Busters
Linear regression
Sales are up, but cost of goods sold is
down
“Spikes”
123. 7 – Trends
Purpose / Type of Errors
Identify trend lines, slopes,
etc.
Correlate trends
Identify anomalies
Key punch errors where
amount is order of
magnitude
124. 7 – Trends
Linear Regression
Test relationships (e.g.
invoice amount and sales
tax)
Perform multi-variable
analysis
125. 7 – Trends
How is it done?
Estimate linear trends using “best
fit”
Measure variability (standard
errors)
Measure slope
Sort descending by slope,
variability, etc.
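A minimal least-squares sketch of the "best fit" step in Python. The function name `trend` and the sample series are assumptions; the variability figure returned here is the residual standard error of the fitted line, which is one of several things "standard errors" could mean on the slide.

```python
import math


def trend(xs, ys):
    """Ordinary least-squares line: returns (slope, intercept, residual std error)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Sum of squared residuals around the fitted line.
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    std_err = math.sqrt(sse / (n - 2)) if n > 2 else float("nan")
    return slope, intercept, std_err


# Illustrative data only: periods 1-4 with a perfectly linear amount.
slope, intercept, std_err = trend([1, 2, 3, 4], [2, 4, 6, 8])
print(slope, intercept, std_err)
```

Running `trend` once per account and sorting by slope or by standard error gives a ranking like the example results table.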
126. 7 – Trends
Trend Lines by Account - Example
Results
Account N Slope Std Err
32451 18 1.230 0.87
43517 17 1.070 4.3
32451 27 1.023 0.85
43517 32 1.010 0.36
43870 23 0.340 2.36
54630 56 -0.560 1.89
Generally the trend is gently sloping
up, but two accounts (43870 and
54630) are different.
127. Scatter plot with prediction and
confidence intervals
128. Next Metric
1. Outliers
2. Stratification
3. Day of Week
4. Round Numbers
5. Made Up Numbers
6. Market basket
7. Trends
8. Gaps
9. Duplicates
10. Dates
129. 8 - Gaps
Numeric Sequence Gaps
What’s there is
interesting, what’s not
there is critical …
130. 8 – Gaps
Purpose / Type of Errors
Missing documents (sales, cash,
etc.)
Inventory losses (missing receiving
reports)
Items that “walked off”
131. 8 – Gaps
How is it done?
Check any sequence of numbers
supposed to be complete, e.g.
Cash receipts
Sales slips
Purchase orders
132. 8 – Gaps
Gaps Using Excel
Excel – sort and check
Excel formula
Sequential numbers and dates
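The same sort-and-check idea can be sketched outside Excel. In this Python illustration the function name `find_gaps` and the check numbers are hypothetical; each result row gives the number before the gap, the number after it, and how many numbers are missing in between.

```python
def find_gaps(numbers):
    """Return (start, end, missing) for every break in a numeric sequence."""
    ordered = sorted(set(numbers))
    return [
        (before, after, after - before - 1)
        for before, after in zip(ordered, ordered[1:])
        if after - before > 1
    ]


# Illustrative check numbers only.
checks = [10788, 10789, 10791, 10792, 10793, 10796]
for start, end, missing in find_gaps(checks):
    print(start, end, missing)
```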
133. 8 – Gaps
Gap Testing - Example Results
Start End Missing
10789 10791 1
12523 12526 2
17546 17548 1
Four check numbers are missing.
134. Next Metric
1. Outliers
2. Stratification
3. Day of Week
4. Round Numbers
5. Made Up Numbers
6. Market basket
7. Trends
8. Gaps
9. Duplicates
10. Dates
135. 9 - Duplicates
Duplicates
Why is there more
than one?
Same, Same, Same, and
Same, Same, Different
136. 9 – Duplicates
Two types of (related) tests
Same items – same vendor, same invoice
number, same invoice date, same amount
Different items – same employee name,
same city, different social security number
137. 9 - Duplicates
Duplicate Payments
High payback area
“Fuzzy” logic
Overriding software
controls
138. 9 - Duplicates
Fuzzy matching with software
Levenshtein distance
Soundex
“Like” clause in SQL
Regular expression testing in SQL
Vendor/employee situations
[Image caption: Russian physicist]
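For readers unfamiliar with Levenshtein distance, here is a compact Python sketch of the standard dynamic-programming calculation: the number of single-character insertions, deletions, or substitutions needed to turn one string into another. The vendor names are invented for illustration.

```python
def levenshtein(a, b):
    """Minimum number of single-character edits to turn string a into string b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                      # delete char_a
                current[j - 1] + 1,                   # insert char_b
                previous[j - 1] + (char_a != char_b), # substitute if different
            ))
        previous = current
    return previous[-1]


# Illustrative names only: a one-letter difference between two vendor records.
print(levenshtein("Acme Supply", "Acme Suply"))
```

A small distance between two vendor or employee names that are supposed to be distinct is a candidate for follow-up.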
139. 9 - Duplicates
How is it done?
First, sort file in sequence for
testing
Compare items in consecutive
rows
Extract exceptions for follow-up
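The sort-then-compare procedure can be sketched in Python with `itertools.groupby`, which groups consecutive rows that share a key once the file is sorted. The field names (`vendor`, `invoice_date`, `amount`) and the sample invoices are assumptions for illustration only.

```python
from itertools import groupby


def possible_duplicates(invoices):
    """Sort, then group consecutive rows sharing vendor, date and amount."""
    def match_key(invoice):
        return (invoice["vendor"], invoice["invoice_date"], invoice["amount"])

    ordered = sorted(invoices, key=match_key)
    exceptions = []
    for key, rows in groupby(ordered, key=match_key):
        count = len(list(rows))
        if count > 1:
            exceptions.append((key, count))
    return exceptions


# Illustrative invoices only.
sample = [
    {"vendor": "V1", "invoice_date": "2007-06-15", "amount": "100.00"},
    {"vendor": "V2", "invoice_date": "2007-06-15", "amount": "100.00"},
    {"vendor": "V1", "invoice_date": "2007-06-15", "amount": "100.00"},
]
print(possible_duplicates(sample))
```

Dropping one field from the key (for example the invoice number or date) turns the same routine into a "same, same, different" test.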
140. 9 - Duplicates
Possible Duplicates - Example Results
Vendor Invoice Date Invoice Amount Count
10245 6/15/2007 3,544.78 4
10245 8/31/2007 2,010.37 2
17546 2/12/2007 1,500.00 2
Five invoices may be duplicates.
141. Next Metric
1. Outliers
2. Stratification
3. Day of Week
4. Round Numbers
5. Made Up Numbers
6. Market basket
7. Trends
8. Gaps
9. Duplicates
10. Dates
142. 10 - Dates
Date Checking
If we’re closed, why
is there …
Adjusting journal entry?
Receiving report?
Payment issued?
143. 10 – Dates
Holiday Date Testing
Red Flag indicator
144. 10 – Dates
Date Testing challenges
Difficult to determine
Floating holidays –
Friday, Saturday,
Sunday, Monday
145. 10 – Dates
Typical audit areas
Journal entries
Employee expense
reports
Business telephone calls
Invoices
Receiving reports
Purchase orders
146. 10 – Dates
Determination of Dates
Transactions when business is
closed
Federal Office of Personnel Management
An excellent fraud indicator in
some cases
147. 10 – Dates
Holiday Date Testing
Identifying holiday
dates:
– Error prone
– Tedious
U.S. only
148. 10 – Dates
Federal Holidays
Established by Law
Ten dates
Specific date (unless
weekend), OR
Floating holiday
149. 10 – Dates
Federal Holiday Schedule
Office of Personnel Management
Example of specific date – Independence
Day, July 4th (unless weekend)
Example of floating date – Martin Luther
King’s birthday (3rd Monday in January)
Floating – Thanksgiving – 4th Thursday in
November
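Both kinds of holiday date can be computed rather than keyed in by hand. The Python sketch below is an assumption-laden illustration: the helper names `nth_weekday` and `observed` are hypothetical, and the weekend rule coded here (Saturday holidays observed on Friday, Sunday holidays on Monday) is the usual federal convention, which the slide summarizes only as "unless weekend".

```python
from datetime import date, timedelta

MONDAY, THURSDAY = 0, 3  # datetime.date.weekday() numbering

def nth_weekday(year, month, weekday, n):
    """Date of the n-th occurrence of a weekday in the given month."""
    first = date(year, month, 1)
    offset = (weekday - first.weekday()) % 7  # days until the first such weekday
    return first + timedelta(days=offset + 7 * (n - 1))

def observed(fixed):
    """Shift a fixed-date holiday off the weekend (assumed Fri/Mon convention)."""
    if fixed.weekday() == 5:   # Saturday -> preceding Friday
        return fixed - timedelta(days=1)
    if fixed.weekday() == 6:   # Sunday -> following Monday
        return fixed + timedelta(days=1)
    return fixed

year = 2009
mlk_day = nth_weekday(year, 1, MONDAY, 3)         # 3rd Monday in January
thanksgiving = nth_weekday(year, 11, THURSDAY, 4) # 4th Thursday in November
independence = observed(date(year, 7, 4))         # July 4th unless weekend

print(mlk_day, thanksgiving, independence)
```

Generating the dates for every year in the audit period addresses the "error prone" and "tedious" concerns noted earlier.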
150. 10 – Dates
How is it done?
Programmatically count holidays for
entire population
For each group, count holidays
Compare the two histograms (group
and population)
Sort descending by metric (chi
square/d-stat)
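The four steps above can be sketched in Python. This is one possible reading, not the presenter's code: it assumes a two-bin histogram ("holiday" versus "other"), computes chi square against the counts expected from the population's proportions, and takes the d-stat as the largest gap between the two cumulative distributions. The helper `histogram_metrics` and the expense lines are hypothetical.

```python
from collections import Counter, defaultdict

def histogram_metrics(group, population):
    """Return (chi square, d-stat) for a group histogram vs. the population."""
    n_g, n_p = sum(group.values()), sum(population.values())
    chi = d = cum_g = cum_p = 0.0
    for bin_ in sorted(population):
        observed = group.get(bin_, 0)
        expected = n_g * population[bin_] / n_p  # scaled to the group's size
        chi += (observed - expected) ** 2 / expected
        cum_g += observed / n_g
        cum_p += population[bin_] / n_p
        d = max(d, abs(cum_g - cum_p))           # largest cumulative gap
    return chi, d

# Made-up expense lines: (employee number, "holiday" or "other").
lines = ([(10245, "holiday")] * 8 + [(10245, "other")] * 2
         + [(17546, "holiday")] * 1 + [(17546, "other")] * 9
         + [(24135, "other")] * 10)

population = Counter(kind for _, kind in lines)   # step 1: whole population
groups = defaultdict(Counter)
for emp, kind in lines:                           # step 2: each group
    groups[emp][kind] += 1

# Steps 3 and 4: compare each group with the population, sort descending.
ranked = sorted(((emp, *histogram_metrics(c, population)) for emp, c in groups.items()),
                key=lambda row: row[1], reverse=True)
for emp, chi, d in ranked:
    print(emp, round(chi, 3), round(d, 3))
```

Note that chi square measures deviation in either direction, so an employee with unusually few holiday transactions also scores above zero; the direction still has to be checked before follow-up.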
151. 10 – Dates
Holiday Counts - Example Results
Employee Number   N    Chi Sq   D-stat
10245             37   5,234    0.9802
32325             23   4,735    0.97023
17546             18   476      0.321
24135             34   312      0.2189
Two employees (10245 and 32325) were “off the chart” in terms of expense amounts incurred on a Federal Holiday.
152. Objective 3
The “Top 10” Metrics
Overview
Explain Each Metric
Examples of what it can detect
How to assess results
153. Objective 3 - Summarized
1. Understand why and how
2. Understand statistical basis for quantifying differences
3. Identify ten general tools and techniques
4. Understand examples done using Excel
5. How pattern detection fits in
Next – using Excel …
154. Objective 4
Use of Excel
Built-in functions
Add-ins
Macros
Database access
155. Objective 4
Excel templates
Variety of tests
– Round numbers
– Benford’s Law
– Outliers
– Etc.
156. Objective 4
Excel – Univariate statistics
Work with Ranges
=sum, =average, =stdevp
=large(Range,1), =small(Range,1)
=min, =max, =count
Tools | Data Analysis | Descriptive
Statistics
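For comparison, the same univariate statistics can be produced outside Excel. The Python sketch below uses only the standard library; the `large` helper mirrors Excel's k-th largest lookup and is hypothetical, as are the sample amounts.

```python
import statistics

def large(values, k):
    """k-th largest value, like Excel's LARGE(Range, k)."""
    return sorted(values, reverse=True)[k - 1]

# Made-up transaction amounts for illustration only.
amounts = [120.00, 135.50, 128.25, 131.75, 980.00]

summary = {
    "count": len(amounts),
    "sum": sum(amounts),
    "average": statistics.mean(amounts),
    "stdevp": statistics.pstdev(amounts),  # population standard deviation
    "min": min(amounts),
    "max": max(amounts),
    "second largest": large(amounts, 2),
}
for name, value in summary.items():
    print(f"{name}: {value}")
```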
157. Objective 4
Excel Histograms
Tools | Data Analysis | Histogram
Bin Range
Data Range
158. Objective 4
Excel Gaps testing
Sort by sequential value
=if(thiscell-lastcell<>1, thiscell-lastcell, 0)
Copy/paste special
Sort
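The same gaps test can be sketched in Python. The function name `find_gaps` and the check numbers are hypothetical. Unlike the Excel formula, which returns 0 both for a repeated value and for a normal step of 1, this version reports repeated values explicitly with a difference of 0.

```python
def find_gaps(numbers):
    """Sort the values and report every step that is not exactly 1."""
    ordered = sorted(numbers)
    return [(prev, curr, curr - prev)
            for prev, curr in zip(ordered, ordered[1:])
            if curr - prev != 1]

# Made-up check numbers for illustration only.
checks = [1005, 1001, 1002, 1006, 1006, 1009]

for prev, curr, diff in find_gaps(checks):
    print(f"{prev} -> {curr}: difference {diff}")
```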
159. Objective 4
Detecting duplicates with Excel
Sort by the values being tested
=if testing
=if(and(thiscell=lastcell, etc.))
160. Objective 4
Performing audit tests with macros
Repeatable process
Audit standardization
Learning curve
Streamlining of tests
More efficient and effective
Examples -
http://ezrstats.com/Macros/home.html
161. Objective 4
Using database audit software
Many “built-in” functions right off the shelf
with SQL
Control totals
Exception identification
“Drill down”
Quantification
June 2008 article in the EDP Audit &
Control Journal (EDPACS) “SQL as an
audit tool”
http://ezrstats.com/doc/SQL_As_An_Audit_Tool.pdf
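As a rough illustration of control totals, exception identification, and quantification in SQL, the sketch below runs two queries against an in-memory SQLite database from Python. The table name `invoices`, its columns, and the rows are assumptions; the rows were made up to mirror the "Possible Duplicates" example results shown earlier, plus one unique invoice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE invoices (vendor INTEGER, invoice_date TEXT, amount REAL)")
rows = ([(10245, "2007-06-15", 3544.78)] * 4
        + [(10245, "2007-08-31", 2010.37)] * 2
        + [(17546, "2007-02-12", 1500.00)] * 2
        + [(20001, "2007-03-01", 250.00)])
conn.executemany("INSERT INTO invoices VALUES (?, ?, ?)", rows)

# Control totals: record count and total amount, to agree back to the source.
record_count, total_amount = conn.execute(
    "SELECT COUNT(*), SUM(amount) FROM invoices").fetchone()

# Exception identification: same vendor, same date, same amount, more than once.
dupes = conn.execute("""
    SELECT vendor, invoice_date, amount, COUNT(*) AS n
    FROM invoices
    GROUP BY vendor, invoice_date, amount
    HAVING COUNT(*) > 1
    ORDER BY n DESC, vendor, invoice_date
""").fetchall()

# Quantification: every copy beyond the first in each group is a possible duplicate.
possible_duplicates = sum(n - 1 for _, _, _, n in dupes)
print(record_count, round(total_amount, 2), dupes, possible_duplicates)
conn.close()
```

The `GROUP BY ... HAVING COUNT(*) > 1` pattern is standard SQL, so the same query should carry over to other database platforms with little change.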
162. Objective 4
Use of Excel
Built-in functions
Add-ins
Macros
Database access
163. Objective 4 - Summarized
1. Understand why and how
2. Understand statistical basis for quantifying differences
3. Identify ten general tools and techniques
4. Understand examples done using Excel
5. How Pattern Detection fits in
Next – Fit …
164. Objective 5
How Pattern Detection Fits In
Business Analytics
Fraud Pattern Detection
Continuous monitoring
165. Objective 5
Where does Fraud Pattern Detection fit in?
Right in the middle
Business Analytics
Fraud Pattern Detection
Continuous fraud pattern
detection
Continuous Monitoring
166. Objective 5
Business Analytics
Fraud analytics -> business
analytics
Business analytics -> fraud
analytics
167. Objective 5
Role in Continuous Monitoring (CM)
Fraud analytics can feed CM
Continuous fraud pattern detection
Use output from CM to tune fraud
pattern detection
168. Objective 5 - Summarized
1. Understand why and how
2. Understand statistical basis for quantifying differences
3. Identify ten general tools and techniques
4. Understand use of Excel
5. How pattern detection fits in
Next: Links …
169. Links for more information
Kolmogorov-Smirnov
http://tinyurl.com/y49sec
Benford’s Law http://tinyurl.com/3qapzu
Chi Square tests http://tinyurl.com/43nkdh
Continuous monitoring
http://tinyurl.com/3pltdl
170. Market Basket
Apriori testing for “ping ponging”
Temple University
http://tinyurl.com/5vax7r
Apriori program (“open source”)
http://tinyurl.com/5qehd5
Article – “Medical ping ponging”
http://tinyurl.com/5pzbh4
171. Excel macros used in auditing
Excel as audit software
http://tinyurl.com/6h3ye7
Selected macros -
http://ezrstats.com/Macros/home.html
Spreadsheets forever -
http://tinyurl.com/5ppl7t