Data Con LA 2020
Description
The recent proliferation of predictive analytics within companies is of limited benefit unless these companies learn to measure, understand, and embrace a critical concept: error. There is no such thing as a perfect predictive model, and every tool that uses a predictive model will have error. Despite being relatively easy to implement and understand, consistent error measurement continues to be underutilized or even completely avoided. In this session we will discuss:
*Why embracing error is so valuable to companies.
*Basic ways to measure error in commonly used models and in data source systems such as CRMs and ERPs.
*Most importantly, some ways to approach company leadership with the concept of error.
Speaker
Ryan Johnson, GoGuardian, Director of Science and Analytics
8. Prediction Error
The difference between what we expected and what we observed
• The deep learning model predicted 10,000 clicks but we got
10,478
• The CNN predicted the directory contained 10 images of
hotdogs but it actually contained 0
• The recommender system estimated the user would give the
movie 5 stars but they gave it 1
9. Prediction Error?
The difference between what we expected and what we observed
• I told ordering we’d sell 1000 units of the new widget but we
only sold 670
• We expected employee satisfaction to decrease this quarter but
it went up
• The sales rep anticipated $500K in revenue from this account
but actually got $1.5M!
• There are 100 rows in the table that are missing the required ID
field
11. Malcolm Baldrige National Quality Award 1988 Recipient Motorola Inc.
KEY QUALITY INITIATIVES
To accomplish its quality and total customer satisfaction goals, Motorola
concentrates on several key operational initiatives. At the top of the list is
"Six Sigma Quality," a statistical measure of variation from a desired
result.
- NIST.gov
Process Improvement
20. We expected $500K in revenue
from this account but saw
$1.5M!
Absolute error = \(|\hat{y} - y|\) = $1,000,000
21. We expected $500K in revenue
from this account but saw
$1.5M!
Absolute percent error = \(\left|\frac{\hat{y} - y}{y}\right|\) = 0.67
22. We expected $500K in revenue
from this account but saw $1.5M!
Absolute percent error = 0.67
We expected to sell 1000 units but
sold 670
Absolute percent error = |1000 − 670| / 670 ≈ 0.49
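A minimal Python sketch of these two measures, assuming ŷ is the expected value and y is the observed value. The function names are illustrative and do not come from the talk. Note that the denominator matters: dividing by the observed 670 units gives about 0.49 for the widget example, while dividing by the expected 1000 units instead would give 0.33.

```python
def absolute_error(expected, observed):
    """|y_hat - y|: how far the expectation was from what was observed."""
    return abs(expected - observed)


def absolute_percent_error(expected, observed):
    """|y_hat - y| / |y|: absolute error relative to the observed value."""
    if observed == 0:
        raise ValueError("observed value must be non-zero")
    return abs(expected - observed) / abs(observed)


# Account revenue example: expected $500K, observed $1.5M.
revenue_error = absolute_error(500_000, 1_500_000)
revenue_ape = absolute_percent_error(500_000, 1_500_000)

# Widget example: expected 1000 units, observed 670.
widget_ape = absolute_percent_error(1000, 670)

print(revenue_error)          # 1000000
print(round(revenue_ape, 2))  # 0.67
print(round(widget_ape, 2))   # 0.49
```

Because both results are unitless ratios, the dollar-based and unit-based examples can be compared directly.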
23. Repeated Measures of Error
Central tendency and dispersion of the errors
[Figure: "Account Revenue Error", absolute percent error by month, April through September; y-axis from 0 to 38]
25. Repeated Measures of Error
Central tendency and dispersion of the errors
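Mean Absolute Error = \(\frac{1}{n}\sum_{i=1}^{n} |\hat{y}_i - y_i|\)
Mean Absolute Percent Error = \(\frac{1}{n}\sum_{i=1}^{n} \left|\frac{\hat{y}_i - y_i}{y_i}\right|\)
Median Absolute Percent Error = \(\operatorname{median}(p_1, p_2, \ldots, p_n)\) where \(p_i = \left|\frac{\hat{y}_i - y_i}{y_i}\right|\)
Standard Deviation
Interquartile Range (IQR)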
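The measures on this slide can be sketched with Python's standard `statistics` module. This is a minimal illustration, not the speaker's implementation: the function names and the sample values are hypothetical, and applying the standard deviation and IQR to the absolute percent errors is an assumption about what "dispersion of the errors" refers to.

```python
from statistics import mean, median, quantiles, stdev


def absolute_percent_errors(expected, observed):
    """Return p_i = |y_hat_i - y_i| / |y_i| for each expected/observed pair."""
    return [abs(e - o) / abs(o) for e, o in zip(expected, observed)]


def summarize_errors(expected, observed):
    """Central tendency and dispersion of repeated error measurements."""
    abs_errors = [abs(e - o) for e, o in zip(expected, observed)]
    apes = absolute_percent_errors(expected, observed)
    # quantiles(..., n=4) returns the three quartile cut points.
    q1, _, q3 = quantiles(apes, n=4)
    return {
        "mae": mean(abs_errors),     # Mean Absolute Error
        "mape": mean(apes),          # Mean Absolute Percent Error
        "mdape": median(apes),       # Median Absolute Percent Error
        "stdev_ape": stdev(apes),    # sample standard deviation of the APEs
        "iqr_ape": q3 - q1,          # interquartile range of the APEs
    }


# Hypothetical expected and observed values, for illustration only.
expected = [100, 200, 300, 400]
observed = [80, 250, 300, 500]

summary = summarize_errors(expected, observed)
for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```

The IQR depends on the quantile method; `statistics.quantiles` defaults to the "exclusive" method, so other tools may report a slightly different value for the same data.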
26. Repeated Measures of Error
Central tendency and spread
[Figure: "Account Revenue Error", absolute percent error by month, April through September; y-axis from 0 to 38]
MAPE = 30.167%
IQR = 2.25%
27. Repeated Measures of Error
Central tendency and spread
[Figure: "Account Revenue Estimates" in US dollars for October, November, and December; y-axis from $0 to $20,000]
28. Repeated Measures of Error
Central tendency and spread
[Figure: "Account Revenue Estimates" in US dollars for October, November, and December; y-axis from $0 to $20,000]
30% error range applied to each estimate
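One way to read the 30% range is as a band of plus or minus 30% around each estimate, using the typical absolute percent error from past estimates. The sketch below assumes that reading. The monthly estimates are hypothetical placeholders, not values read from the chart, and `error_band` is an illustrative name.

```python
def error_band(estimate, percent_error):
    """Return (low, high) bounds around an estimate for a given percent error."""
    if percent_error < 0:
        raise ValueError("percent_error must be non-negative")
    margin = estimate * percent_error
    return estimate - margin, estimate + margin


# Hypothetical monthly revenue estimates in US dollars (not from the slide).
estimates = {"Oct": 10_000, "Nov": 12_000, "Dec": 15_000}

for month, estimate in estimates.items():
    low, high = error_band(estimate, 0.30)
    print(f"{month}: ${estimate:,} (range ${low:,.0f} to ${high:,.0f})")
```

The band is a rough guide rather than a formal interval: the historical percent error is measured against observed values, while here it is applied to the estimates.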
29. Repeated Measures of Error
Central tendency and spread
MdAPE = 28.4%
Range = 21.33% to 46.61%
What are these products?
30. Repeated Measures of Error
Central tendency and dispersion of the errors
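Mean Absolute Error = \(\frac{1}{n}\sum_{i=1}^{n} |\hat{y}_i - y_i|\)
Mean Absolute Percent Error = \(\frac{1}{n}\sum_{i=1}^{n} \left|\frac{\hat{y}_i - y_i}{y_i}\right|\)
Median Absolute Percent Error = \(\operatorname{median}(p_1, p_2, \ldots, p_n)\) where \(p_i = \left|\frac{\hat{y}_i - y_i}{y_i}\right|\)
Standard Deviation
Interquartile Range (IQR)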
43. Opportunities for Error (Yay!) in Marketing
We want to undertake new brand awareness initiatives this
quarter.
What do we expect our brand awareness to be after these initiatives?
There are a lot of factors to consider so it is not clear.
It sounds like we don’t have as much information as we’d like to make an
estimate. That’s understandable. One of the most useful pieces of information
for making an estimate is how far off we were with our last estimate. So in
order to break this cycle I suggest we begin collecting this information. How
can we ensure the marketing department is comfortable making a really rough
initial estimate?
Example dialogue
44. Opportunities for Error (Yay!) in Marketing
Throughout the Marketing Funnel
• Impressions, leads, and other volume measures
• Click-thru, lead conversion or other rate measures
• “What is the error rate on our lead scoring algorithm?”
• Cost per lead, return on marketing spend
• “Can the vendor provide an expected cost per lead?”
• “What change do we expect in marketing return after this new
initiative?”
45. Opportunities for Error (Yay!) in Sales
Our new sales script drove up trial-to-purchase rates.
It sure did! The analysis showed a 5% increase that doesn’t appear to be due
to random chance. We anticipated a 2% increase, giving us an absolute error of
3%.
Who cares about the error!? We increased purchase rates,
right?
I agree it’s a fantastic improvement. That effort should be applauded. We want
to note the error rate so next time we take on a similar project we will be better
able to anticipate the outcome and the downstream impacts. For example, our
product team is seeing an increase in shipping delays due to the increased
demand.
Example dialogue
46. Opportunities for Error (Yay!) in Sales
Throughout the Sales Process
• Opportunities, deals, trials, contracts and other volume
measures
• Trial to purchase, Opp conversion, or other rate measures
• Average order size, average contract length, products per
order, days to close
• “By expanding the team do we think days to close will
decrease or is this necessary just to maintain?”
47. Opportunities for Error (Yay!) in HR
Our goals this quarter are to boost employee satisfaction as
well as reduce time to placement in recruiting.
Sounds great! Last survey our employee sat. was at 75 and average time to
placement stands at 65 days. What do we estimate these will be once the
initiatives are complete?
Right now the goal is just to produce change in the right
direction. We’ll evaluate how things are going as we get
further into the quarter.
I think it’s great that we want to change these measures and I fully agree we
should continuously check-in. To make sure our check-ins help us reach useful
conclusions it’s important to clearly lay out what we expect to happen.
Example dialogue
48. Opportunities for Error (Yay!) in HR
Our project this quarter focused on reducing “days to fill” error in our estimates for
engineering requisitions. We took the following steps beginning on July 3rd… I’ll turn it
over to our analyst to discuss the results.
For the past year we’ve had a mean error of 34 days when estimating “days to fill”
for these roles. After these new efforts we’ve seen a mean error of 12 days for the
10 engineers we’ve hired. Analysis thus far suggests this reduction is unlikely to be
due to chance. It’s a great improvement.
Based on this outcome, we’ve begun implementing similar changes to the hiring process for
all roles. With this improved ability to estimate we also want to revisit our hiring plan for the
next quarter. It appears some of the open headcount will come much too late to help with
busy season.
Example dialogue
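The dialogue does not say which analysis supported "unlikely to be due to chance." One common option is a permutation test on the per-hire absolute errors; the sketch below is an assumption, not the method used in the talk. The error values are hypothetical, chosen only so the group means match the 34 and 12 days in the dialogue, and the function names are illustrative.

```python
import random


def mean(values):
    return sum(values) / len(values)


def permutation_p_value(before, after, n_permutations=10_000, seed=0):
    """One-sided p-value for a drop in mean error, under random relabeling."""
    rng = random.Random(seed)
    observed_drop = mean(before) - mean(after)
    pooled = list(before) + list(after)
    n_before = len(before)
    at_least_as_large = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        drop = mean(pooled[:n_before]) - mean(pooled[n_before:])
        if drop >= observed_drop:
            at_least_as_large += 1
    # Add-one correction keeps the estimate away from exactly zero.
    return (at_least_as_large + 1) / (n_permutations + 1)


# Hypothetical absolute "days to fill" errors, in days (not real data).
before_errors = [30, 41, 28, 36, 33, 39, 35, 31, 37, 30, 34, 34]  # mean 34
after_errors = [10, 14, 9, 15, 12, 11, 13, 12, 16, 8]             # mean 12

p_value = permutation_p_value(before_errors, after_errors)
print(f"mean before: {mean(before_errors):.0f} days")
print(f"mean after: {mean(after_errors):.0f} days")
print(f"p-value: {p_value:.4f}")
```

A small p-value means that randomly shuffling the before and after labels rarely produces a drop as large as the one observed. With only 10 hires in the "after" group, the result is best treated as provisional.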
50. Error is the difference between
our expectations and
observations
Always approach with a sincere desire to improve the company
51. Error is simple to measure
Get forward momentum by avoiding complicated measures for now
52. Error is a huge opportunity to
improve… that humans aren’t
great at using
Make it easier by carefully tracking error across the company
53. Error tracking is a tool for
analysis.
Analytics itself is not a tool.
Analytics is a way of thinking.
54. T.S. Eliot
We shall not cease from exploration
And the end of all our exploration
Will be to arrive where we started
And know the place for the first time.
55. GoGuardian Science and Analytics
Thanks to these incredible explorers
Bianca Jacobs, Stephanie Dang, Mike Frantz,
Greg Johnson, Kevin Wecht, Yola Katsargyri,
Tony Woods, Harrison Mamin, Nicole Jeong,
Rosie Abe, Manoj Rawat
56. Opportunities for Error Everywhere!
Names may vary
Engineering | Product/Design | Accounting/Finance | IT/Support
Expected # of bugs/defects | Expected NPS improvements | Average debtor days | Submitted ticket volume
Expected days to complete | Expected product demand or daily active users | # of current accounts receivable | Ticket completion time
Uptime expectations | Anticipated inventory turnover | Accounts payable process cost
Costs of goods sold estimates
Time to first draft | Budget variance
Rework time | Payment error rate
A good exercise is to take this table and count how many of the proposed measures
are actually tracked at your company. Then ask how many of those measures also have
tracked error metrics.