TDWI Munich 2019
What does it take to operationalize machine learning and AI in an enterprise setting?
Machine learning in an enterprise setting is difficult, but it seems easy. All you need is some smart people, some tools, and some data. It’s a long way from the environment needed to build ML applications to the environment to run them in an enterprise.
Most of what we know about production ML and AI comes from the world of web and digital startups and consumer services, where ML is a core part of the services they provide. These companies have fewer constraints than most enterprises do.
This session describes the nature of ML and AI applications and the overall environment they operate in, explains some important concepts about production operations, and offers some observations and advice for anyone trying to build and deploy such systems.
2. Copyright Third Nature, Inc.
PSA: The reality behind most ML/AI production case studies
TL;DR: Embedded ML and AI are harder than people realize.
Most companies will not be able to do it at the current state of IT and market maturity.
3.
Overview
We won’t:
▪ Talk much about development and building of
models, since this is about operations
▪ Talk about technology or techniques, since this is
about operations
We will:
▪ Talk a lot about concepts, observations, and practices
4.
What does analytics management care about?
There is a key stakeholder:
analytics management - the
CAO, CDO, VP of analytics, aka
“your boss” if you’re a data
scientist.
The person responsible for oversight of the team and its efforts has a perspective, and a set of problems, that spans the organization and multiple projects.
9.
Scientific
reproducibility
Can you get the same
results given the same
starting conditions?
▪ A replicated experiment may not give the same results, for many reasons
▪ Detailed statistical analysis is needed
▪ So is critical assessment of the experiments
10.
Data science reproducibility
Can you get the same
results on the same
data?
▪ Direct replication is
expected
▪ Assumption is that
same input = same
output
▪ You can have this with
an unexplainable box
▪ But there are also
confounding factors
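A minimal sketch of the "same input = same output" assumption, in Python. The function names (`fingerprint`, `train`), the toy rows, and the seeded sampling that stands in for model training are illustrative assumptions, not something from the talk. It shows two confounding factors that have to be pinned and recorded before direct replication is possible: the exact data and the random seed. Library versions and hardware are further factors that this sketch does not cover.

```python
import hashlib
import json
import random


def fingerprint(rows):
    """Stable hash of the input data, so a later run can show it used the same data."""
    payload = json.dumps(rows, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def train(rows, seed):
    """Hypothetical stand-in for training: the mean of a seeded random sample."""
    rng = random.Random(seed)          # private generator, no global state
    sample = rng.sample(rows, k=3)
    return sum(r["value"] for r in sample) / len(sample)


rows = [{"id": i, "value": float(i * i)} for i in range(10)]

# Two runs with the same data and the same seed should match exactly.
run_1 = {"data": fingerprint(rows), "seed": 42, "result": train(rows, 42)}
run_2 = {"data": fingerprint(rows), "seed": 42, "result": train(rows, 42)}
print(run_1 == run_2)

# A different seed is a confounding factor: the result is free to differ.
run_3 = {"data": fingerprint(rows), "seed": 7, "result": train(rows, 7)}
print(run_3["result"])
```

Storing the fingerprint and seed next to each result is what lets someone answer "was this the same input?" at a later date.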
11.
Interpretability and reproducibility are driven by trust,
which only matters when there is enough risk
Regulation and compliance
Material decisions
Big penalties
Complication: the costs of false positives and false negatives are usually different, so model characteristics matter here.
[Figure: grid with axes “Cost of error” and “Frequency of decision”, each running from low to high; regions labeled “Don’t care” and “Oh crap”.]
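The point about unequal error costs can be made concrete with a small calculation. The sketch below is Python, and every number in it (costs, error rates, decision volume) is hypothetical, chosen only to illustrate the arithmetic. Here `fp_rate` and `fn_rate` are assumed to be the share of all decisions that turn out to be false positives or false negatives.

```python
from dataclasses import dataclass


@dataclass
class ErrorCosts:
    false_positive: float  # cost of acting when we should not have
    false_negative: float  # cost of failing to act when we should have


def expected_error_cost(fp_rate, fn_rate, costs, decisions_per_day):
    """Expected daily cost of errors = per-decision cost x decision frequency."""
    per_decision = fp_rate * costs.false_positive + fn_rate * costs.false_negative
    return per_decision * decisions_per_day


# Hypothetical case: a missed detection costs 40x more than a false alarm.
costs = ErrorCosts(false_positive=5.0, false_negative=200.0)

# Model A makes fewer errors overall, but more of the expensive kind.
model_a = expected_error_cost(0.02, 0.010, costs, decisions_per_day=10_000)
# Model B makes more errors overall, but mostly the cheap kind.
model_b = expected_error_cost(0.05, 0.002, costs, decisions_per_day=10_000)

print(round(model_a), round(model_b))
```

With these assumed numbers the model with the higher total error rate is the cheaper one to run, which is why the cost of each error type and the frequency of the decision both belong in the assessment.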
12.
The real questions
Can I support this answer at a later date?
▪ Do I care?
• Is the risk (cost) worth worrying about?
• Is the cost of reproducibility less than the cost of the risk?
▪ Do I only need to justify it?
• Interpretability and trust may be enough
▪ Do I need to reproduce the results?
• May not need interpretability
• Need a lot of other things
13.
The real need is trust. Our trust is
based on all the elements that
are involved, not just the model.
The higher the stakes the more
you must think about all the ways
it could be wrong, because we all
want to be right.
Reliability and robustness of the
technology environment are as
important as the model.
Reproducibility is everyone’s
problem – it is an operational
concern.
15.
Starting with the process everyone does: building stuff
▪ Define the business problem
▪ Translate the problem into an analytic context
▪ Select appropriate data
▪ Learn the data
▪ Create a model set
▪ Fix problems with data
▪ Transform data
▪ Build models
▪ Assess models
▪ Deploy models
▪ Assess results
% of time spent: 70% / 30%
Source: Michael Berry, Data Miners Inc.
16.
"Always design a thing by considering it in its next larger context - a
chair in a room, a room in a house, a house in an environment, an
environment in a city plan." – Eliel Saarinen
17.
Expanding the perspective beyond the initial bit
• There are upstream parts to the development process:
collecting and managing data, both for dev and in prod.
• There are downstream parts, in deployment and then in
production operation.
• Data and artifacts are exchanged as part of the workflows
Collect → Build → Deploy → Operate
19.
The operation workflow itself is complicated
The Operate stage: Sense → Process → Interact → Learn
▪ Sense: collect inputs (AKA clean data)
▪ Process (“reasoning”): execute the model to select and post actions
▪ Interact (“acting”): perform the desired actions
▪ Learn: measure, compare, adjust
Flow between the steps: Inputs → Actions → Obs
Learning: could be human methods (manual adjustment) or machine methods (e.g. reinforcement learning), which change the sensing and processing.
20.
Feedback requires lots of data that you must record
Sense → Process → Interact → Learn
Data volumes explode with all the telemetry:
1 execution = raw data in, inputs, the action, each metric used (expected values), execution log, metric data (actuals), deltas, model changes, technical resource information
▪ Record the inputs
▪ Record the action and the expected outcome (tracking metrics and the OEC, the overall evaluation criterion)
▪ Record the execution
▪ Record the metric deltas
▪ Record the changes
Flow between the steps: Inputs → Actions → Obs
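What "1 execution" has to carry can be sketched as a record type. This is a minimal Python illustration; the class name `ExecutionRecord`, its field names, and the sample values are assumptions made for the example rather than a schema from the talk.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass
class ExecutionRecord:
    model_version: str
    raw_input: dict      # raw data in
    features: dict       # cleaned inputs the model actually saw
    action: str          # the action the model selected
    expected: dict       # expected metric values at decision time
    actual: dict = field(default_factory=dict)  # filled in when outcomes arrive
    executed_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    def deltas(self):
        """Actual minus expected, for every metric observed so far."""
        return {
            name: self.actual[name] - self.expected[name]
            for name in self.actual
            if name in self.expected
        }


record = ExecutionRecord(
    model_version="churn-v3",  # hypothetical model name
    raw_input={"customer_id": 17, "last_login_days": 40},
    features={"inactive": 1},
    action="send_offer",
    expected={"conversion": 0.25},
)

# Later, the observed outcome is attached and the delta can be computed.
record.actual["conversion"] = 0.5
log_line = json.dumps({**asdict(record), "deltas": record.deltas()})
print(log_line)
```

Even this small record is several times larger than the input that triggered it, which is the data volume problem in miniature.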
21.
Criteria for models: not just accuracy!
You must track performance in development and production relative
to the metrics that are most important, in addition to the OEC.
▪ Predictive accuracy: highly accurate model
▪ Speed: fast processing time for training and test, OR for execution
▪ Simplicity: resulting model has few parameters and is easy to monitor and explain
▪ Robustness: results are stable over time
▪ Scalability: model can handle growing data volume and/or high concurrency
▪ Interpretability: easy to understand model results
22.
ML is not like code: it can get better in production
ML in production usually starts out at the expected level
and improves over time, if you are doing it right.
Continuous improvement is the norm.
ML always goes wrong at some point. The best way to
protect against that is to constantly monitor and test
models on real production data.
e.g. sometimes your training data is not representative in
unexpected ways.
Excessive Invariance Causes Adversarial Vulnerability
https://openreview.net/forum?id=BkfbpsAcF7
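One simple way to test a model's assumptions against real production data is to check whether a feature still looks like it did in training. The Python sketch below uses a standardized mean shift for a single numeric feature. That measure, the threshold of 2.0, and the sample values are illustrative assumptions; the talk does not name a specific technique.

```python
from statistics import mean, stdev


def drift_score(training, production):
    """Shift of the production mean, in units of the training standard deviation."""
    spread = stdev(training)
    shift = abs(mean(production) - mean(training))
    if spread == 0:
        return 0.0 if shift == 0 else float("inf")
    return shift / spread


THRESHOLD = 2.0  # hypothetical alert level, to be tuned per feature

training = [10, 12, 11, 13, 9, 11, 10, 12]
production_ok = [11, 10, 12, 11]
production_shifted = [15, 16, 14, 17]

for name, batch in (("ok", production_ok), ("shifted", production_shifted)):
    score = drift_score(training, batch)
    status = "ALERT" if score > THRESHOLD else "fine"
    print(name, round(score, 2), status)
```

A check like this only catches shifts in the mean of one feature. Unrepresentative training data can show up in many other ways, so it complements monitoring of outcomes and does not replace it.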
26.
Beyond toy examples, this problem matters
The AI edge case error problems
will limit AI applicability until they
are solved (don’t hold your breath).
The uncertainty challenge means:
▪ Calculate error costs and apply them
to your model before (and after) deployment
▪ Use AI where errors are low cost, or keep a human in the loop (HitL)
27.
You need to protect against model execution problems
Sense Process Interact Learn
You have to track the actions / executions and their results,
including the OEC, in real time, to protect against failures.
This adds monitors and circuit breakers.
Diagram labels: Inputs, Actions, Obs, Monitor, Breaker
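A monitor plus circuit breaker can be sketched in a few lines of Python. The class name `MetricBreaker`, the rolling-average rule, the floor of 0.5, and the fallback action are all assumptions for illustration. A real system would choose the tracked metric and the trip condition based on its own OEC.

```python
from collections import deque


class MetricBreaker:
    """Opens (stops trusting the model) when a rolling metric average drops below a floor."""

    def __init__(self, floor, window=5):
        self.floor = floor
        self.values = deque(maxlen=window)
        self.is_open = False

    def observe(self, value):
        """Monitor step: record one metric observation; return True if the breaker is open."""
        self.values.append(value)
        window_full = len(self.values) == self.values.maxlen
        if window_full and sum(self.values) / len(self.values) < self.floor:
            self.is_open = True  # stays open until someone resets it
        return self.is_open

    def reset(self):
        """Manual reset, after a person has diagnosed the problem."""
        self.values.clear()
        self.is_open = False


def choose_action(breaker, model_action, fallback_action):
    """Breaker step: use the model's action only while the breaker is closed."""
    return fallback_action if breaker.is_open else model_action


breaker = MetricBreaker(floor=0.5, window=3)

for value in (0.9, 0.8, 0.7):   # healthy period
    breaker.observe(value)
before = choose_action(breaker, "model_offer", "default_offer")

for value in (0.2, 0.1, 0.1):   # the tracked metric collapses
    breaker.observe(value)
after = choose_action(breaker, "model_offer", "default_offer")

print(before, after)
```

The breaker deliberately does not close itself again. Falling back to a safe default and waiting for a human decision is one design choice among several.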
28.
ML Principle: CACE, Change Anything Change Everything
In embedded or
autonomous ML
everything is
connected.
Events usually
happen in real time.
ML is very sensitive
to context and input.
“So what if I
changed NumPy
in dev?”
29.
“A production ML system is never all green”
Much of the time, the ML app is a distributed system.
Distributed systems are hard.
Monolithic architectures are great if you can use them.
This is fine.
Everything is fine
30.
Not just protection - diagnostics
Sense Process Interact Learn
You need telemetry about the entire environment for monitoring,
but you also need it for diagnostics.
This means you need to think about observability.
Diagram labels: Inputs, Actions, Obs, Monitor, Breaker
31.
Ask yourself “What could go wrong?” and you’ll
probably be right.
32.
Therefore: Test in Production?
Bad, right? It goes against everything IT says about testing.
▪ This is a culture change for IT, and a hard one to make.
But production is always-on, real time. Conditions are
constantly changing. You can’t replicate the environment.
If ML is highly sensitive to conditions, and you can’t
replicate the conditions exactly, then… test in production.
▪ On real data
▪ With real network and server configurations
▪ And real concurrency
But also:
▪ Keep testing in dev/QA and CI/CD environments.
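One common way to test on real data with real concurrency, without letting the test affect outcomes, is to run a candidate model in "shadow" alongside the live one. The talk does not name this technique, so treat the Python sketch below as one possible reading. The two lambda models, the score thresholds, and the request format are hypothetical.

```python
def shadow_compare(requests, live_model, candidate_model):
    """Serve the live model; run the candidate on the same request and log disagreements."""
    served, disagreements = [], []
    for request in requests:
        live = live_model(request)
        try:
            shadow = candidate_model(request)
        except Exception as exc:  # a failing candidate must never affect production
            shadow = f"error: {exc}"
        if shadow != live:
            disagreements.append((request, live, shadow))
        served.append(live)  # only the live answer is ever acted on
    return served, disagreements


# Hypothetical models: the candidate uses a stricter approval threshold.
live_model = lambda r: "approve" if r["score"] >= 600 else "review"
candidate_model = lambda r: "approve" if r["score"] >= 650 else "review"

requests = [{"score": 580}, {"score": 620}, {"score": 700}]
served, disagreements = shadow_compare(requests, live_model, candidate_model)

print(served)
print(disagreements)
```

The disagreement log is the test result: it shows where the candidate would have behaved differently under production conditions that dev/QA cannot replicate.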
33.
ML is not like code: Monitoring in production
Unlike BI, ML has different metrics for “correct”
The metrics are relative and can change over time
You must monitor performance closely, which is
like doing BI on your AI.
This requires “observability”, because a problem may lie not in
the model but in the data or the infrastructure.
▪ Reduce the time to diagnose, rather than
emphasizing the prevention of coding errors
37.
How do you get the full picture?
What does a person do when there’s a problem?
One report from each application isn’t a sustainable answer
38.
The goal is to make decisions, not get reports
This is the “decision support” model
Analyze → Decide → Act, based on KPIs and metrics
39.
We had batch (and stream) models decades ago
Analyze → Decide → Act
e.g. segmentation, NBO queue, churn, fraud
Usually batch, ran from the DW (but probably not on it),
resulting data loaded into the DW for use
Someone oversees and acts on the information
Diagram labels: Analyze; KPIs, metrics
40.
Applying analytics within a process context
Analyze → Decide → Act
e.g. purchasing changes, upsell/cross-sell recommendations
Machines gain agency, humans lose it; “act” is curtailed
This is the human-in-the-loop model
41.
When there’s a problem, the fix is a message
The model’s results should be visible via the KPIs and metrics
Act: People call people to see what’s happening
KPIs, metrics
42.
Somebody built the models – the data scientist
More communication is required
The data scientist needs to observe and
change system behaviors
43.
Enter the black boxes – the “autonomous” model
45.
Black boxes beget gray boxes because of speed
46.
Three ML deployment categories
Decision support (aka BI) is the final arbiter of success.
▪ Autonomous
▪ Human in the loop
▪ Decision support
47.
Distributed Agency
Today’s application
model of ML is usually
embedded in a fixed
central system.
Governing a model and its application is more complex when the ML system is not controlled via a central service.
This level of autonomy is still in its infancy. Cf. Roomba
48.
Three ML deployment categories
▪ Autonomous
▪ Human in the loop
▪ Decision support
All of these independent / separate architectures are dependent on some level of shared context.
That means shared operational data, managed over time.
49.
ML and AI have a lot of requirements: no shortcuts
https://hackernoon.com/the-ai-hierarchy-of-needs-18f111fcc007
50.
“Eliminate the time spent on data prep” – Nope
The work you do on the data is what makes it valuable.
You can’t eliminate the prep work without eliminating
good models. Instead, optimize workflows where most of
the time is spent.
51.
The Lake + self-service model: individuals get and
manage their own data. Yes, but…
53.
Self-service tradeoffs
Pay now or pay later,
but you will always pay.
The question is which
payment will be less.
Self-service gives flexibility and agility, but can reduce repeatability, add duplicated effort and conflicts, and increase risk.
54.
A Data Science Approach to data
One-Pipeline-Per-Process: Redundant Effort / Cost / Complexity / etc.
[Diagram labels: “Wellhead” (shown twice); “Example use cases”]
55.
Models in production at massive scale?
Most organizations have a project-based approach. This
makes it easy to deliver new projects.
With the silo/pipeline approach:
▪ If each model takes X% of effort to maintain, how many
models can you build before you use up 100% of your time?
▪ Automation helps, a little.
▪ Efficiency helps more.
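The question in the first bullet is simple arithmetic, sketched here in Python. The team size and the per-model maintenance percentages are hypothetical, since the slide leaves X unspecified. Percentages are kept as whole numbers of one person's time to avoid rounding surprises.

```python
def max_models(team_size, pct_per_model):
    """Models a team can maintain before maintenance uses 100% of its time."""
    if pct_per_model <= 0:
        raise ValueError("pct_per_model must be positive")
    return (team_size * 100) // pct_per_model


def pct_left_for_new_work(team_size, pct_per_model, models_in_production):
    """Person-percent of capacity not yet consumed by maintenance."""
    return team_size * 100 - models_in_production * pct_per_model


# Hypothetical team of 5, each model costing 5% of one person's time.
baseline = max_models(team_size=5, pct_per_model=5)

# The same team if each model costs only 2% of one person's time.
efficient = max_models(team_size=5, pct_per_model=2)

# Capacity left for new projects once 60 models are in production.
remaining = pct_left_for_new_work(5, 5, models_in_production=60)

print(baseline, efficient, remaining)
```

The ceiling grows only linearly with staff, while it is inversely proportional to the effort each model needs, so lowering per-model effort moves the limit much further than adding people.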
The projects-as-silos
and pipeline approach
will not work when
running models in
production at the
massive scale required
for total automation
[Chart: number of models in production compared with staffing, 2019 to 2029]
56. Moving from individual to shared environments
is harder than most vendors lead you to believe
57. It takes more than common tools to create a
functional environment
58.
The enterprise focus needs to be on
repeatability - where it can be supported
59.
The nature of data science and BI differs
• In data science, the data is unknown at the start. The process
creates a data model. The same schema may not be reusable.
• The equivalent to a report is not a model. That would be the
model’s output. The equivalent to a model is more like ETL.
• Data science may require access to more than one zone.
Source data → Model extract → Models (the figure shows the same sample table at each stage):
1 Marge Inovera $150,000 Statistician
2 Anita Bath $120,000 Sewer inspector
3 Ivan Awfulitch $160,000 Dermatologist
4 Nadia Geddit $36,000 DBA
61.
Data can be maintained at multiple levels: not just raw or DW
▪ Ingredients. Goal: available. User needs a recipe in order to make use of the data.
▪ Pre-mixed. Goal: discoverable and integratable. User needs a menu to choose from the data available.
▪ Meals. Goal: usable. User needs utensils but is given a finished meal.
63.
Culture
The hard problem
is changing the
organization so
that it more
readily challenges
the rationale for
decisions, uses
data to back up
the discussion, and
creates new
explanations.
64.
Moving from predictable rule-based systems to complex
mathematical systems, and from there to systems that
exhibit stochasticity, makes the task harder, not different.
One thing worse than
a black box is a
random black box.
65.
Culture: Experimental Mindset
Sometimes you can’t build the thing
you want (meet the required OEC)
▪ ML is experimental: you should expect to fail sometimes
▪ Budget to experiment – and fail?
▪ Data: type, quality, amount
▪ Technique: theoretical limits, appropriateness
▪ Feasibility: technical, resources and time
Useful background for online experiments
https://www.researchgate.net/publication/316116834_Online_Controlled_Experiments_and_AB_Testing
https://ai.stanford.edu/~ronnyk/2007GuideControlledExperiments.pdf
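As a small illustration of what an online controlled experiment measures, the Python sketch below compares conversion counts for a control and a treatment with a two-proportion z-test. The choice of test, the function name `two_proportion_z`, and the traffic numbers are assumptions for the example. Conversion rate stands in for whatever OEC the experiment has defined.

```python
import math


def two_proportion_z(success_a, n_a, success_b, n_b):
    """Two-sided z-test for a difference between two conversion rates."""
    p_a, p_b = success_a / n_a, success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    # Two-sided p-value from the normal distribution: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical experiment: 10,000 users per arm.
z, p_value = two_proportion_z(success_a=200, n_a=10_000,   # control: 2.0%
                              success_b=260, n_b=10_000)   # treatment: 2.6%

print(round(z, 2), p_value < 0.05)
```

A result that fails to reach significance is still a result. Budgeting for experiments means budgeting for runs like this that show the new model does not beat the old one.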
66.
Analysts and engineers work from opposing directions
▪ exploration: help people ask the right questions, frame them, define measurable goals
▪ modeling: define models that run to determine answers or carry out actions
▪ integration: deliver the results / product in production, at scale
▪ applications: build data science models into applications and delivery systems
▪ infrastructure: provide the systems and practices to build and run the desired models
67. Mark Madsen is an engineering fellow
at Teradata. Prior to that he was
president of Third Nature, a research
and consulting firm focused on
analytics, data integration and data
management. Mark is an award-
winning architect, author, and CTO
whose work has been featured in
numerous industry publications. He is
an international speaker and is
involved with several conferences in
the data science and analytics industry.
Mark Madsen