Operationalizing
Machine Learning
in the Enterprise
TDWI / BARC
Munich
June 22, 2019
Mark Madsen
Copyright Third Nature, Inc.
PSA: The reality behind most ML/AI production case studies
TL;DR Embedded ML and AI is
harder than people realize.
Most companies will not be
able to do it at the current state
of IT and market maturity.
Overview
We won’t:
▪ Talk much about development and building of
models, since this is about operations
▪ Talk about technology or techniques, since this is
about operations
We will:
▪ Talk a lot about concepts, observations, and practices
What does analytics management care about?
There is a key stakeholder:
analytics management - the
CAO, CDO, VP of analytics, aka
“your boss” if you’re a data
scientist.
The perspective and problems
of the person responsible for
overseeing the team and its
efforts span the organization
and multiple projects.
Explainability (or interpretability)
Job #1 - Repeatability
Job #2 - Operational predictability
Job #3 - Reproducibility
Scientific
reproducibility
Can you get the same
results given the same
starting conditions?
▪ Experimental replication
may not be the same for
many reasons
▪ Detailed statistical
analysis is needed
▪ And critical assessment
of experiments
Data science reproducibility
Can you get the same
results on the same
data?
▪ Direct replication is
expected
▪ Assumption is that
same input = same
output
▪ You can have this with
an unexplainable box
▪ But there are also
confounding factors
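The "same input = same output" assumption can be checked mechanically. A minimal sketch, assuming a toy stand-in for training: `fingerprint` and `fit_toy_model` are hypothetical names used only for illustration, and the random seed stands in for one confounding factor among many.

```python
import hashlib
import json
import random


def fingerprint(rows):
    """Return a stable SHA-256 digest of the input rows (order-sensitive)."""
    payload = json.dumps(rows, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def fit_toy_model(rows, seed):
    """Hypothetical stand-in for training: mean of a seeded random subsample."""
    rng = random.Random(seed)
    sample = rng.sample(rows, k=len(rows) // 2)
    return sum(sample) / len(sample)


data = [3.0, 1.5, 4.0, 1.0, 5.5, 9.0, 2.5, 6.0]

# Two runs on the same data with the same seed: direct replication is expected.
run_a = {"data": fingerprint(data), "seed": 42, "result": fit_toy_model(data, 42)}
run_b = {"data": fingerprint(data), "seed": 42, "result": fit_toy_model(data, 42)}

# Same data, different seed: the data matches but the run is not the same run.
run_c = {"data": fingerprint(data), "seed": 7, "result": fit_toy_model(data, 7)}

print("a == b:", run_a == run_b)
print("same data in a and c:", run_a["data"] == run_c["data"])
print("same seed in a and c:", run_a["seed"] == run_c["seed"])
```

Recording the data digest and the seed next to each result is what makes it possible to tell later whether two runs were supposed to match.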
Interpretability and reproducibility are driven by trust,
which only matters when there is enough risk
Regulation and compliance
Material decisions
Big penalties
Complication: the costs of false
positives and false negatives
are usually different, so model
characteristics matter here.
[Figure: matrix of cost of error (low to high) against frequency of decision (low to high), with regions labeled "Don’t care" and "Oh crap"]
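Unequal costs for false positives and false negatives change which operating point is best. A minimal sketch, assuming a scoring model and a handful of made-up labeled examples: the scores, labels, and cost figures are illustrative assumptions, not data from the talk.

```python
def total_cost(scores, labels, threshold, cost_fp, cost_fn):
    """Sum the cost of false positives and false negatives at a threshold."""
    cost = 0.0
    for score, label in zip(scores, labels):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and label == 0:
            cost += cost_fp  # false positive
        elif predicted == 0 and label == 1:
            cost += cost_fn  # false negative
    return cost


def best_threshold(scores, labels, candidates, cost_fp, cost_fn):
    """Pick the candidate threshold with the lowest total error cost."""
    return min(
        candidates,
        key=lambda t: total_cost(scores, labels, t, cost_fp, cost_fn),
    )


# Made-up scores and ground-truth labels, for illustration only.
scores = [0.1, 0.4, 0.35, 0.8, 0.65, 0.9]
labels = [0, 0, 1, 1, 0, 1]
candidates = [0.3, 0.5, 0.7]

# Missing a positive is expensive: a low threshold wins.
fn_heavy = best_threshold(scores, labels, candidates, cost_fp=1, cost_fn=10)
# A false alarm is expensive: a high threshold wins.
fp_heavy = best_threshold(scores, labels, candidates, cost_fp=10, cost_fn=1)
print(fn_heavy, fp_heavy)
```

The same model gives two different "best" configurations depending only on the cost assumptions, which is why the cost of error belongs in the evaluation.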
The real questions
Can I support this answer at a later date?
▪ Do I care?
• Is the risk (cost) worth worrying about?
• Is the cost of reproducibility less than the risk and cost?
▪ Do I only need to justify it?
• Interpretability and trust may be enough
▪ Do I need to reproduce the results?
• May not need interpretability
• Need a lot of other things
The real need is trust. Our trust is
based on all the elements that
are involved, not just the model.
The higher the stakes the more
you must think about all the ways
it could be wrong, because we all
want to be right.
Reliability and robustness of the
technology environment are as
important as the model.
Reproducibility is everyone’s
problem – it is an operational
concern.
The technology components are relevant to reproducibility
Starting with the process everyone does: building stuff
▪ Define the business problem
▪ Translate the problem into an analytic context
▪ Select appropriate data
▪ Learn the data
▪ Create a model set
▪ Fix problems with data
▪ Transform data
▪ Build models
▪ Assess models
▪ Deploy models
▪ Assess results
% of time spent: 70%, 30%
Source: Michael Berry, Data Miners Inc.
"Always design a thing by considering it in its next larger context - a
chair in a room, a room in a house, a house in an environment, an
environment in a city plan." – Eliel Saarinen
Expanding the perspective beyond the initial bit
• There are upstream parts to the development process:
collecting and managing data, both for dev and in prod.
• There are downstream parts, in deployment and then in
production operation.
• Data and artifacts are exchanged as part of the workflows
Collect → Build → Deploy → Operate
Deploying autonomous ML is one of the biggest challenges
The operation workflow itself is complicated
Sense → Process → Interact → Learn (the Operate stage)
▪ Sense: collect inputs, AKA clean data
▪ Process (“reasoning”): execute model to select and post actions
▪ Interact (“acting”): perform the desired actions
▪ Learn: measure, compare, adjust
Exchanged between stages: Inputs, Actions, Obs
Learning: could be human methods (manual adjustment) or
machine methods (e.g. reinforcement learning), which change the
sensing and processing.
Feedback requires lots of data that you must record
Sense Process Interact Learn
Data volumes explode with all the telemetry:
1 execution = raw data in, inputs, the action, each metric used
(expected values), execution log, metric data (actuals), deltas,
model changes, technical resource information
▪ Record the inputs
▪ Record the action and the expected outcome (tracking metrics and OEC, the overall evaluation criterion)
▪ Record the execution
▪ Record the metric deltas
▪ Record the changes
Inputs, Actions, Obs
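What one execution has to record can be sketched as a single telemetry record. This is a minimal illustration under assumptions: the `ExecutionRecord` class and all of its field names are hypothetical, chosen to mirror the list above rather than taken from any particular product.

```python
import json
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone


@dataclass
class ExecutionRecord:
    """One model execution: what went in, what was done, what was expected."""
    model_version: str
    raw_input: dict   # raw data as received
    features: dict    # cleaned inputs actually fed to the model
    action: str       # the action the model selected
    expected: dict    # expected metric values, including the OEC
    actual: dict = field(default_factory=dict)  # filled in once observed
    logged_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    def deltas(self):
        """Actual minus expected, for every metric observed so far."""
        return {
            name: self.actual[name] - value
            for name, value in self.expected.items()
            if name in self.actual
        }


record = ExecutionRecord(
    model_version="v1",
    raw_input={"customer": "c-1", "basket": ["a", "b"]},
    features={"basket_size": 2},
    action="offer_discount",
    expected={"oec": 0.5, "latency_ms": 20},
)
record.actual = {"oec": 0.25}  # latency not observed yet

line = json.dumps(asdict(record))  # one log line per execution
print(record.deltas())
```

Even this small record carries seven fields per execution, which is where the data volume comes from.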
Criteria for models: not just accuracy!
You must track performance in development and production relative
to the metrics that are most important, in addition to the OEC.
▪ Predictive accuracy: highly accurate model
▪ Speed: fast processing time for training and test, OR for execution
▪ Simplicity: resulting model has few parameters and is easy to monitor and explain
▪ Robustness: results are stable over time
▪ Scalability: model can handle growing data volume and/or high concurrency
▪ Interpretability: easy to understand model results
ML is not like code: it can get better in production
ML in production usually starts out at the expected level
and improves over time, if you are doing it right.
Continuous improvement is the norm.
ML always goes wrong at some point. The best way to
protect against that is to constantly monitor and test
models on real production data.
e.g. sometimes your training data is not representative in
unexpected ways.
Excessive Invariance Causes Adversarial Vulnerability
https://openreview.net/forum?id=BkfbpsAcF7
A (contrived) example: Detect dogs
Dog or Not Dog?
Dog or Not Dog: 100%! Um, why?
Beyond toy examples, this problem matters
The AI edge case error problems
will limit AI applicability until they
are solved (don’t hold your breath).
The uncertainty challenge means:
▪ Calculate error costs and apply them
to your model before (and after)
▪ Use AI where errors are low cost, or keep a human in the loop (HitL)
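One way to apply error costs is to route each decision by its expected cost. A minimal sketch, assuming an error probability is available per decision and that a human review has a known cost: `route` and its parameters are hypothetical names, and the rule itself is an illustrative simplification.

```python
def route(p_error, cost_of_error, review_cost):
    """Automate when the expected error cost is below the cost of a human review."""
    expected_error_cost = p_error * cost_of_error
    return "auto" if expected_error_cost < review_cost else "human"


# Same model, same 5% error rate, very different stakes.
low_stakes = route(p_error=0.05, cost_of_error=10, review_cost=2)
high_stakes = route(p_error=0.05, cost_of_error=1000, review_cost=2)
print(low_stakes, high_stakes)
```

The error rate is identical in both calls; only the cost of being wrong moves the decision from automation to a human in the loop.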
You need to protect against model execution problems
Sense Process Interact Learn
You have to track the actions / executions and their results,
including the OEC, in real time, to protect against failures.
This adds monitors and circuit breakers.
Diagram labels: Inputs, Actions, Obs, Monitor, Breaker
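A monitor plus circuit breaker can be sketched in a few lines. This is an illustration under assumptions, not a production design: the `CircuitBreaker` class, the `act` helper, the sanity bounds, and the fallback rule are all hypothetical.

```python
from collections import deque


class CircuitBreaker:
    """Trips open after too many failed checks within a sliding window."""

    def __init__(self, window=5, max_failures=3):
        self.results = deque(maxlen=window)
        self.max_failures = max_failures
        self.open = False

    def record(self, ok):
        self.results.append(ok)
        if self.results.count(False) >= self.max_failures:
            self.open = True  # stays open until someone resets it

    def allow(self):
        return not self.open


def act(model, fallback, breaker, x, lower, upper):
    """Run the model unless the breaker is open; monitor the output range."""
    if not breaker.allow():
        return fallback(x)
    y = model(x)
    ok = lower <= y <= upper  # the monitor: a simple sanity bound
    breaker.record(ok)
    return y if ok else fallback(x)


breaker = CircuitBreaker(window=5, max_failures=3)
misbehaving_model = lambda x: x * 100  # produces out-of-range actions
safe_default = lambda x: 0

outputs = [
    act(misbehaving_model, safe_default, breaker, x, lower=0, upper=50)
    for x in [1, 2, 3, 4]
]
print(outputs, breaker.open)
```

After three out-of-range results the breaker opens and the model is no longer called at all, so a failing model degrades to the safe default instead of continuing to act.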
ML Principle: CACE, Changing Anything Changes Everything
In embedded or
autonomous ML
everything is
connected.
Events usually
happen in real time.
ML is very sensitive
to context and input.
“So what if I
changed NumPy
in dev?”
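The NumPy question has a practical counterpart: record the environment next to every model artifact so a change is at least visible. A minimal sketch using the standard library's `importlib.metadata`; the helper names and the package list are assumptions for illustration.

```python
import hashlib
import json
import platform
import sys
from importlib import metadata


def environment_snapshot(packages):
    """Collect interpreter, platform, and installed versions of named packages."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # not installed in this environment
    return {
        "python": platform.python_version(),
        "platform": sys.platform,
        "packages": versions,
    }


def snapshot_digest(snapshot):
    """Stable digest of a snapshot, suitable for storing with a model."""
    payload = json.dumps(snapshot, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


dev = environment_snapshot(["numpy", "package-that-is-not-installed"])

# Simulate "I changed NumPy in dev" by altering one recorded version.
changed = dict(dev, packages={**dev["packages"], "numpy": "0.0.0-changed"})

print(snapshot_digest(dev) == snapshot_digest(changed))
```

Comparing the digest stored at training time with the digest at serving time turns a silent dependency change into a detectable mismatch.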
“A production ML system is never all green”
Much of the time, the ML app is a distributed system.
Distributed systems are hard.
Monolithic architectures are great if you can use them.
This is fine.
Everything is fine
Not just protection - diagnostics
Sense Process Interact Learn
You need telemetry about the entire environment for monitoring,
but you also need it for diagnostics.
This means you need to think about observability.
Diagram labels: Inputs, Actions, Obs, Monitor, Breaker
Ask yourself “What could go wrong?” and you’ll
probably be right.
Therefore: Test in Production?
Bad, right? It goes against everything IT says about testing
▪ This is a culture change for IT, and a hard one to make.
But production is always-on, real time. Conditions are
constantly changing. You can’t replicate the environment.
If ML is highly sensitive to conditions, and you can’t
replicate the conditions exactly, then… test in production.
▪ On real data
▪ With real network and server configurations
▪ And real concurrency
But also:
▪ Keep testing in dev/QA and CI/CD environments.
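One common way to test on real production traffic without acting on an unproven model is to run it in shadow mode. That technique is an assumption added here for illustration, not something the slides prescribe; `shadow_compare` and the two toy models are hypothetical.

```python
def shadow_compare(requests, production_model, candidate_model, tolerance):
    """Serve production answers; run the candidate silently and log disagreements."""
    served, disagreements = [], []
    for x in requests:
        live = production_model(x)
        shadow = candidate_model(x)  # computed on real traffic, never served
        served.append(live)
        if abs(live - shadow) > tolerance:
            disagreements.append((x, live, shadow))
    return served, disagreements


production = lambda x: x * 2
candidate = lambda x: x * 2 if x < 3 else x * 2 + 1  # diverges on larger inputs

served, disagreements = shadow_compare(
    [1, 2, 3, 4], production, candidate, tolerance=0.5
)
print(served)
print(disagreements)
```

The candidate sees real data, real load, and real configuration, while callers only ever receive the production model's answers.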
ML is not like code: Monitoring in production
Unlike BI, ML has different metrics for “correct”
The metrics are relative and can change over time
You must monitor performance closely, which is
like doing BI on your AI.
This calls for “observability”, because the problem may lie
not in the model but in the data or the infrastructure.
▪ Reduce the time to diagnose, rather than
emphasizing the prevention of coding errors
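Because the metrics are relative, a monitor compares recent performance with a baseline rather than with a fixed "correct" value. A minimal sketch, assuming a daily accuracy series: the `degraded` function, the window sizes, and the 10% tolerance are illustrative assumptions.

```python
from statistics import mean


def degraded(history, baseline_n, recent_n, max_relative_drop):
    """True when the recent average fell too far below the baseline average."""
    baseline = mean(history[:baseline_n])
    recent = mean(history[-recent_n:])
    relative_drop = (baseline - recent) / baseline
    return relative_drop > max_relative_drop


# Made-up daily accuracy figures, for illustration only.
stable = [0.80, 0.80, 0.80, 0.80, 0.80, 0.80]
slipping = [0.80, 0.80, 0.80, 0.80, 0.60, 0.60]

stable_alert = degraded(stable, baseline_n=4, recent_n=2, max_relative_drop=0.10)
slipping_alert = degraded(slipping, baseline_n=4, recent_n=2, max_relative_drop=0.10)
print(stable_alert, slipping_alert)
```

An alert like this only says that something changed; telling whether the cause is the model, the data, or the infrastructure still needs the surrounding telemetry.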
Machine learning is the smallest part of the environment
[Figure: a small "ML Code" box surrounded by larger boxes: Configuration, Data Collection, Feature Extraction, Data Verification, Machine Resource Management, Analysis Tools, Process Management Tools, Serving Infrastructure, Monitoring]
https://papers.nips.cc/paper/5656-hidden-technical-debt-in-machine-learning-systems
©2018 Teradata
How an organization works
Many applications, many activities
How do you get the full picture?
What does a person do when there’s a problem?
One report from each application isn’t a sustainable answer
The goal is to make decisions, not get reports
This is the “decision support” model
Diagram labels: Analyze, Decide, Act; KPIs, metrics
We had batch (and stream) models decades ago
e.g. segmentation, NBO queue, churn, fraud
Usually batch, run from the DW (but probably not on it), with the
resulting data loaded into the DW for use
Someone oversees and acts on the information
Diagram labels: Analyze, Decide, Act; KPIs, metrics
Applying analytics within a process context
Diagram labels: Analyze, Decide, Act
e.g. purchasing changes, upsell/cross-sell recommendations
Machines gain agency, humans lose it; “act” is curtailed
This is the human-in-the-loop model
When there’s a problem, the fix is a message
The model’s results should be visible via the KPIs and metrics
Act: People call people to see what’s happening
KPIs, metrics
Somebody built the models – the data scientist
More communication is required
The data scientist needs to observe and
change system behaviors
Enter the black boxes – the “autonomous” model
Black boxes still need oversight
Black boxes beget gray boxes because of speed
Three ML deployment categories
Decision support (aka BI) is the final arbiter of success.
▪ Autonomous
▪ Human in the loop
▪ Decision support
Distributed Agency
Today’s application
model of ML is usually
embedded in a fixed
central system.
Governing a model and
its application is more
complex when the ML
system is not controlled
via a central service.
This level of autonomy
is still in its infancy. Cf.
Roomba
Three ML deployment categories
▪ Autonomous
▪ Human in the loop
▪ Decision support
All of these independent / separate architectures are dependent on
some level of shared context. That means shared operational data,
managed over time.
ML and AI have a lot of requirements: no shortcuts
https://hackernoon.com/the-ai-hierarchy-of-needs-18f111fcc007
“Eliminate the time spent on data prep” – Nope
The work you do on the data is what makes it valuable.
You can’t eliminate the prep work without eliminating
good models. Instead, optimize workflows where most of
the time is spent.
The Lake + self-service model: Individuals get and
manage their own data, Yes but…
BYOT can lead to extreme behaviors
Self-service tradeoffs
Pay now or pay later,
but you will always pay.
The question is which
payment will be less.
Self-service gives
flexibility and agility, but
it can reduce repeatability,
add duplicated effort and
conflicts, and increase risk.
A Data Science Approach to data
One pipeline per process: redundant effort, cost, complexity, etc.
Diagram labels: Wellhead, Wellhead; Example use cases
Models in production at massive scale?
Most organizations have a project-based approach. This
makes it easy to deliver new projects.
With the silo/pipeline approach:
▪ If each model takes X% of effort to maintain, how many
models can you build before you use up 100% of your time?
▪ Automation helps, a little.
▪ Efficiency helps more.
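The X% question is simple arithmetic, and working it through shows how quickly capacity runs out. A minimal sketch with made-up percentages: `max_models` and `capacity_left` are hypothetical helpers, and the figures are illustrative, not measurements.

```python
from fractions import Fraction


def max_models(percent_per_model, staff=1):
    """Models that can be maintained before maintenance uses 100% of staff time."""
    share = Fraction(str(percent_per_model))  # exact arithmetic, no float drift
    return int(Fraction(100 * staff) // share)


def capacity_left(models, percent_per_model):
    """Percent of one person's time left for new work, never below zero."""
    used = Fraction(str(percent_per_model)) * models
    return max(Fraction(0), Fraction(100) - used)


print(max_models(5))            # one person, 5% of their time per model
print(max_models(5, staff=4))   # a team of four
print(max_models(2.5))          # halving the per-model effort
print(capacity_left(10, 5))     # time left for new projects at ten models
```

Adding staff scales the ceiling linearly, while cutting the per-model maintenance effort raises it for every model already in production.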
The projects-as-silos
and pipeline approach
will not work when
running models in
production at the
massive scale required
for total automation
[Chart: number of models in production compared with staffing, 2019 to 2029]
Moving from individual to shared environments
is harder than most vendors lead you to believe
It takes more than common tools to create a
functional environment
The enterprise focus needs to be on
repeatability - where it can be supported
The nature of data science and BI differs
• In data science, the data is unknown at the start. The process
creates a data model. The same schema may not be reusable.
• The equivalent to a report is not a model. That would be the
model’s output. The equivalent to a model is more like ETL.
• Data science may require access to more than one zone.
Source data → Model extract → Models (each shown on the slide with the same sample table):
1  Marge Inovera   $150,000  Statistician
2  Anita Bath      $120,000  Sewer inspector
3  Ivan Awfulitch  $160,000  Dermatologist
4  Nadia Geddit    $36,000   DBA
Managing data is a bigger problem than bigness
Data can be maintained at multiple levels: not raw or DW
▪ Ingredients. Goal: available. User needs a recipe in order to make use of the data.
▪ Pre-mixed. Goal: discoverable and integratable. User needs a menu to choose from the data available.
▪ Meals. Goal: usable. User needs utensils but is given a finished meal.
We need a discipline of AnalyticOps
We need to enable the
full end-to-end lifecycle.
No product will do this –
it’s a workflow, process,
and architecture problem.
ANALYTICS: external data, iteration, data-mining, statistics, value-driven, flexibility, exploration, discovery, modelling, blue-sky ideation
OPERATIONS: security, governance, compliance, curation, deployment, maintenance, integration, testing, engineering, process-driven
Lifecycle: Plan and Measure → Develop and Test → Release and Deploy → Monitor and Optimize
©2018 Teradata
Culture
The hard problem
is changing the
organization so
that it more
readily challenges
the rationale for
decisions, uses
data to back up
the discussion, and
creates new
explanations.
Moving from predictable rule-based systems to complex
mathematical systems, and from there to systems that
exhibit stochasticity, makes the task harder, not different.
One thing worse than
a black box is a
random black box.
Culture: Experimental Mindset
Sometimes you can’t build the thing
you want (meet the required OEC)
▪ ML is experimental, you should fail
▪ Budget to experiment – and fail?
▪ Data: type, quality, amount
▪ Technique: theoretical limits, appropriateness
▪ Feasibility: technical, resources and time
Useful background for online experiments
https://www.researchgate.net/publication/316116834_Online_Controlled_Experiments_and_AB_Testing
https://ai.stanford.edu/~ronnyk/2007GuideControlledExperiments.pdf
Analysts and engineers work from opposing directions
Layers: exploration, modeling, integration, applications, infrastructure
▪ Help people ask the right questions, frame them, define measurable goals
▪ Define models that run to determine answers or carry out actions
▪ Deliver the results / product in production, at scale
▪ Build data science models into applications and delivery systems
▪ Provide the systems and practices to build and run the desired models
Mark Madsen is an engineering fellow
at Teradata. Prior to that he was
president of Third Nature, a research
and consulting firm focused on
analytics, data integration and data
management. Mark is an award-
winning architect, author, and CTO
whose work has been featured in
numerous industry publications. He is
an international speaker and is
involved with several conferences in
the data science and analytics industry.
Mark Madsen

Mais conteúdo relacionado

Mais procurados

Assumptions about Data and Analysis: Briefing room webcast slides
Assumptions about Data and Analysis: Briefing room webcast slidesAssumptions about Data and Analysis: Briefing room webcast slides
Assumptions about Data and Analysis: Briefing room webcast slidesmark madsen
 
Wake up and smell the data
Wake up and smell the dataWake up and smell the data
Wake up and smell the datamark madsen
 
Strata Data Conference 2019 : Scaling Visualization for Big Data in the Cloud
Strata Data Conference 2019 : Scaling Visualization for Big Data in the CloudStrata Data Conference 2019 : Scaling Visualization for Big Data in the Cloud
Strata Data Conference 2019 : Scaling Visualization for Big Data in the CloudJaipaul Agonus
 
How to Build Data Science Teams
How to Build Data Science TeamsHow to Build Data Science Teams
How to Build Data Science TeamsGanes Kesari
 
Managing Data Science | Lessons from the Field
Managing Data Science | Lessons from the Field Managing Data Science | Lessons from the Field
Managing Data Science | Lessons from the Field Domino Data Lab
 
Intro to Data Science for Non-Data Scientists
Intro to Data Science for Non-Data ScientistsIntro to Data Science for Non-Data Scientists
Intro to Data Science for Non-Data ScientistsSri Ambati
 
Analytics 3.0 Measurable business impact from analytics & big data
Analytics 3.0 Measurable business impact from analytics & big dataAnalytics 3.0 Measurable business impact from analytics & big data
Analytics 3.0 Measurable business impact from analytics & big dataMicrosoft
 
What data scientists really do, according to 50 data scientists
What data scientists really do, according to 50 data scientistsWhat data scientists really do, according to 50 data scientists
What data scientists really do, according to 50 data scientistsHugo Bowne-Anderson
 
1. introduction to data science —
1. introduction to data science —1. introduction to data science —
1. introduction to data science —swethaT16
 
Everything Has Changed Except Us: Modernizing the Data Warehouse
Everything Has Changed Except Us: Modernizing the Data WarehouseEverything Has Changed Except Us: Modernizing the Data Warehouse
Everything Has Changed Except Us: Modernizing the Data Warehousemark madsen
 
Disruptive Innovation: how do you use these theories to manage your IT?
Disruptive Innovation: how do you use these theories to manage your IT?Disruptive Innovation: how do you use these theories to manage your IT?
Disruptive Innovation: how do you use these theories to manage your IT?mark madsen
 
Building Data Science Teams
Building Data Science TeamsBuilding Data Science Teams
Building Data Science TeamsEMC
 
Bi isn't big data and big data isn't BI (updated)
Bi isn't big data and big data isn't BI (updated)Bi isn't big data and big data isn't BI (updated)
Bi isn't big data and big data isn't BI (updated)mark madsen
 
Big dataplatform operationalstrategy
Big dataplatform operationalstrategyBig dataplatform operationalstrategy
Big dataplatform operationalstrategyHimanshu Bari
 
O'Reilly ebook: Machine Learning at Enterprise Scale | Qubole
O'Reilly ebook: Machine Learning at Enterprise Scale | QuboleO'Reilly ebook: Machine Learning at Enterprise Scale | Qubole
O'Reilly ebook: Machine Learning at Enterprise Scale | QuboleVasu S
 
A data view of the data science process
A data view of the data science processA data view of the data science process
A data view of the data science processMathieu d'Aquin
 
Reproducible Dashboards and other great things to do with Jupyter
Reproducible Dashboards and other great things to do with JupyterReproducible Dashboards and other great things to do with Jupyter
Reproducible Dashboards and other great things to do with JupyterDomino Data Lab
 
The Other 99% of a Data Science Project
The Other 99% of a Data Science ProjectThe Other 99% of a Data Science Project
The Other 99% of a Data Science ProjectEugene Mandel
 
Embracing data science
Embracing data scienceEmbracing data science
Embracing data scienceVipul Kalamkar
 

Mais procurados (20)

Assumptions about Data and Analysis: Briefing room webcast slides
Assumptions about Data and Analysis: Briefing room webcast slidesAssumptions about Data and Analysis: Briefing room webcast slides
Assumptions about Data and Analysis: Briefing room webcast slides
 
Wake up and smell the data
Wake up and smell the dataWake up and smell the data
Wake up and smell the data
 
Strata Data Conference 2019 : Scaling Visualization for Big Data in the Cloud
Strata Data Conference 2019 : Scaling Visualization for Big Data in the CloudStrata Data Conference 2019 : Scaling Visualization for Big Data in the Cloud
Strata Data Conference 2019 : Scaling Visualization for Big Data in the Cloud
 
How to Build Data Science Teams
How to Build Data Science TeamsHow to Build Data Science Teams
How to Build Data Science Teams
 
Managing Data Science | Lessons from the Field
Managing Data Science | Lessons from the Field Managing Data Science | Lessons from the Field
Managing Data Science | Lessons from the Field
 
Intro to Data Science for Non-Data Scientists
Intro to Data Science for Non-Data ScientistsIntro to Data Science for Non-Data Scientists
Intro to Data Science for Non-Data Scientists
 
Analytics 3.0 Measurable business impact from analytics & big data
Analytics 3.0 Measurable business impact from analytics & big dataAnalytics 3.0 Measurable business impact from analytics & big data
Analytics 3.0 Measurable business impact from analytics & big data
 
Machine Learning in Big Data
Machine Learning in Big DataMachine Learning in Big Data
Machine Learning in Big Data
 
What data scientists really do, according to 50 data scientists
What data scientists really do, according to 50 data scientistsWhat data scientists really do, according to 50 data scientists
What data scientists really do, according to 50 data scientists
 
1. introduction to data science —
1. introduction to data science —1. introduction to data science —
1. introduction to data science —
 
Everything Has Changed Except Us: Modernizing the Data Warehouse
Everything Has Changed Except Us: Modernizing the Data WarehouseEverything Has Changed Except Us: Modernizing the Data Warehouse
Everything Has Changed Except Us: Modernizing the Data Warehouse
 
Disruptive Innovation: how do you use these theories to manage your IT?
Disruptive Innovation: how do you use these theories to manage your IT?Disruptive Innovation: how do you use these theories to manage your IT?
Disruptive Innovation: how do you use these theories to manage your IT?
 
Building Data Science Teams
Building Data Science TeamsBuilding Data Science Teams
Building Data Science Teams
 
Bi isn't big data and big data isn't BI (updated)
Bi isn't big data and big data isn't BI (updated)Bi isn't big data and big data isn't BI (updated)
Bi isn't big data and big data isn't BI (updated)
 
Big dataplatform operationalstrategy
Big dataplatform operationalstrategyBig dataplatform operationalstrategy
Big dataplatform operationalstrategy
 
O'Reilly ebook: Machine Learning at Enterprise Scale | Qubole
O'Reilly ebook: Machine Learning at Enterprise Scale | QuboleO'Reilly ebook: Machine Learning at Enterprise Scale | Qubole
O'Reilly ebook: Machine Learning at Enterprise Scale | Qubole
 
A data view of the data science process
A data view of the data science processA data view of the data science process
A data view of the data science process
 
Reproducible Dashboards and other great things to do with Jupyter
Reproducible Dashboards and other great things to do with JupyterReproducible Dashboards and other great things to do with Jupyter
Reproducible Dashboards and other great things to do with Jupyter
 
The Other 99% of a Data Science Project
The Other 99% of a Data Science ProjectThe Other 99% of a Data Science Project
The Other 99% of a Data Science Project
 
Embracing data science
Embracing data scienceEmbracing data science
Embracing data science
 

Semelhante a Operationalizing Machine Learning in the Enterprise

Putting data science in your business a first utility feedback
Putting data science in your business a first utility feedbackPutting data science in your business a first utility feedback
Putting data science in your business a first utility feedbackPeculium Crypto
 
Leveraging Generative AI & Best practices
Leveraging Generative AI & Best practicesLeveraging Generative AI & Best practices
Leveraging Generative AI & Best practicesDianaGray10
 
SDD2017 - 03 Abed Ajraou - putting data science in your business a first uti...
SDD2017 - 03 Abed Ajraou  - putting data science in your business a first uti...SDD2017 - 03 Abed Ajraou  - putting data science in your business a first uti...
SDD2017 - 03 Abed Ajraou - putting data science in your business a first uti...Dario Mangano
 
BDW17 London - Abed Ajraou - First Utility - Putting Data Science in your Bus...
BDW17 London - Abed Ajraou - First Utility - Putting Data Science in your Bus...BDW17 London - Abed Ajraou - First Utility - Putting Data Science in your Bus...
BDW17 London - Abed Ajraou - First Utility - Putting Data Science in your Bus...Big Data Week
 
Salesforce Architect Group, Frederick, United States July 2023 - Generative A...
Salesforce Architect Group, Frederick, United States July 2023 - Generative A...Salesforce Architect Group, Frederick, United States July 2023 - Generative A...
Salesforce Architect Group, Frederick, United States July 2023 - Generative A...NadinaLisbon1
 
The 3 Key Barriers Keeping Companies from Deploying Data Products
The 3 Key Barriers Keeping Companies from Deploying Data Products The 3 Key Barriers Keeping Companies from Deploying Data Products
The 3 Key Barriers Keeping Companies from Deploying Data Products Dataiku
 
Questions On Technical Design Decisions
Questions On Technical Design DecisionsQuestions On Technical Design Decisions
Questions On Technical Design DecisionsRikki Wright
 
Transform Banking with Big Data and Automated Machine Learning 9.12.17
Transform Banking with Big Data and Automated Machine Learning 9.12.17Transform Banking with Big Data and Automated Machine Learning 9.12.17
Transform Banking with Big Data and Automated Machine Learning 9.12.17Cloudera, Inc.
 
Big Data LDN 2018: HOW AUTOMATION CAN ACCELERATE THE DELIVERY OF MACHINE LEAR...
Big Data LDN 2018: HOW AUTOMATION CAN ACCELERATE THE DELIVERY OF MACHINE LEAR...Big Data LDN 2018: HOW AUTOMATION CAN ACCELERATE THE DELIVERY OF MACHINE LEAR...
Big Data LDN 2018: HOW AUTOMATION CAN ACCELERATE THE DELIVERY OF MACHINE LEAR...Matt Stubbs
 
Making better use of Data and AI in Industry 4.0
Making better use of Data and AI in Industry 4.0Making better use of Data and AI in Industry 4.0
Making better use of Data and AI in Industry 4.0Albert Y. C. Chen
 
ML Times: Mainframe Machine Learning Initiative- June newsletter (2018)
ML Times: Mainframe Machine Learning Initiative- June newsletter (2018)ML Times: Mainframe Machine Learning Initiative- June newsletter (2018)
ML Times: Mainframe Machine Learning Initiative- June newsletter (2018)Leslie McFarlin
 
Ai and Design: When, Why and How? - Morgenbooster
Ai and Design: When, Why and How? - MorgenboosterAi and Design: When, Why and How? - Morgenbooster
Ai and Design: When, Why and How? - Morgenbooster1508 A/S
 
Industrializing Data Science: Transform into an End-to-End, Analytics-Oriente...
Industrializing Data Science: Transform into an End-to-End, Analytics-Oriente...Industrializing Data Science: Transform into an End-to-End, Analytics-Oriente...
Industrializing Data Science: Transform into an End-to-End, Analytics-Oriente...Dana Gardner
 
(In)convenient truths about applied machine learning
(In)convenient truths about applied machine learning(In)convenient truths about applied machine learning
(In)convenient truths about applied machine learningMax Pagels
 
Machine learning at b.e.s.t. summer university
Machine learning  at b.e.s.t. summer universityMachine learning  at b.e.s.t. summer university
Machine learning at b.e.s.t. summer universityLászló Kovács
 
The Role of Data Wrangling in Driving Hadoop Adoption
The Role of Data Wrangling in Driving Hadoop AdoptionThe Role of Data Wrangling in Driving Hadoop Adoption
The Role of Data Wrangling in Driving Hadoop AdoptionInside Analysis
 






Operationalizing Machine Learning in the Enterprise

  • 1. Operationalizing Machine Learning in the Enterprise TDWI / BARC Munich June 22, 2019 Mark Madsen
  • 2. Copyright Third Nature, Inc. PSA: The reality behind most ML/AI production case studies TL;DR Embedded ML and AI is harder than people realize. Most companies will not be able to do it at the current state of IT and market maturity.
  • 3. Overview We won’t: ▪ Talk much about development and building of models, since this is about operations ▪ Talk about technology or techniques, since this is about operations We will: ▪ Talk a lot about concepts, observations, and practices
  • 4. What does analytics management care about? There is a key stakeholder: analytics management (the CAO, CDO, or VP of analytics, aka “your boss” if you’re a data scientist). The person responsible for oversight of the team and its efforts has a perspective, and a set of problems, that span the organization and multiple projects.
  • 5. Explainability (or interpretability)
  • 6. Job #1 - Repeatability
  • 7. Job #2 - Operational predictability
  • 8. Job #3 - Reproducibility
  • 9. Scientific reproducibility Can you get the same results given the same starting conditions? ▪ Experimental replication may not be the same for many reasons ▪ Detailed statistical analysis is needed ▪ And critical assessment of experiments
  • 10. Data science reproducibility Can you get the same results on the same data? ▪ Direct replication is expected ▪ Assumption is that same input = same output ▪ You can have this with an unexplainable box ▪ But there are also confounding factors
  • 11. Interpretability and reproducibility are driven by trust, which only matters when there is enough risk ▪ Regulation and compliance ▪ Material decisions ▪ Big penalties Complication: the costs of false positives and false negatives are usually different, so model characteristics matter here. [Chart: cost of error vs. frequency of decision, each running low to high, with regions labeled “Don’t care” and “Oh crap”]
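The point about unequal error costs can be made concrete with a little arithmetic. The sketch below is illustrative only: the cost figures, error rates, and the names `ErrorCosts` and `expected_error_cost` are assumptions, not from the talk. It shows two hypothetical models with the same overall error rate but very different expected cost.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ErrorCosts:
    false_positive: float  # cost of acting when no action was warranted
    false_negative: float  # cost of missing a case that needed action


def expected_error_cost(fp_rate: float, fn_rate: float,
                        decisions: int, costs: ErrorCosts) -> float:
    """Expected cost of errors over `decisions` decisions.

    fp_rate and fn_rate are fractions of all decisions (illustrative inputs).
    """
    per_decision = fp_rate * costs.false_positive + fn_rate * costs.false_negative
    return decisions * per_decision


# Hypothetical numbers: both models are wrong on 6% of decisions.
costs = ErrorCosts(false_positive=10.0, false_negative=500.0)
model_a = expected_error_cost(fp_rate=0.05, fn_rate=0.01, decisions=1000, costs=costs)
model_b = expected_error_cost(fp_rate=0.01, fn_rate=0.05, decisions=1000, costs=costs)
print(f"model A: {model_a:.0f}, model B: {model_b:.0f}")
```

With these made-up costs, the model that makes more false positives is the cheaper one to run, which is why accuracy alone does not settle the choice.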
  • 12. The real questions Can I support this answer at a later date? ▪ Do I care? • Is the risk (cost) worth worrying about? • Is the cost of reproducibility less than the risk and cost? ▪ Do I only need to justify it? • Interpretability and trust may be enough ▪ Do I need to reproduce the results? • May not need interpretability • Need a lot of other things
  • 13. The real need is trust. Our trust is based on all the elements that are involved, not just the model. The higher the stakes the more you must think about all the ways it could be wrong, because we all want to be right. Reliability and robustness of the technology environment is as important as the model. Reproducibility is everyone’s problem – it is an operational concern.
  • 14. The technology components are relevant to reproducibility
  • 15. Starting with the process everyone does: building stuff ▪ Define the business problem ▪ Translate the problem into an analytic context ▪ Select appropriate data ▪ Learn the data ▪ Create a model set ▪ Fix problems with data ▪ Transform data ▪ Build models ▪ Assess models ▪ Deploy models ▪ Assess results [Chart annotation: % of time spent, 70% and 30%] Source: Michael Berry, Data Miners Inc.
  • 16. "Always design a thing by considering it in its next larger context - a chair in a room, a room in a house, a house in an environment, an environment in a city plan." – Eliel Saarinen
  • 17. Expanding the perspective beyond the initial bit • There are upstream parts to the development process: collecting and managing data, both for dev and in prod. • There are downstream parts, in deployment and then in production operation. • Data and artifacts are exchanged as part of the workflows. [Diagram: Collect, Build, Deploy, Operate]
  • 18. Deploying autonomous ML is one of the biggest challenges
  • 19. The operation workflow itself is complicated [Diagram: the Operate stage as a loop of Sense, Process, Interact, Learn, exchanging Inputs, Actions, and Obs] ▪ Sense: collect inputs, AKA clean data ▪ Process (“reasoning”): execute model to select and post actions ▪ Interact (“acting”): perform the desired actions ▪ Learn: measure, compare, adjust. Learning could be human methods (manual adjustment) or machine methods (e.g. reinforcement learning), which change the sensing and processing.
  • 20. Feedback requires lots of data that you must record [Diagram: Sense, Process, Interact, Learn loop with Inputs, Actions, Obs] ▪ Record the inputs ▪ Record the action and the expected outcome (tracking metrics and OEC) ▪ Record the execution ▪ Record the metric deltas ▪ Record the changes Data volumes explode with all the telemetry: 1 execution = raw data in, inputs, the action, each metric used (expected values), execution log, metric data (actuals), deltas, model changes, technical resource information
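One way to picture what "1 execution" has to carry is a single record type that holds the items listed on the slide. This is a minimal sketch, assuming an in-memory record; the class name `ExecutionRecord` and its field names are hypothetical and a real system would add the execution log and resource information.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class ExecutionRecord:
    """Telemetry for one model execution (hypothetical field names)."""
    model_version: str
    raw_input: dict   # raw data in, before cleaning
    features: dict    # the cleaned inputs the model actually saw
    action: str       # what the model chose to do
    expected: dict    # metric name -> expected value at decision time
    actual: dict = field(default_factory=dict)  # filled in once outcomes are observed
    recorded_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc))

    def deltas(self) -> dict:
        """Actual minus expected, for every metric that has been observed."""
        return {name: self.actual[name] - exp
                for name, exp in self.expected.items()
                if name in self.actual}


rec = ExecutionRecord(
    model_version="v1",
    raw_input={"basket_total": "120.00"},
    features={"basket_total": 120.0},
    action="offer_discount",
    expected={"conversion": 0.25},
)
rec.actual["conversion"] = 0.5  # observed later
print(rec.deltas())
```

Keeping expected and actual values side by side in the same record is what makes the later "measure, compare, adjust" step a lookup rather than a reconstruction.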
  • 21. Criteria for models: not just accuracy! ▪ Predictive accuracy: highly accurate model ▪ Speed: fast processing time for training and test, OR for execution ▪ Simplicity: resulting model has few parameters and is easy to monitor and explain ▪ Robustness: results are stable over time ▪ Scalability: model can handle growing data volume and/or high concurrency ▪ Interpretability: easy to understand model results You must track performance in development and production relative to the metrics that are most important, in addition to the OEC.
  • 22. ML is not like code: it can get better in production ML in production usually starts out at the expected level and improves over time, if you are doing it right. Continuous improvement is the norm. ML always goes wrong at some point. The best way to protect against that is to constantly monitor and test models on real production data, e.g. sometimes your training data is not representative in unexpected ways. See “Excessive Invariance Causes Adversarial Vulnerability”, https://openreview.net/forum?id=BkfbpsAcF7
  • 23. A (contrived) example: Detect dogs
  • 24. Dog or Not Dog?
  • 25. Dog or Not Dog: 100%! Um, why?
  • 26. Beyond toy examples, this problem matters The AI edge case error problems will limit AI applicability until they are solved (don’t hold your breath). The uncertainty challenge means: ▪ Calculate error costs and apply them to your model before (and after) ▪ Use AI for low-cost errors or with a human in the loop (HitL)
  • 27. You need to protect against model execution problems You have to track the actions / executions and their results, including the OEC, in real time, to protect against failures. This adds monitors and circuit breakers. [Diagram: Sense, Process, Interact, Learn loop with Inputs, Actions, Obs, plus a Monitor and a Breaker]
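A circuit breaker in this setting could look roughly like the sketch below: it watches recent outcomes and, once too many have failed, routes decisions to a fallback instead of the model. The class name `ModelCircuitBreaker`, the thresholds, and the fallback rule are all assumptions for illustration; a production breaker would also need a way to close again after recovery.

```python
from collections import deque


class ModelCircuitBreaker:
    """Trips to a fallback when the recent failure rate gets too high."""

    def __init__(self, window=50, max_failure_rate=0.2, min_samples=10):
        self.outcomes = deque(maxlen=window)  # True = outcome was acceptable
        self.max_failure_rate = max_failure_rate
        self.min_samples = min_samples
        self.open = False  # open breaker = model is bypassed

    def record(self, ok: bool) -> None:
        """Monitor step: record one outcome and trip the breaker if needed."""
        self.outcomes.append(ok)
        if len(self.outcomes) >= self.min_samples:
            failure_rate = self.outcomes.count(False) / len(self.outcomes)
            if failure_rate > self.max_failure_rate:
                self.open = True

    def decide(self, model, fallback, inputs):
        """Use the model while the breaker is closed, else the fallback."""
        return fallback(inputs) if self.open else model(inputs)


breaker = ModelCircuitBreaker(window=10, max_failure_rate=0.2, min_samples=5)
for ok in [True, True, True, True, False, False]:
    breaker.record(ok)

# Hypothetical model and fallback: the fallback is a fixed safe action.
action = breaker.decide(lambda x: "model_action", lambda x: "safe_default", {})
print(breaker.open, action)
```

The fallback here is a fixed safe action; handing the decision to a person would be another reasonable choice for the open state.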
  • 28. ML Principle: CACE, Change Anything Change Everything In embedded or autonomous ML everything is connected. Events usually happen in real time. ML is very sensitive to context and input. “So what if I changed NumPy in dev?”
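One modest guard against the "I changed NumPy in dev" problem is to record a fingerprint of the software environment alongside each model, so a mismatch between dev and prod is at least visible. This is a sketch under assumptions: the function names are hypothetical, the package list is whatever you choose to track, and a version hash covers only one small part of what CACE implies.

```python
import hashlib
import json
import platform
from importlib import metadata


def collect_versions(packages):
    """Return {name: version} for the interpreter and the named packages."""
    versions = {"python": platform.python_version()}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = "not installed"
    return versions


def fingerprint(versions):
    """Stable short hash of a version mapping, independent of key order."""
    canonical = json.dumps(versions, sort_keys=True).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()[:12]


# Illustrative version strings, not taken from the talk.
dev = {"python": "3.7.3", "numpy": "1.17.0"}
prod = {"python": "3.7.3", "numpy": "1.16.4"}
if fingerprint(dev) != fingerprint(prod):
    changed = sorted(k for k in dev if dev[k] != prod.get(k))
    print("environment drift in:", changed)
```

In practice the mapping would come from `collect_versions([...])` at training time and again at serving time, with the two fingerprints compared before a model is allowed to run.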
  • 29. “A production ML system is never all green” Much of the time, the ML app is a distributed system. Distributed systems are hard. Monolithic architectures are great if you can use them. This is fine. Everything is fine.
  • 30. Not just protection - diagnostics You need telemetry about the entire environment for monitoring, but you also need it for diagnostics. This means you need to think about observability. [Diagram: Sense, Process, Interact, Learn loop with Inputs, Actions, Obs, plus a Monitor and a Breaker]
  • 31. Ask yourself “What could go wrong?” and you’ll probably be right.
  • 32. Therefore: Test in Production? Bad, right? Goes against everything IT says about testing ▪ This is a culture change for IT, and a hard one to make. But production is always-on, real time. Conditions are constantly changing. You can’t replicate the environment. If ML is highly sensitive to conditions, and you can’t replicate the conditions exactly, then… test in production. ▪ On real data ▪ With real network and server configurations ▪ And real concurrency But also: ▪ Keep testing in dev/QA and CI/CD environments.
  • 33. ML is not like code: Monitoring in production Unlike BI, ML has different metrics for “correct”. The metrics are relative and can change over time. You must monitor performance closely, which is like doing BI on your AI. This calls for “observability”, because a problem may not be the model but the data, or the infrastructure. ▪ Reduce the time to diagnose, rather than emphasizing the prevention of coding errors
  • 34. Machine learning is the smallest part of the environment [Diagram components: ML Code, Analysis Tools, Data Collection, Machine Resource Management, Serving Infrastructure, Feature Extraction, Configuration, Data Verification, Process Management Tools, Monitoring] Source: https://papers.nips.cc/paper/5656-hidden-technical-debt-in-machine-learning-systems ©2018 Teradata
  • 35. How an organization works
  • 36. Many applications, many activities
  • 37. How do you get the full picture? What does a person do when there’s a problem? One report from each application isn’t a sustainable answer.
  • 38. The goal is to make decisions, not get reports This is the “decision support” model [Diagram: Analyze, Decide, Act; KPIs, metrics]
  • 39. We had batch (and stream) models decades ago e.g. segmentation, NBO queue, churn, fraud Usually batch, ran from the DW (but probably not on it), resulting data loaded into the DW for use Someone oversees and acts on the information [Diagram: Analyze, Decide, Act; KPIs, metrics]
  • 40. Applying analytics within a process context e.g. purchasing changes, upsell/cross-sell recommendations Machines gain agency, humans lose it; “act” is curtailed This is the human-in-the-loop model [Diagram: Analyze, Decide, Act]
  • 41. When there’s a problem, the fix is a message The model’s results should be visible via the KPIs and metrics Act: People call people to see what’s happening
  • 42. Somebody built the models – the data scientist More communication is required The data scientist needs to observe and change system behaviors
  • 43. Enter the black boxes – the “autonomous” model
  • 44. Black boxes still need oversight
  • 45. Black boxes beget gray boxes because of speed
  • 46. Three ML deployment categories ▪ Autonomous ▪ Human in the loop ▪ Decision support Decision support (aka BI) is final arbiter of success.
  • 47. Distributed Agency Today’s application model of ML is usually embedded in a fixed central system. Governing a model and its application is more complex when the ML system is not controlled via a central service. This level of autonomy is still in its infancy. Cf. Roomba
  • 48. Copyright Third Nature, Inc.Copyright Third Nature, Inc. Three ML deployment categories Autonomous Human in the loop Decision support All of these independent / separate architectures are dependent on some level of shared context. That means shared operational data, managed over time.
  • 49. Copyright Third Nature, Inc.Copyright Third Nature, Inc. ML and AI have a lot of requirements: no shortcuts https://hackernoon.com/the-ai-hierarchy-of-needs-18f111fcc007
  • 50. Copyright Third Nature, Inc.Copyright Third Nature, Inc. “Eliminate the time spent on data prep” – Nope The work you do on the data is what makes it valuable. You can’t eliminate the prep work without eliminating good models. Instead, optimize workflows where most of the time is spent.
  • 51. The Lake + self-service model: individuals get and manage their own data. Yes, but…
  • 52. BYOT can lead to extreme behaviors
  • 53. Self-service tradeoffs. Pay now or pay later, but you will always pay. The question is which payment will be less. Self-service gives flexibility and agility, but can reduce repeatability, add duplicated effort and conflicts, and increase risk.
  • 54. A Data Science Approach to data: One-Pipeline-Per-Process. [Figure labels: Well head; Example use cases; Redundant effort / cost / complexity / etc.]
  • 55. Models in production at massive scale? Most organizations have a project-based approach. This makes it easy to deliver new projects. With the silo/pipeline approach: ▪ If each model takes X% of effort to maintain, how many models can you build before you use up 100% of your time? ▪ Automation helps, a little. ▪ Efficiency helps more. The projects-as-silos and pipeline approach will not work when running models in production at the massive scale required for total automation. [Chart: number of models in production vs. staffing, 2019 to 2029.]
  • 56. Moving from individual to shared environments is harder than most vendors lead you to believe
  • 57. It takes more than common tools to create a functional environment
  • 58. The enterprise focus needs to be on repeatability - where it can be supported
  • 59. The nature of data science and BI differs. ▪ In data science, the data is unknown at the start. The process creates a data model. The same schema may not be reusable. ▪ The equivalent to a report is not a model. That would be the model’s output. The equivalent to a model is more like ETL. ▪ Data science may require access to more than one zone. [Figure: the same sample table (1 Marge Inovera, $150,000, Statistician; 2 Anita Bath, $120,000, Sewer inspector; 3 Ivan Awfulitch, $160,000, Dermatologist; 4 Nadia Geddit, $36,000, DBA) shown at three stages: source data, model extract, models.]
  • 60. Managing data is a bigger problem than bigness
  • 61. Data can be maintained at multiple levels: not raw or DW. ▪ Ingredients. Goal: available. User needs a recipe in order to make use of the data. ▪ Pre-mixed. Goal: discoverable and integratable. User needs a menu to choose from the data available. ▪ Meals. Goal: usable. User needs utensils but is given a finished meal.
  • 62. We need a discipline of AnalyticOps. We need to enable the full end-to-end lifecycle. No product will do this – it’s a workflow, process, and architecture problem. [Figure, ©2018 Teradata: ANALYTICS (value-driven: external data, iteration, data-mining, statistics, flexibility, exploration, discovery, modelling, blue-sky ideation) and OPERATIONS (process-driven: security, governance, compliance, curation, deployment, maintenance, integration, testing, engineering), with the cycle Plan and Measure, Develop and Test, Release and Deploy, Monitor and Optimize.]
  • 63. Culture. The hard problem is changing the organization so that it more readily challenges the rationale for decisions, uses data to back up the discussion, and creates new explanations.
  • 64. Moving from predictable rule-based systems to complex mathematical systems, and from there to systems that exhibit stochasticity, makes the task harder, not different. One thing worse than a black box is a random black box.
  • 65. Culture: Experimental Mindset. Sometimes you can’t build the thing you want (meet the required OEC). ▪ ML is experimental, you should fail ▪ Budget to experiment – and fail? ▪ Data: type, quality, amount ▪ Technique: theoretical limits, appropriateness ▪ Feasibility: technical, resources and time. Useful background for online experiments: https://www.researchgate.net/publication/316116834_Online_Controlled_Experiments_and_AB_Testing https://ai.stanford.edu/~ronnyk/2007GuideControlledExperiments.pdf
  • 66. Analysts and engineers work from opposing directions. Layers: exploration, modeling, integration, applications, infrastructure. Activities: ▪ help people ask the right questions, frame them, define measurable goals ▪ define models that run to determine answers or carry out actions ▪ deliver the results / product in production, at scale ▪ build data science models into applications and delivery systems ▪ provide the systems and practices to build and run the desired models
  • 67. Mark Madsen is an engineering fellow at Teradata. Prior to that he was president of Third Nature, a research and consulting firm focused on analytics, data integration and data management. Mark is an award-winning architect, author, and CTO whose work has been featured in numerous industry publications. He is an international speaker and is involved with several conferences in the data science and analytics industry.
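
The question on slide 55 (if each model takes X% of effort to maintain, how many models can you build before you use up 100% of your time?) is simple arithmetic. A minimal sketch in Python, assuming maintenance cost is a constant percentage per model and adds up linearly. The function names and the `staff` parameter are illustrative and do not come from the slides.

```python
from fractions import Fraction


def models_until_saturated(effort_pct_per_model, staff=1):
    """Number of models that can be maintained before upkeep
    consumes 100% of the available staff time.

    effort_pct_per_model: percent of one person's time per model (assumed constant).
    staff: number of full-time people available (hypothetical parameter).
    """
    # Fractions avoid floating-point surprises such as 1.0 // 0.05 == 19.0.
    effort = Fraction(str(effort_pct_per_model))
    if effort <= 0:
        raise ValueError("effort per model must be positive")
    total_pct = Fraction(staff) * 100
    return int(total_pct // effort)


def time_left_for_new_work(n_models, effort_pct_per_model, staff=1):
    """Percent of staff time left for new projects after maintaining n_models."""
    used = Fraction(str(effort_pct_per_model)) * n_models
    total_pct = Fraction(staff) * 100
    return float(max(total_pct - used, 0))


if __name__ == "__main__":
    for pct in (10, 5, 2.5):
        print(f"{pct}% per model -> saturated at {models_until_saturated(pct)} models")
    print("time left with 10 models at 5% each:", time_left_for_new_work(10, 5))
```

Under these assumptions the ceiling grows only linearly with staffing, and lowering the per-model effort raises it faster than adding people, which is in line with the slide's point that efficiency helps more than automation alone.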