© 2017 MapR Technologies 1
Machine Learning Model Management
How the rendezvous framework works
© 2017 MapR Technologies 2
Contact Information
Ted Dunning, PhD
Chief Application Architect, MapR Technologies
Committer, PMC member, board member, ASF
O’Reilly author
Email: tdunning@mapr.com, tdunning@apache.org
Twitter: @Ted_Dunning
© 2017 MapR Technologies 3
Traditional View
© 2017 MapR Technologies 4
Traditional View: This isn’t the whole story
© 2017 MapR Technologies 5
90% of the effort in successful machine
learning isn’t in the training or model dev…
It’s the logistics
© 2017 MapR Technologies 6
Rendezvous Architecture
[Diagram: request, Input, Model 1, Model 2, Model 3, Scores, Rendezvous, Results, response]
© 2017 MapR Technologies 7
What We Ultimately Want
[Diagram: request, Model, response]
© 2017 MapR Technologies 8
But This Isn’t The Answer
[Diagram: request, Load balancer, Model 1, Model 2, Model 3, response]
© 2017 MapR Technologies 9
First Try with Streams
[Diagram: request, Input, Model 1, Model 2, Model 3, ?, response]
© 2017 MapR Technologies 10
First Rendezvous
[Diagram: request, Input, Model 1, Model 2, Model 3, Scores, Rendezvous, Results, response]
© 2017 MapR Technologies 11
Some Key Points
• Note that all models see identical inputs
• All models run in a production setting
• All models send scores to the same stream
• The rendezvous server decides which scores to ignore
• Roll forward, roll back, and correlated comparison are all now trivial
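The key points above can be sketched in a few lines. This is a minimal illustration, not the framework itself: plain Python lists stand in for the input and scores streams, and the names `run_models`, `rendezvous`, and `preferred` are hypothetical. It shows why roll forward and roll back become trivial: every model has already scored every request, so switching models only changes which score the rendezvous step keeps.

```python
from collections import defaultdict

def run_models(models, requests):
    """Every model sees every request; all scores go to one shared list."""
    scores = []
    for request in requests:
        for name, model in models.items():
            scores.append({"messageId": request["messageId"],
                           "model": name,
                           "score": model(request["x"])})
    return scores

def rendezvous(scores, preferred):
    """Keep one score per request: the first model in `preferred` that answered."""
    by_request = defaultdict(dict)
    for s in scores:
        by_request[s["messageId"]][s["model"]] = s["score"]
    results = {}
    for message_id, answers in by_request.items():
        for name in preferred:
            if name in answers:
                results[message_id] = (name, answers[name])
                break
    return results

# Two toy models (assumptions for illustration only).
models = {"model1": lambda x: x * 2, "model2": lambda x: x + 1}
requests = [{"messageId": "a", "x": 3}, {"messageId": "b", "x": 10}]
scores = run_models(models, requests)

live = rendezvous(scores, preferred=["model1", "model2"])    # normal operation
rolled = rendezvous(scores, preferred=["model2", "model1"])  # roll to model2
```

Because both results come from the same `scores`, comparing `live` and `rolled` is a correlated comparison on identical inputs.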
© 2017 MapR Technologies 12
Reality Check, Injecting External State
[Diagram: The world, request, Raw, Add external data, Database, Input, Model 1, Model 2, Model 3]
© 2017 MapR Technologies 13
Recording Raw Data (as it really was)
[Diagram: Input, Decoy, Model 2, Model 3, Scores, Archive]
© 2017 MapR Technologies 14
Quality & Reproducibility of Input Data Are Important!
• Recording raw-ish data is really a big deal
– Data as seen by a model is worth gold
– Data reconstructed later often has time-machine leaks
– Databases were made for updates, streams are safer
• Raw data is useful for non-ML cases as well (think flexibility)
• Decoy model records training data as seen by models under
development & evaluation
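A decoy can be sketched as a model that scores nothing and archives everything. The class name `DecoyModel`, the `score` method, and the in-memory `archive` list are assumptions for illustration; a real deployment would write to an append-only stream or file. The point shown is that the record is frozen at the moment the model saw it, so later updates to the source data cannot leak into training data.

```python
import json

class DecoyModel:
    """Looks like a model to the framework but only archives what it sees."""

    def __init__(self):
        self.archive = []  # stand-in for an append-only archive

    def score(self, message):
        # Serialize immediately so later mutation of the message cannot leak in.
        self.archive.append(json.dumps(message, sort_keys=True))
        return None  # a decoy never produces a score

decoy = DecoyModel()
msg = {"messageId": "a", "features": {"age_days": 12}}
decoy.score(msg)

msg["features"]["age_days"] = 13          # the "database" is updated later
replayed = json.loads(decoy.archive[0])   # the archive still shows what the model saw
```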
© 2017 MapR Technologies 15
Canary for Comparison
[Diagram: Input, Real model, Canary, ∆, Result, Decoy, Archive]
© 2017 MapR Technologies 16
What Does the Canary Do?
• The canary is a real model, but is very rarely updated
• The canary results are almost never used for decisioning
• The virtue of the canary is stability
• Comparing to the canary results gives insight into new models
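One way to picture the comparison against the canary: match scores on the message ID and look at the per-request difference. The function name `canary_deltas` and the toy score dictionaries are hypothetical; the slides do not say which statistic to compute, so the mean shift here is only one possible choice.

```python
from statistics import mean

def canary_deltas(canary_scores, model_scores):
    """Per-request difference between a model and the canary, matched on message ID."""
    shared = canary_scores.keys() & model_scores.keys()
    return {message_id: model_scores[message_id] - canary_scores[message_id]
            for message_id in shared}

# Toy scores keyed by message ID (made up for illustration).
canary = {"a": 0.50, "b": 0.25, "c": 0.75}
candidate = {"a": 0.75, "b": 0.25, "d": 0.5}

deltas = canary_deltas(canary, candidate)
mean_shift = mean(deltas.values())
```

Since the canary rarely changes, a shift in these deltas over time points at the new model or at the input data rather than at the baseline.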
© 2017 MapR Technologies 17
Isolated Development With Stream Replication
[Diagram: Production (The world, request, Raw, Add external data, Input, Model 1, Model 2, Model 3, Internal 1, Internal 2, Internal 3) and Development (Raw, New external data, Input, Model 4, Internal 4)]
© 2017 MapR Technologies 18
[Diagram: Raw, Features / profiles, Input, m1, m2, m3, Decoy, Archive, Scores]
© 2017 MapR Technologies 19
[Diagram: Raw, Features / profiles, Input, m1, m2, m3, Decoy, Archive, Scores, Rendezvous, Results]
© 2017 MapR Technologies 20
[Diagram: Raw, Features / profiles, Input, m1, m2, m3, Decoy, Archive, Scores, Rendezvous, Results, Metrics, Metrics]
© 2017 MapR Technologies 21
Some Details
• Inside the rendezvous server
– Message contents … highlight return address
– Rendezvous mailbox
– Schedule ideas
• Inside a model container
– Identical inputs make scaling easy
– Nearly stateless models
– Streaming shims, latency rig
© 2017 MapR Technologies 22
Message Content
• Input request contains request data plus administrivia
{
timestamp: 1501020498314,
messageId: "2a5f2b61fdd848d7954a51b49c2a9e2c",
return: "proxy-217",
provenance: { ... },
diagnostics: { ... },
... application-specific data here ...
}
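A request envelope like the one above could be built as follows. This is a sketch under assumptions: the class name `Request` and the `payload` and `return_address` parameter names are invented here, and the slide does not say how IDs or timestamps are generated. A random 32-character hex ID and a millisecond timestamp are used because they match the shape of the sample values.

```python
import time
import uuid
from dataclasses import dataclass, field

@dataclass
class Request:
    payload: dict           # application-specific data
    return_address: str     # serialized as "return", e.g. "proxy-217"
    timestamp: int = field(default_factory=lambda: int(time.time() * 1000))
    message_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    provenance: dict = field(default_factory=dict)
    diagnostics: dict = field(default_factory=dict)

    def to_message(self):
        """Flatten the envelope and the payload into one message."""
        return {
            "timestamp": self.timestamp,
            "messageId": self.message_id,
            "return": self.return_address,
            "provenance": self.provenance,
            "diagnostics": self.diagnostics,
            **self.payload,
        }

message = Request(payload={"amount": 12.5}, return_address="proxy-217").to_message()
```

The return address travels with the request, so whichever rendezvous instance picks the result knows where to send the response.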
© 2017 MapR Technologies 23
Rendezvous Schedules
• Simple part
– Up to deadline, accept preferred models
– Up to next deadline, accept more models
– Near final deadline, accept default answer
• But also some probabilistic choice
• And also consider external experimental control
– Inject as external state
– Use in rendezvous to select model result
– Open question how much power to expose
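The simple part of a schedule can be sketched as a list of deadlines, each widening the set of acceptable models. The function `choose_result`, the millisecond units, and the data shapes are assumptions for illustration; the probabilistic choice and external experimental control mentioned above are left out.

```python
def choose_result(answers, schedule, default):
    """Pick a score according to a deadline schedule.

    answers:  {model_name: (arrival_ms, score)}, arrival measured from request receipt
    schedule: list of (deadline_ms, [acceptable models, in preference order])
    default:  answer returned if no acceptable model met any deadline
    """
    for deadline_ms, acceptable in schedule:
        for name in acceptable:
            if name in answers and answers[name][0] <= deadline_ms:
                return name, answers[name][1]
    return "default", default

# Up to 50 ms only m1 is accepted; up to 80 ms m2 is accepted as well.
schedule = [(50, ["m1"]), (80, ["m1", "m2"])]

fast = choose_result({"m1": (30, 0.9), "m2": (20, 0.4)}, schedule, default=0.0)
slow_primary = choose_result({"m1": (90, 0.9), "m2": (60, 0.4)}, schedule, default=0.0)
nobody = choose_result({}, schedule, default=0.0)
```

This version evaluates the schedule after the fact from recorded arrival times; a live rendezvous server would make the same decision incrementally as scores arrive.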
© 2017 MapR Technologies 24
The rendezvous server is simpler
than it looks at first
© 2017 MapR Technologies 25
Model Life Cycle
• Developer / modeler produces container spec
– And uses this to build their development article
• QA inspects container spec
– And uses this to build a test article
• Security inspects container spec
– And uses this to build final artifact
• Important to use tools like Grafeas to inspect supply chain
http://bit.ly/grafeas
• Important that each step be inspectable
© 2017 MapR Technologies 26
Almost all of the framework scales by
trivial parallelism
© 2017 MapR Technologies 27
Scaling Up
• Note about streams
– At millions of updates per server, the streams aren’t part of the scaling
question
• Scaling up state injection
– Partition raw input, replicate state injector
– Beware external throughput limits
– State injection does avoid duplicate queries
• Scaling up models
– Stateless models allow trivial scaling
– Sequence state typically also trivial to scale
• Scaling up the rendezvous
– Match partition on raw and scores
– Replicate trivially
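"Match partition on raw and scores" can be illustrated by routing both streams with the same function of the message ID, so that each rendezvous replica sees the request and all of its scores. The helper names `partition_for` and `route` and the partition count are hypothetical, and the dictionaries stand in for stream partitions.

```python
import zlib

NUM_PARTITIONS = 4

def partition_for(message_id, num_partitions):
    # crc32 is stable across processes, unlike Python's built-in hash() for str.
    return zlib.crc32(message_id.encode("utf-8")) % num_partitions

def route(messages):
    """Group messages by partition using only the message ID."""
    parts = {p: [] for p in range(NUM_PARTITIONS)}
    for m in messages:
        parts[partition_for(m["messageId"], NUM_PARTITIONS)].append(m)
    return parts

raw = [{"messageId": f"req-{i}"} for i in range(8)]
scores = [{"messageId": m["messageId"], "model": name, "score": 0.5}
          for m in raw for name in ("m1", "m2")]

raw_parts, score_parts = route(raw), route(scores)
```

With matching partitions, adding rendezvous replicas needs no coordination between them: replica `p` reads partition `p` of both streams.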
© 2017 MapR Technologies 32
In-place update of the framework via
modified Chandy-Lamport
© 2017 MapR Technologies 33
Transition Message
[Diagram, shown in three build steps: Raw, Input, Features / profiles]
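The slides name the technique and show a transition message but give no further detail, so the following is only a guess at the general idea, borrowed from how Chandy-Lamport markers delimit a consistent cut. Everything here is an assumption: the marker format, the function `inject_state`, and the two profile tables. A marker placed in the raw stream tells the state injector to switch tables at exactly that point, and the marker is forwarded so downstream stages can switch at the same point.

```python
TRANSITION = "transition"  # hypothetical marker type

def inject_state(raw_stream, old_profiles, new_profiles):
    """Enrich raw messages with profile data, switching tables at the marker."""
    profiles = old_profiles
    out = []
    for msg in raw_stream:
        if msg.get("type") == TRANSITION:
            profiles = new_profiles  # switch at exactly this point in the stream
            out.append(msg)          # forward the marker for downstream stages
            continue
        out.append({**msg, "profile": profiles.get(msg["user"])})
    return out

old = {"u1": "v1-profile"}
new = {"u1": "v2-profile"}
raw = [{"user": "u1", "n": 1}, {"type": TRANSITION}, {"user": "u1", "n": 2}]

enriched = inject_state(raw, old, new)
```

Every message before the marker is processed with the old state and every message after it with the new state, with no pause in between.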
© 2017 MapR Technologies 36
Summary:
This is easy-ish
Well, it isn’t real hard
© 2017 MapR Technologies 39
Additional Resources
O’Reilly report by Ted Dunning & Ellen Friedman © March 2017
Read free courtesy of MapR:
https://mapr.com/geo-distribution-big-data-and-analytics/
O’Reilly book by Ted Dunning & Ellen Friedman © March 2016
Read free courtesy of MapR:
https://mapr.com/streaming-architecture-using-apache-kafka-mapr-streams/
© 2017 MapR Technologies 41
Additional Resources
O’Reilly book by Ted Dunning & Ellen Friedman © June 2014
Read free courtesy of MapR:
https://mapr.com/practical-machine-learning-new-look-anomaly-detection/
O’Reilly book by Ellen Friedman & Ted Dunning © February 2014
Read free courtesy of MapR:
https://mapr.com/practical-machine-learning/
© 2017 MapR Technologies 42
Additional Resources
by Ellen Friedman 8 Aug 2017 on MapR blog:
https://mapr.com/blog/tensorflow-mxnet-caffe-h2o-which-ml-best/
by Ted Dunning 13 Sept 2017 in InfoWorld:
https://www.infoworld.com/article/3223688/machine-learning/machine-learning-skills-for-software-engineers.html
© 2017 MapR Technologies 43
New book: Machine Learning Logistics
Model Management in the Real World
O’Reilly book by Ellen Friedman & Ted Dunning © Sept 2017
Download free from MapR
http://info.mapr.com/2017_Content_Machine-Learning-Logistics_eBook_Prereg_RegistrationPage.html
Going to Strata Data NYC? Book will be released 26 Sept 2017:
Visit MapR booth for free book signings or to talk about logistics
© 2017 MapR Technologies 44
Please support women in tech – help build
girls’ dreams of what they can accomplish
© Ellen Friedman 2015 #womenintech #datawomen
© 2017 MapR Technologies 45
Q&A
@mapr
tdunning@mapr.com
ENGAGE WITH US
@Ted_Dunning

 
MapR and Cisco Make IT Better
MapR and Cisco Make IT BetterMapR and Cisco Make IT Better
MapR and Cisco Make IT Better
 
Evolving from RDBMS to NoSQL + SQL
Evolving from RDBMS to NoSQL + SQLEvolving from RDBMS to NoSQL + SQL
Evolving from RDBMS to NoSQL + SQL
 
Open Source Innovations in the MapR Ecosystem Pack 2.0
Open Source Innovations in the MapR Ecosystem Pack 2.0Open Source Innovations in the MapR Ecosystem Pack 2.0
Open Source Innovations in the MapR Ecosystem Pack 2.0
 
How Spark is Enabling the New Wave of Converged Cloud Applications
How Spark is Enabling the New Wave of Converged Cloud Applications How Spark is Enabling the New Wave of Converged Cloud Applications
How Spark is Enabling the New Wave of Converged Cloud Applications
 
MapR 5.2: Getting More Value from the MapR Converged Data Platform
MapR 5.2: Getting More Value from the MapR Converged Data PlatformMapR 5.2: Getting More Value from the MapR Converged Data Platform
MapR 5.2: Getting More Value from the MapR Converged Data Platform
 
MapR on Azure: Getting Value from Big Data in the Cloud -
MapR on Azure: Getting Value from Big Data in the Cloud -MapR on Azure: Getting Value from Big Data in the Cloud -
MapR on Azure: Getting Value from Big Data in the Cloud -
 
Handling the Extremes: Scaling and Streaming in Finance
Handling the Extremes: Scaling and Streaming in FinanceHandling the Extremes: Scaling and Streaming in Finance
Handling the Extremes: Scaling and Streaming in Finance
 
Baptist Health: Solving Healthcare Problems with Big Data
Baptist Health: Solving Healthcare Problems with Big DataBaptist Health: Solving Healthcare Problems with Big Data
Baptist Health: Solving Healthcare Problems with Big Data
 
The Keys to Digital Transformation
The Keys to Digital TransformationThe Keys to Digital Transformation
The Keys to Digital Transformation
 
Insight Platforms Accelerate Digital Transformation
Insight Platforms Accelerate Digital TransformationInsight Platforms Accelerate Digital Transformation
Insight Platforms Accelerate Digital Transformation
 
Design Patterns for working with Fast Data
Design Patterns for working with Fast DataDesign Patterns for working with Fast Data
Design Patterns for working with Fast Data
 

Último

Advanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsAdvanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsVICTOR MAESTRE RAMIREZ
 
Networking Case Study prepared by teacher.pptx
Networking Case Study prepared by teacher.pptxNetworking Case Study prepared by teacher.pptx
Networking Case Study prepared by teacher.pptxHimangsuNath
 
convolutional neural network and its applications.pdf
convolutional neural network and its applications.pdfconvolutional neural network and its applications.pdf
convolutional neural network and its applications.pdfSubhamKumar3239
 
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdfEnglish-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdfblazblazml
 
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Thomas Poetter
 
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024Susanna-Assunta Sansone
 
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default  Presentation : Data Analysis Project PPTPredictive Analysis for Loan Default  Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPTBoston Institute of Analytics
 
modul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptxmodul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptxaleedritatuxx
 
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...Amil Baba Dawood bangali
 
The Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptx
The Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptxThe Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptx
The Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptxTasha Penwell
 
Cyber awareness ppt on the recorded data
Cyber awareness ppt on the recorded dataCyber awareness ppt on the recorded data
Cyber awareness ppt on the recorded dataTecnoIncentive
 
Semantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxSemantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxMike Bennett
 
Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Seán Kennedy
 
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...Boston Institute of Analytics
 
Principles and Practices of Data Visualization
Principles and Practices of Data VisualizationPrinciples and Practices of Data Visualization
Principles and Practices of Data VisualizationKianJazayeri1
 
Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 217djon017
 
Decoding Patterns: Customer Churn Prediction Data Analysis Project
Decoding Patterns: Customer Churn Prediction Data Analysis ProjectDecoding Patterns: Customer Churn Prediction Data Analysis Project
Decoding Patterns: Customer Churn Prediction Data Analysis ProjectBoston Institute of Analytics
 
INTRODUCTION TO Natural language processing
INTRODUCTION TO Natural language processingINTRODUCTION TO Natural language processing
INTRODUCTION TO Natural language processingsocarem879
 

Último (20)

Advanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsAdvanced Machine Learning for Business Professionals
Advanced Machine Learning for Business Professionals
 
Networking Case Study prepared by teacher.pptx
Networking Case Study prepared by teacher.pptxNetworking Case Study prepared by teacher.pptx
Networking Case Study prepared by teacher.pptx
 
convolutional neural network and its applications.pdf
convolutional neural network and its applications.pdfconvolutional neural network and its applications.pdf
convolutional neural network and its applications.pdf
 
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdfEnglish-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
 
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
 
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
 
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default  Presentation : Data Analysis Project PPTPredictive Analysis for Loan Default  Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
 
modul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptxmodul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptx
 
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
 
The Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptx
The Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptxThe Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptx
The Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptx
 
Cyber awareness ppt on the recorded data
Cyber awareness ppt on the recorded dataCyber awareness ppt on the recorded data
Cyber awareness ppt on the recorded data
 
Semantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxSemantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptx
 
Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...
 
Insurance Churn Prediction Data Analysis Project
Insurance Churn Prediction Data Analysis ProjectInsurance Churn Prediction Data Analysis Project
Insurance Churn Prediction Data Analysis Project
 
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
 
Principles and Practices of Data Visualization
Principles and Practices of Data VisualizationPrinciples and Practices of Data Visualization
Principles and Practices of Data Visualization
 
Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2
 
Decoding Patterns: Customer Churn Prediction Data Analysis Project
Decoding Patterns: Customer Churn Prediction Data Analysis ProjectDecoding Patterns: Customer Churn Prediction Data Analysis Project
Decoding Patterns: Customer Churn Prediction Data Analysis Project
 
INTRODUCTION TO Natural language processing
INTRODUCTION TO Natural language processingINTRODUCTION TO Natural language processing
INTRODUCTION TO Natural language processing
 
Data Analysis Project: Stroke Prediction
Data Analysis Project: Stroke PredictionData Analysis Project: Stroke Prediction
Data Analysis Project: Stroke Prediction
 

ML Workshop 1: A New Architecture for Machine Learning Logistics

  • 1. © 2017 MapR Technologies 1 Machine Learning Model Management The working of the rendezvous framework
  • 2. Contact Information
      Ted Dunning, PhD
      Chief Application Architect, MapR Technologies
      Committer, PMC member, board member, ASF
      O’Reilly author
      Email: tdunning@mapr.com, tdunning@apache.org
      Twitter: @Ted_Dunning
  • 3. Traditional View
  • 4. Traditional View: This isn’t the whole story
  • 5. 90% of the effort in successful machine learning isn’t in the training or model dev… It’s the logistics
  • 6. Rendezvous Architecture (diagram labels: request, Input, Model 1, Model 2, Model 3, Scores, Rendezvous, Results, response)
  • 7. What We Ultimately Want (diagram labels: request, Model, response)
  • 8. But This Isn’t The Answer (diagram labels: request, Load balancer, Model 1, Model 2, Model 3, response)
  • 9. First Try with Streams (diagram labels: request, Input, Model 1, Model 2, Model 3, ?, response)
  • 10. First Rendezvous (diagram labels: request, Input, Model 1, Model 2, Model 3, Scores, Rendezvous, Results, response)
  • 11. Some Key Points
      • Note that all models see identical inputs
      • All models run in production setting
      • All models send scores to same stream
      • The rendezvous server decides which scores to ignore
      • Roll forward, roll back, correlated comparison are all now trivial
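The key points on slide 11 can be illustrated with a minimal sketch. It assumes scores arrive tagged with a request id and a model name; the class name `Rendezvous`, its `preference` list, and the model names are hypothetical and not taken from the talk. Every model scores every request, and the rendezvous simply ignores all but the most preferred score, so rolling back amounts to editing the preference list.

```python
from collections import defaultdict


class Rendezvous:
    """Collects scores per request and keeps only the most preferred one."""

    def __init__(self, preference):
        # Model names in order of preference (hypothetical configuration).
        self.preference = list(preference)
        # Mailbox: request id -> {model name: score}.
        self.mailbox = defaultdict(dict)

    def on_score(self, message_id, model, score):
        # Every model writes to the same score stream; we file by request.
        self.mailbox[message_id][model] = score

    def result(self, message_id):
        # Return (model, score) for the most preferred model that answered,
        # ignoring everything else; None if no preferred model answered.
        scores = self.mailbox.pop(message_id, {})
        for model in self.preference:
            if model in scores:
                return model, scores[model]
        return None


rv = Rendezvous(["model-2", "model-1"])
for model, score in [("model-1", 0.40), ("model-2", 0.55), ("model-3", 0.90)]:
    rv.on_score("req-1", model, score)
first = rv.result("req-1")

# Roll back: stop preferring model-2 without redeploying anything.
rv.preference = ["model-1"]
rv.on_score("req-2", "model-1", 0.41)
rv.on_score("req-2", "model-2", 0.56)
second = rv.result("req-2")
```

Because all three models saw the same input, the ignored scores are still available for correlated comparison; only the choice of which one to return changes.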
  • 12. Reality Check, Injecting External State (diagram labels: The world, request, Raw, Add external data, Database, Input, Model 1, Model 2, Model 3)
  • 13. Recording Raw Data (as it really was) (diagram labels: Input, Decoy, Archive, Model 2, Model 3, Scores)
  • 14. Quality & Reproducibility of Input Data is Important!
      • Recording raw-ish data is really a big deal
          – Data as seen by a model is worth gold
          – Data reconstructed later often has time-machine leaks
          – Databases were made for updates, streams are safer
      • Raw data is useful for non-ML cases as well (think flexibility)
      • Decoy model records training data as seen by models under development & evaluation
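A decoy of the kind described on slides 13 and 14 could look roughly like the sketch below. It assumes requests are JSON-serializable dictionaries and that the archive is any writable text stream; `DecoyModel` and the field names in the sample requests are hypothetical.

```python
import io
import json


class DecoyModel:
    """Sits where a model would sit, but only records what it was shown."""

    def __init__(self, archive):
        # Any writable text file-like object (a file, a stream writer, ...).
        self.archive = archive

    def score(self, request):
        # Archive the input exactly as a real model would have seen it,
        # one JSON document per line, then emit no score at all.
        self.archive.write(json.dumps(request, sort_keys=True) + "\n")
        return None


archive = io.StringIO()
decoy = DecoyModel(archive)
decoy.score({"messageId": "a1", "features": {"amount": 12.5}})
decoy.score({"messageId": "a2", "features": {"amount": 99.0}})

# Later, training code replays the archive instead of re-querying a database,
# which is what avoids the time-machine leaks mentioned above.
replayed = [json.loads(line) for line in archive.getvalue().splitlines()]
```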
  • 15. Canary for Comparison (diagram labels: Input, Real model, Canary, ∆, Result, Decoy, Archive)
  • 16. What Does the Canary Do?
      • The canary is a real model, but is very rarely updated
      • The canary results are almost never used for decisioning
      • The virtue of the canary is stability
      • Comparing to the canary results gives insight into new models
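The ∆ in the canary diagram can be read as a per-request difference between a model's score and the canary's score on the same input. The sketch below is one way to compute it, assuming scores are keyed by request id; the function name `canary_deltas` and the sample values are illustrative only.

```python
from statistics import mean


def canary_deltas(canary_scores, model_scores):
    """Per-request difference (model minus canary) on requests both scored."""
    shared = canary_scores.keys() & model_scores.keys()
    return {mid: model_scores[mid] - canary_scores[mid] for mid in shared}


# Hypothetical scores keyed by request id.
canary = {"r1": 0.25, "r2": 0.5, "r3": 0.75}
candidate = {"r1": 0.5, "r2": 0.5, "r3": 0.5, "r4": 0.1}

deltas = canary_deltas(canary, candidate)
mean_abs_delta = mean(abs(d) for d in deltas.values())
```

Since the canary rarely changes, a shift in the distribution of these deltas over time points at the new model or at the input data rather than at the baseline.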
  • 17. Isolated Development With Stream Replication (diagram labels: Production, Development, The world, request, Raw, Add external data, New external data, Input, Internal 1, Internal 2, Internal 3, Internal 4, Model 1, Model 2, Model 3, Model 4)
  • 18. (diagram labels: Raw, Input, Features / profiles, m1, m2, m3, Decoy, Archive, Scores)
  • 19. (diagram labels: Raw, Input, Features / profiles, m1, m2, m3, Decoy, Archive, Scores, Rendezvous, Results)
  • 20. (diagram labels: Raw, Input, Features / profiles, m1, m2, m3, Decoy, Archive, Scores, Rendezvous, Results, Metrics, Metrics)
  • 21. Some Details
      • Inside the rendezvous server
          – Message contents … highlight return address
          – Rendezvous mailbox
          – Schedule ideas
      • Inside a model container
          – Identical inputs makes scaling easy
          – Nearly stateless models
          – Streaming shims, latency rig
  • 22. Message Content
      • Input request contains request data plus administrivia
          {
            timestamp: 1501020498314,
            messageId: "2a5f2b61fdd848d7954a51b49c2a9e2c",
            return: "proxy-217",
            provenance: { ... },
            diagnostics: { ... },
            ... application specific data here ..
          }
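As an illustration of the envelope on slide 22, the sketch below builds a request with the same administrative fields and round-trips it through JSON. The helper `make_request` and the `data` key used for the application payload are assumptions; the slide does not say how the application-specific part is named.

```python
import json
import time
import uuid


def make_request(payload, return_address):
    """Wrap application data in the administrative envelope from the slide."""
    return {
        "timestamp": int(time.time() * 1000),  # milliseconds since the epoch
        "messageId": uuid.uuid4().hex,         # 32 hex characters
        "return": return_address,              # where the response should go
        "provenance": {},                      # filled in by later stages
        "diagnostics": {},
        "data": payload,                       # hypothetical key name
    }


msg = make_request({"amount": 12.5}, "proxy-217")
wire = json.dumps(msg)
decoded = json.loads(wire)
```

Carrying the return address inside the message is what lets the rendezvous server route a result back without keeping a connection open per request.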
  • 23. Rendezvous Schedules
      • Simple part
          – Up to deadline, accept preferred models
          – Up to next deadline, accept more models
          – Near final deadline, accept default answer
      • But also some probabilistic choice
      • And also consider external experimental control
          – Inject as external state
          – Use in rendezvous to select model result
          – Open question how much power to expose
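The "simple part" of the schedule on slide 23 might be sketched as follows. This is only one reading of the slide: the schedule is assumed to be a list of `(deadline_ms, acceptable_models)` tiers that widen over time, the function name `choose` and the model names are hypothetical, and the probabilistic choice and external experimental control are left out.

```python
def choose(scores, elapsed_ms, schedule, default):
    """Pick a result given the scores received so far and the time elapsed.

    schedule: list of (deadline_ms, models in preference order), with later
    tiers accepting more models. Returns (model, score), ("default", default)
    once the final deadline has passed with nothing usable, or None to mean
    "keep waiting".
    """
    past_final = elapsed_ms > schedule[-1][0]
    acceptable = schedule[-1][1]  # the widest tier applies after the deadline
    for deadline_ms, models in schedule:
        if elapsed_ms <= deadline_ms:
            acceptable = models
            break
    for model in acceptable:
        if model in scores:
            return model, scores[model]
    return ("default", default) if past_final else None


# Hypothetical schedule: only model-2 for 50 ms, then model-1 is also fine.
schedule = [(50, ["model-2"]), (100, ["model-2", "model-1"])]

early = choose({"model-1": 0.4}, 20, schedule, 0.0)
later = choose({"model-1": 0.4}, 80, schedule, 0.0)
both = choose({"model-1": 0.4, "model-2": 0.6}, 80, schedule, 0.0)
timed_out = choose({}, 150, schedule, 0.0)
```

Returning `None` while a better model may still answer keeps the decision logic separate from whatever timer or polling loop drives it.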
  • 24. The rendezvous server is simpler than it looks at first
  • 25. Model Life Cycle
      • Developer / modeler produces container spec
          – And uses this to build their development article
      • QA inspects container spec
          – And uses this to build a test article
      • Security inspects container spec
          – And uses this to build final artifact
      • Important to use tools like Grafeas to inspect supply chain http://bit.ly/grafeas
      • Important that each step be inspectable
  • 26. Almost all of the framework scales by trivial parallelism
  • 27. Scaling Up
      • Note about streams
          – At millions of updates per server, the streams aren’t part of the streaming question
      • Scaling up state injection
          – Partition raw input, replicate state injector
          – Beware external throughput limits
          – State injection does avoid duplicate queries
      • Scaling up models
          – Stateless models allow trivial scaling
          – Sequence state typically also trivial to scale
      • Scaling up the rendezvous
          – Match partition on raw and scores
          – Replicate trivially
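"Match partition on raw and scores" can be read as: use the same key and the same partitioning function for both streams, so a request and all of its scores reach the same rendezvous replica. The sketch below assumes the key is the message id and uses a CRC32-based partitioner purely for illustration; the talk does not specify either choice.

```python
import zlib


def partition_for(message_id, num_partitions):
    """Stable partition number for a message id (same on every process)."""
    return zlib.crc32(message_id.encode("utf-8")) % num_partitions


NUM_PARTITIONS = 4  # hypothetical; must be equal for both streams
ids = [f"req-{i}" for i in range(1000)]

# The producer of the raw stream and the models writing scores each compute
# the partition independently, yet agree for every request.
raw_partitions = {m: partition_for(m, NUM_PARTITIONS) for m in ids}
score_partitions = {m: partition_for(m, NUM_PARTITIONS) for m in ids}
```

With that agreement in place, each rendezvous replica can own a subset of partitions and needs no coordination with the others.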
  • 28-31. (diagram, same labels on each slide: Raw, Input, Features / profiles, m1, m2, m3, Decoy, Archive, Scores, Rendezvous, Results, Metrics, Metrics)
  • 32. In-place update of the framework via modified Chandy-Lamport
  • 33. Transition Message (diagram labels: Raw, Features / profiles, Input)
  • 34. Transition Message (diagram labels: Raw, Features / profiles, Features / profiles, Input)
  • 35. Transition Message (diagram labels: Raw, Features / profiles, Features / profiles, Input)
  • 36-37. Summary: This is easy-ish
  • 38. Summary: This is easy-ish. Well, it isn’t real hard
  • 39. First Rendezvous (diagram labels: request, Input, Model 1, Model 2, Model 3, Scores, Rendezvous, Results, response)
  • 40. Additional Resources
      • O’Reilly report by Ted Dunning & Ellen Friedman, © March 2017. Read free courtesy of MapR: https://mapr.com/geo-distribution-big-data-and-analytics/
      • O’Reilly book by Ted Dunning & Ellen Friedman, © March 2016. Read free courtesy of MapR: https://mapr.com/streaming-architecture-using-apache-kafka-mapr-streams/
  • 41. Additional Resources
      • O’Reilly book by Ted Dunning & Ellen Friedman, © June 2014. Read free courtesy of MapR: https://mapr.com/practical-machine-learning-new-look-anomaly-detection/
      • O’Reilly book by Ellen Friedman & Ted Dunning, © February 2014. Read free courtesy of MapR: https://mapr.com/practical-machine-learning/
  • 42. Additional Resources
      • By Ellen Friedman, 8 Aug 2017, on MapR blog: https://mapr.com/blog/tensorflow-mxnet-caffe-h2o-which-ml-best/
      • By Ted Dunning, 13 Sept 2017, in InfoWorld: https://www.infoworld.com/article/3223688/machine-learning/machine-learning-skills-for-software-engineers.html
  • 43. New book: Machine Learning Logistics: Model Management in the Real World
      • O’Reilly book by Ellen Friedman & Ted Dunning, © Sept 2017
      • Download free from MapR: http://info.mapr.com/2017_Content_Machine-Learning-Logistics_eBook_Prereg_RegistrationPage.html
      • Going to Strata Data NYC? Book will be released 26 Sept 2017. Visit MapR booth for free book signings or to talk about logistics
  • 44. Please support women in tech – help build girls’ dreams of what they can accomplish. © Ellen Friedman 2015. #womenintech #datawomen
  • 45. Q&A. Engage with us: @mapr, @Ted_Dunning, tdunning@mapr.com