2. ML is Transforming All Major Industries
Healthcare
Logistics
Telecom
Government
Banking
High Tech
Oil & Gas
Agriculture
Retail
Travel
3. But ML is Different from Traditional Software
Traditional Software
Goal: meet a functional specification
Quality depends only on application code
Pick one software stack
Machine Learning
Goal: optimize a metric (e.g. prediction accuracy)
Quality depends on training data and tuning parameters
Constantly evaluate and combine new libraries for the same task
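The contrast above can be made concrete with a small sketch. It assumes a toy threshold classifier and made-up data (the `samples` list and the candidate thresholds are illustrative, not from the webinar). The code never changes, yet the metric moves with the tuning parameter:

```python
# Illustrative only: a toy threshold classifier on made-up data.
def accuracy(threshold, samples):
    """Fraction of (value, label) pairs where (value >= threshold) matches the label."""
    correct = sum((value >= threshold) == label for value, label in samples)
    return correct / len(samples)

# Hypothetical training data: (feature value, true label).
samples = [(0.1, False), (0.3, False), (0.45, True),
           (0.6, True), (0.8, True), (0.9, True)]

# Same application code, different tuning parameter, different quality.
scores = {t: accuracy(t, samples) for t in (0.2, 0.4, 0.7)}
best_threshold = max(scores, key=scores.get)
```

Changing the rows in `samples` shifts the scores in the same way, which is the sense in which quality depends on data and parameters rather than on code alone.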
4. So Operating ML is Complex!
▪ Many teams and systems involved
▪ Constantly update data & metrics
▪ Hard to move from development to production environments
[Diagram: pipeline stages (Raw Data, Data Prep, Training, Deployment) and the roles involved (Data Engineer, ML Engineer, Application Developer)]
ML teams often spend >50% of time maintaining existing models
6. Response: ML Platforms
Software to manage the ML development and operations process, from data to experimentation to production
Examples: Google TFX, Facebook FBLearner, Uber Michelangelo, MLflow
Typical functionality:
▪ Data management
▪ Experiment management
▪ Model management
▪ Deployment for inference
▪ Reproducibility
▪ Testing & monitoring
All through a consistent interface!
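As a rough illustration of the "experiment management" item, here is a minimal in-memory sketch. `Run` and `ExperimentTracker` are hypothetical names invented for this example, not the API of MLflow or any other platform listed above, and the logged numbers are made up:

```python
import time
from dataclasses import dataclass, field


@dataclass
class Run:
    """One experiment run: parameters in, metrics out."""
    params: dict
    metrics: dict = field(default_factory=dict)
    started_at: float = field(default_factory=time.time)


class ExperimentTracker:
    """Hypothetical in-memory tracker; a real platform would persist runs."""

    def __init__(self):
        self.runs = []

    def log_run(self, params, metrics):
        # Copy the dicts so later caller-side mutation cannot alter the record.
        run = Run(params=dict(params), metrics=dict(metrics))
        self.runs.append(run)
        return run

    def best_run(self, metric):
        """Return the run with the highest value for the given metric."""
        return max(self.runs, key=lambda r: r.metrics[metric])


tracker = ExperimentTracker()
# Made-up parameter and metric values, for illustration only.
tracker.log_run({"learning_rate": 0.1}, {"accuracy": 0.81})
tracker.log_run({"learning_rate": 0.01}, {"accuracy": 0.87})
best = tracker.best_run("accuracy")
```

Routing every run through one interface like this is what makes runs comparable later, whichever library produced the model.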
8. Desirable Features for an ML Platform
1. Ease of adoption by data scientists, engineers, and model users
▪ How much work does it take to use? What ML libraries are supported? Etc.
2. Integration with data infrastructure to support data versioning, monitoring, and governance across data pipeline & ML steps
3. Collaboration functions to enable sharing code, data, features, experiments and models in a central place (securely!)
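One way to picture the data versioning mentioned in the list above is a content hash stored alongside each model. This is a sketch under stated assumptions: `dataset_version`, the sample rows, and `model_record` are hypothetical and do not describe how any particular platform implements versioning:

```python
import hashlib
import json


def dataset_version(rows):
    """Return a short content hash that changes whenever the data changes."""
    # sort_keys gives a stable serialization regardless of dict key order.
    canonical = json.dumps(rows, sort_keys=True).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()[:12]


# Made-up rows standing in for two snapshots of a training table.
rows_v1 = [{"id": 1, "amount": 10.0}, {"id": 2, "amount": 12.5}]
rows_v2 = rows_v1 + [{"id": 3, "amount": 7.25}]

# Recording the data version with the model ties each model to its inputs.
model_record = {"model": "example-model", "data_version": dataset_version(rows_v1)}
```

If the table changes, the hash no longer matches the one in `model_record`, which signals that the model was trained on older data.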
11. Our MLOps Approach in Databricks
▪ Every org’s requirements will be different, and will change over time
▪ Provide a general platform that is easy to integrate with diverse tools
Open source machine learning platform
Transactional, versioned data lake storage
Data science & ML workspace
12. In This Webinar
▪ How we and other organizations perform MLOps at scale
▪ Demos and experience from two customers
▪ Live Q&A with presenters