A Platform for the Complete Machine Learning Lifecycle
Corey Zumar
June 24th, 2019
Watch the video with slide synchronization on InfoQ.com!
https://www.infoq.com/presentations/mlflow-databricks/
Presented at QCon New York
www.qconnewyork.com
2
Outline
• Overview of ML development challenges
• How MLflow tackles these challenges
• MLflow components
• Demo
• How to get started
3
Machine Learning Development is Complex
4
ML Lifecycle
[Diagram: a cycle through Raw Data, Data Prep, Training, and Deploy, annotated with Tuning (μ, λ, θ), Scale, Model Exchange, Governance, and Delta.]
5
Custom ML Platforms
Facebook FBLearner, Uber Michelangelo, Google TFX
+ Standardize the data prep / training / deploy loop: if you work with the platform, you get these!
– Limited to a few algorithms or frameworks
– Tied to one company’s infrastructure
Can we provide similar benefits in an open manner?
6
Introducing MLflow
Open machine learning platform
• Works with any ML library & language
• Runs the same way anywhere (e.g. any cloud)
• Designed to be useful for 1 or 1000+ person orgs
7
MLflow Components
Tracking: Record and query experiments: code, configs, results, etc.
Projects: Packaging format for reproducible runs on any platform
Models: General model format that supports diverse deployment tools
8
Key Concepts in Tracking
Parameters: key-value inputs to your code
Metrics: numeric values (can update over time)
Source: training code that ran
Version: version of the training code
Artifacts: files, including data and models
Tags and Notes: any additional information
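To make these concepts concrete, here is a minimal sketch of the record a single run could carry. The `RunRecord` class and its method names are hypothetical and exist only for illustration; this is not how MLflow stores runs internally.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class RunRecord:
    """Hypothetical in-memory record of one training run (illustration only)."""

    source: str   # training code that ran, e.g. a script name
    version: str  # version of that code, e.g. a commit hash
    params: Dict[str, str] = field(default_factory=dict)
    metrics: Dict[str, List[Tuple[int, float]]] = field(default_factory=dict)
    artifacts: List[str] = field(default_factory=list)
    tags: Dict[str, str] = field(default_factory=dict)

    def log_param(self, key: str, value) -> None:
        # Parameters are key-value inputs; stored as strings.
        self.params[key] = str(value)

    def log_metric(self, key: str, value: float, step: int) -> None:
        # Metrics are numeric and can update over time, so keep the full history.
        self.metrics.setdefault(key, []).append((step, float(value)))

    def latest_metric(self, key: str) -> float:
        # The most recent value is the one logged at the highest step.
        return max(self.metrics[key])[1]


run = RunRecord(source="train.py", version="abc123")
run.log_param("alpha", 0.5)
for step, mse in enumerate([0.9, 0.4, 0.2]):
    run.log_metric("mse", mse, step)
run.artifacts.append("model.pkl")
run.tags["note"] = "baseline"
print(run.params, run.latest_metric("mse"))
```

Keeping a history per metric, rather than a single value, is what lets a UI plot a training curve for each run.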
9
MLflow Tracking
Tracking Server: UI, API
Tracking APIs (REST, Python, Java, R)
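Because the tracking server speaks REST, any language that can send HTTP can log to it. The sketch below builds, but does not send, a request that logs one metric value. The server address and run ID are placeholders, and the endpoint path and field names follow the MLflow REST API documentation as best understood here; check them against the version you run.

```python
import json
import time
import urllib.request


def build_log_metric_request(base_url, run_id, key, value, step=0):
    """Build (without sending) a POST request that logs one metric value."""
    payload = {
        "run_id": run_id,
        "key": key,
        "value": value,
        "timestamp": int(time.time() * 1000),  # milliseconds since the epoch
        "step": step,
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/2.0/mlflow/runs/log-metric",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical server address and run ID, for illustration only.
request = build_log_metric_request(
    "http://localhost:5000", "<run_id>", "mse", 0.25, step=3
)
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would send it to a running server; the Python, Java, and R clients wrap calls of this kind.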
10
MLflow Tracking
Record and query experiments: code, configs, results, etc.

import mlflow

with mlflow.start_run():
    mlflow.log_param("layers", layers)
    mlflow.log_param("alpha", alpha)
    # train model
    mlflow.log_metric("mse", model.mse())
    # log_artifact takes the local file path first; this assumes model.plot() returns one
    mlflow.log_artifact(model.plot(test_df), "plot")
    # arguments abbreviated, as on the slide
    mlflow.tensorflow.log_model(model)
11
Demo
Goal: Classify hand-drawn digits
1. Instrument Keras training code with MLflow tracking APIs
2. Run training code as an MLflow Project
3. Deploy an MLflow Model for real-time serving
12
MLflow backend stores
1. Entity (Metadata) Store
• FileStore (local filesystem)
• SQLStore (via SQLAlchemy)
• REST Store
2. Artifact Store
• S3 backed store
• Azure Blob storage
• Google Cloud storage
• DBFS artifact repo
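In practice the store is chosen by the scheme of a URI. The sketch below maps URI schemes to the store types listed above; the lookup tables and the `pick_store` helper are hypothetical and only illustrate the idea, they are not MLflow's own resolution code.

```python
from urllib.parse import urlparse

# Scheme-to-store tables mirroring the two lists above (illustrative, not exhaustive).
ENTITY_STORES = {
    "": "FileStore",
    "file": "FileStore",
    "sqlite": "SQLStore",
    "postgresql": "SQLStore",
    "mysql": "SQLStore",
    "http": "REST Store",
    "https": "REST Store",
}
ARTIFACT_STORES = {
    "s3": "S3 backed store",
    "wasbs": "Azure Blob storage",
    "gs": "Google Cloud storage",
    "dbfs": "DBFS artifact repo",
}


def pick_store(uri, table):
    """Return the store type registered for the URI's scheme."""
    scheme = urlparse(uri).scheme
    # SQLAlchemy URLs may name a driver, e.g. "postgresql+psycopg2"; keep the dialect.
    dialect = scheme.split("+")[0]
    try:
        return table[dialect]
    except KeyError:
        raise ValueError(f"no store registered for scheme {scheme!r}") from None


print(pick_store("./mlruns", ENTITY_STORES))
print(pick_store("sqlite:///mlflow.db", ENTITY_STORES))
print(pick_store("s3://my-bucket/artifacts", ARTIFACT_STORES))
```

Separating the two stores means small metadata can live in a database while large files go to object storage.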
14
MLflow Projects Motivation
Diverse set of training tools
Diverse set of environments
Challenge: ML results are difficult to reproduce.
15
MLflow Projects
Project Spec: Code, Dependencies, Data, Config
Local Execution
Remote Execution
16
MLflow Projects
Packaging format for reproducible ML runs
• Any code folder or GitHub repository
• Optional MLproject file with project configuration
Defines dependencies for reproducibility
• Conda (+ R, Docker, …) dependencies can be specified in MLproject
• Reproducible in (almost) any environment
Execution API for running projects
• CLI / Python / R / Java
• Supports local and remote execution
17
Example MLflow Project

my_project/
├── MLproject
├── conda.yaml
├── main.py
└── model.py
    ...

MLproject:
conda_env: conda.yaml
entry_points:
  main:
    parameters:
      training_data: path
      lambda: {type: float, default: 0.1}
    command: python main.py {training_data} {lambda}

$ mlflow run git://<my_project>
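The entry point's `command` is a template whose placeholders are filled from the declared parameters. A minimal sketch of that substitution follows; `build_command` is a hypothetical helper, and MLflow's real handling does more (for example type checking and path resolution).

```python
def build_command(template, declared, supplied):
    """Fill a command template from declared parameters and user-supplied values."""
    values = {}
    for name, spec in declared.items():
        # A bare type name (e.g. "path") means: required, no default.
        default = spec.get("default") if isinstance(spec, dict) else None
        if name in supplied:
            values[name] = supplied[name]
        elif default is not None:
            values[name] = default
        else:
            raise ValueError(f"missing required parameter: {name}")
    return template.format_map(values)


# Parameters and command copied from the MLproject example above.
declared = {"training_data": "path", "lambda": {"type": "float", "default": 0.1}}
template = "python main.py {training_data} {lambda}"

print(build_command(template, declared, {"training_data": "data/train.csv"}))
```

On the command line, parameter values are supplied with `-P name=value`, for example `mlflow run <uri> -P lambda=0.5`.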
20
MLflow Models Motivation
ML Frameworks
Inference Code
Batch & Stream Scoring
Serving Tools
21
MLflow Models
Standard for ML models
Model Format: Flavor 1, Flavor 2
ML Frameworks
Inference Code
Batch & Stream Scoring
Serving Tools
22
MLflow Models
Packaging format for ML Models
• Any directory with MLmodel file
Defines dependencies for reproducibility
• Conda environment can be specified in MLmodel configuration
Model creation utilities
• Save models from any framework in MLflow format
Deployment APIs
• CLI / Python / R / Java
23
Example MLflow Model

my_model/
├── MLmodel
└── estimator/
    ├── saved_model.pb
    └── variables/
    ...

MLmodel:
run_id: 769915006efd4c4bbd662461
time_created: 2018-06-28T12:34
flavors:
  tensorflow:  # usable with TensorFlow tools / APIs
    saved_model_dir: estimator
    signature_def_key: predict
  python_function:  # usable with any Python tool
    loader_module: mlflow.tensorflow

mlflow.tensorflow.log_model(...)
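A tool that consumes a model only needs to find one flavor it understands. The sketch below shows that lookup over the MLmodel contents above, written out as a plain dict so no YAML parser is needed; `choose_flavor` is a hypothetical helper, not part of MLflow.

```python
# The MLmodel example, as a plain dict (values copied from the slide).
MLMODEL = {
    "run_id": "769915006efd4c4bbd662461",
    "time_created": "2018-06-28T12:34",
    "flavors": {
        "tensorflow": {"saved_model_dir": "estimator", "signature_def_key": "predict"},
        "python_function": {"loader_module": "mlflow.tensorflow"},
    },
}


def choose_flavor(mlmodel, supported):
    """Return (name, config) of the first flavor in `supported` that the model offers."""
    flavors = mlmodel["flavors"]
    for name in supported:
        if name in flavors:
            return name, flavors[name]
    raise ValueError(f"model offers {sorted(flavors)}, tool supports {list(supported)}")


# A TensorFlow-aware tool prefers the native flavor;
# a generic tool only knows python_function.
print(choose_flavor(MLMODEL, ["tensorflow", "python_function"])[0])
print(choose_flavor(MLMODEL, ["python_function"])[0])
```

Listing several flavors in one file is what lets the same saved model serve both framework-specific and generic tools.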
24
Model Flavors Example
Train a model, then save it in the model format: mlflow.keras.log_model()

Flavor 1: Pyfunc
model = mlflow.pyfunc.load_pyfunc(…)
model.predict(input_dataframe)

Flavor 2: Keras
model = mlflow.keras.load_model(…)
model.predict(keras.Input(…))
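The generic flavor can be thought of as an adapter that gives every framework the same `predict` call. The sketch below illustrates that idea with stand-in classes; `FakeKerasModel` and `GenericFunctionWrapper` are hypothetical, and rows are plain dicts standing in for a DataFrame.

```python
class FakeKerasModel:
    """Hypothetical stand-in for a framework-native model with its own input format."""

    def predict(self, batch):
        # Native API: expects a list of feature vectors, returns one score per row.
        return [sum(row) for row in batch]


class GenericFunctionWrapper:
    """Hypothetical generic flavor: the same predict() signature for every framework."""

    def __init__(self, native_model, columns):
        self._model = native_model
        self._columns = columns

    def predict(self, records):
        # Generic API: rows as dicts; convert them to the native input format.
        batch = [[row[c] for c in self._columns] for row in records]
        return self._model.predict(batch)


native = FakeKerasModel()
generic = GenericFunctionWrapper(native, columns=["x1", "x2"])

# Same model, two ways in: framework-native input versus generic tabular input.
print(native.predict([[1.0, 2.0]]))
print(generic.predict([{"x1": 1.0, "x2": 2.0}]))
```

Deployment tools written against the generic interface then work with any framework that ships such an adapter.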
27
1.0 Release
MLflow 1.0 was released recently! Major features:
• New metrics UI
• “Step” axis for metrics
• Improved search capabilities
• Package MLflow Models as Docker containers
• Support for ONNX models
28
Ongoing MLflow Roadmap
• New component: Model Registry for model management
• Multi-step project workflows
• Fluent Tracking API for Java and Scala
• Packaging projects with build steps
• Better environment isolation when loading models
• Improved model input/output schemas
29
Get started with MLflow
pip install mlflow to get started
Find docs & examples at mlflow.org
tinyurl.com/mlflow-slack
30
Thank you!
MLflow: An Open Platform to Simplify the Machine Learning Lifecycle
