ML development brings many new complexities beyond the traditional software development lifecycle. Unlike traditional software, ML development requires trying multiple algorithms, tools, and parameters to get the best results, and developers need to track this information to reproduce their work. In addition, developers need to use many distinct systems to productionize models.
To address these challenges, Databricks last year unveiled MLflow, an open source project that aims to simplify the entire ML lifecycle. MLflow introduces simple abstractions to package reproducible projects, track results, and encapsulate models that can be used with many existing tools, accelerating the ML lifecycle for organizations of any size.
In the past year, the MLflow community has grown quickly: over 120 contributors from over 40 companies have contributed code to the project, and over 200 companies are using MLflow.
In this tutorial, we will show you how using MLflow can help you:
Keep track of experiment runs and results across frameworks.
Execute projects remotely on a Databricks cluster, and quickly reproduce your runs.
Quickly productionize models using Databricks production jobs, Docker containers, Azure ML, or Amazon SageMaker.
We will demo the building blocks of MLflow as well as the most recent additions since the 1.0 release.
What you will learn:
Understand the three main components of open source MLflow (MLflow Tracking, MLflow Projects, MLflow Models) and how each helps address challenges of the ML lifecycle.
How to use MLflow Tracking to record and query experiments: code, data, config, and results.
How to use the MLflow Projects packaging format to reproduce runs on any platform.
How to use the MLflow Models general format to send models to diverse deployment tools.
Prerequisites:
A fully-charged laptop (8-16GB memory) with Chrome or Firefox
Python 3 and pip pre-installed
Pre-Register for a Databricks Standard Trial
Basic knowledge of the Python programming language
Basic understanding of machine learning concepts
4. ML Lifecycle is Manual, Inconsistent and Disconnected
[Diagram: lifecycle stages Prep Data → Build Model → Deploy Model]
● Ad hoc approach to track experiments
● Very hard to reproduce experiments
● Multiple tightly coupled deployment options
● Different monitoring approach for each framework
● Low-level integrations for Data and ML
● Difficult to track data used for a model
6. What is MLflow?
Unveiled in June 2018, MLflow is an open source framework to manage the complete Machine Learning Lifecycle.
● Tracking: record and query experiments: code, data, config, results
● Projects: packaging format for reproducible runs on any platform
● Models: general model format that supports diverse deployment tools
11. Key Concepts in Tracking
• Parameters: key-value inputs to your code
• Metrics: numeric values (can update over time)
• Artifacts: arbitrary files, including models
• Source: what code ran?
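As a minimal sketch of how these concepts map to the tracking API (the parameter, metric values, and file name below are illustrative; the source is captured automatically from the code version and entry point):

import mlflow

# One tracking run: parameters, metrics, and artifacts are all recorded against it.
with mlflow.start_run():
    mlflow.log_param("alpha", 0.5)                      # parameter: key-value input
    for step, rmse in enumerate([0.82, 0.76, 0.71]):
        mlflow.log_metric("rmse", rmse, step=step)      # metric: numeric, can update over time
    with open("notes.txt", "w") as f:
        f.write("trained with alpha=0.5")               # illustrative artifact file
    mlflow.log_artifact("notes.txt")                    # artifact: arbitrary file, including models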
12. Experiment Tracking with Managed MLflow
Record runs and keep track of model parameters, results, code, and data from each experiment in one place.
Provides:
● Pre-configured MLflow tracking server
● Databricks Workspace & Notebooks UI integration
● S3, Azure Blob Storage, and Google Cloud Storage for artifact storage
● Experiment management via role-based Access Control Lists (ACLs)
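For example, a local script can log to the managed tracking server by pointing the client at Databricks. This sketch assumes the Databricks CLI has already been configured so credentials are available, and the experiment path is a hypothetical workspace path:

import mlflow

mlflow.set_tracking_uri("databricks")                        # use the Databricks-hosted server
mlflow.set_experiment("/Users/you@example.com/mlflow-demo")  # hypothetical experiment path

with mlflow.start_run():
    mlflow.log_param("alpha", 0.5)
    mlflow.log_metric("rmse", 0.71)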
16. Reproducible Projects with Managed MLflow
Build composable projects, capture dependencies and code history for reproducible results, and share projects with peers.
Provides:
● Support for Git, Conda, and other file storage systems
● Remote execution via the command line as a Databricks Job (sketched below)
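As a sketch, local and remote execution go through the same projects API. The URI below is MLflow's public example project; cluster-spec.json is a hypothetical file describing the target Databricks cluster:

import mlflow

# Run the project locally: MLflow clones the Git repo and sets up its Conda env.
mlflow.projects.run(
    uri="https://github.com/mlflow/mlflow-example",
    parameters={"alpha": 0.4},
)

# Run the same project remotely as a Databricks job by switching backends.
mlflow.projects.run(
    uri="https://github.com/mlflow/mlflow-example",
    parameters={"alpha": 0.4},
    backend="databricks",
    backend_config="cluster-spec.json",  # hypothetical cluster spec
)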
18. Model Format
[Diagram: runs and other sources produce an MLflow Model, which can contain multiple flavors (Flavor 1, Flavor 2); inference code, batch & stream scoring, and cloud serving tools each consume a flavor they understand.]
Simple model flavors usable by many tools
19. Example MLflow Model
my_model/
├── MLmodel
└── estimator/
    ├── saved_model.pb
    └── variables/
        ...

Contents of the MLmodel file:

run_id: 769915006efd4c4bbd662461
time_created: 2018-06-28T12:34
flavors:
  tensorflow:
    saved_model_dir: estimator
    signature_def_key: predict
  python_function:
    loader_module: mlflow.tensorflow

The tensorflow flavor is usable by tools that understand the TensorFlow model format; the python_function flavor is usable by any tool that can run Python (Docker, Spark, etc.).
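Because this model carries the python_function flavor, any Python tool can load and score it without knowing it is a TensorFlow model underneath. A minimal sketch (the runs:/ URI reuses the run id above, and the input columns are illustrative):

import mlflow.pyfunc
import pandas as pd

# Flavor-agnostic loading: pyfunc reads the loader_module from the MLmodel file.
model = mlflow.pyfunc.load_model("runs:/769915006efd4c4bbd662461/my_model")
predictions = model.predict(pd.DataFrame({"x": [1.0, 2.0]}))  # illustrative input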
20. Model Deployment with Managed MLflow
Quickly deploy models to any platform based on your needs, locally or in the cloud, from experimentation to production.
Supports:
● Databricks Jobs and Clusters for production model operations
● Batch inference on Databricks (Apache Spark), sketched below
● REST endpoints via Docker containers, Azure ML, or SageMaker
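For batch inference on Spark, a logged model can be wrapped as a UDF. The model URI, input path, and column names in this sketch are assumptions for illustration:

import mlflow.pyfunc
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Wrap the logged model as a Spark UDF and score a DataFrame in batch.
predict = mlflow.pyfunc.spark_udf(spark, "runs:/769915006efd4c4bbd662461/my_model")
df = spark.read.parquet("/data/to_score")  # hypothetical input data
scored = df.withColumn("prediction", predict("feature1", "feature2"))

# The same model can also be served as a local REST endpoint with the CLI:
#   mlflow models serve -m runs:/769915006efd4c4bbd662461/my_model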