ML development brings many new complexities beyond the traditional software development lifecycle. Unlike traditional software, ML development requires trying multiple algorithms, tools, and parameters to get the best results, and developers need to track this information to reproduce their work. In addition, developers need to use many distinct systems to productionize models.
To address these challenges, Databricks last year unveiled MLflow, an open source project that aims to simplify the entire ML lifecycle. MLflow introduces simple abstractions to package reproducible projects, track results, and encapsulate models that can be used with many existing tools, accelerating the ML lifecycle for organizations of any size.
In the past year, the MLflow community has grown quickly: over 120 contributors from over 40 companies have contributed code to the project, and over 200 companies are using MLflow.
In this tutorial, we will show you how using MLflow can help you:
Keep track of experiment runs and results across frameworks.
Execute projects remotely on a Databricks cluster, and quickly reproduce your runs.
Quickly productionize models using Databricks production jobs, Docker containers, Azure ML, or Amazon SageMaker.
We will demo the building blocks of MLflow as well as the most recent additions since the 1.0 release.
What you will learn:
Understand the three main components of open source MLflow (MLflow Tracking, MLflow Projects, MLflow Models) and how each helps address challenges of the ML lifecycle.
How to use MLflow Tracking to record and query experiments: code, data, config, and results.
How to use the MLflow Projects packaging format to reproduce runs on any platform.
How to use the MLflow Models general format to send models to diverse deployment tools.
Prerequisites:
A fully-charged laptop (8-16GB memory) with Chrome or Firefox
Python 3 and pip pre-installed
Pre-Register for a Databricks Standard Trial
Basic knowledge of the Python programming language
Basic understanding of machine learning concepts
4. ML Lifecycle is Manual, Inconsistent and Disconnected
[Diagram: lifecycle stages Prep Data → Build Model → Deploy Model]
● Ad hoc approach to track experiments
● Very hard to reproduce experiments
● Multiple tightly coupled deployment options
● Different monitoring approach for each framework
● Low-level integrations for Data and ML
● Difficult to track data used for a model
6. What is MLflow?
Unveiled in June 2018, MLflow is an open source framework to manage the complete Machine Learning Lifecycle.
● Tracking: record and query experiments: code, data, config, results
● Projects: packaging format for reproducible runs on any platform
● Models: general model format that supports diverse deployment tools
11. Key Concepts in Tracking
• Parameters: key-value inputs to your code
• Metrics: numeric values (can update over time)
• Artifacts: arbitrary files, including models
• Source: what code ran?
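As a minimal sketch of how these concepts map to the tracking API (the parameter, metric values, and file name below are illustrative; the source is captured automatically from the code version and entry point):

import mlflow

# One tracking run: parameters, metrics, and artifacts are all recorded against it.
with mlflow.start_run():
    mlflow.log_param("alpha", 0.5)                      # parameter: key-value input
    for step, rmse in enumerate([0.82, 0.76, 0.71]):
        mlflow.log_metric("rmse", rmse, step=step)      # metric: numeric, can update over time
    with open("notes.txt", "w") as f:
        f.write("trained with alpha=0.5")               # illustrative artifact file
    mlflow.log_artifact("notes.txt")                    # artifact: arbitrary file, including models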
12. Experiment Tracking with Managed MLflow
Record runs and keep track of model parameters, results, code, and data from each experiment in one place.
Provides:
● Pre-configured MLflow tracking server
● Databricks Workspace & Notebooks UI integration
● S3, Azure Blob Storage, and Google Cloud Storage for artifact storage
● Experiment management via role-based Access Control Lists (ACLs)
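For example, a local script can log to the managed tracking server by pointing the client at Databricks. This sketch assumes the Databricks CLI has already been configured so credentials are available, and the experiment path is a hypothetical workspace path:

import mlflow

mlflow.set_tracking_uri("databricks")                        # use the Databricks-hosted server
mlflow.set_experiment("/Users/you@example.com/mlflow-demo")  # hypothetical experiment path

with mlflow.start_run():
    mlflow.log_param("alpha", 0.5)
    mlflow.log_metric("rmse", 0.71)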
16. Reproducible Projects with Managed MLflow
Build composable projects, capture dependencies and code history for reproducible results, and share projects with peers.
Provides:
● Support for Git, Conda, and other file storage systems
● Remote execution via the command line as a Databricks Job (sketched below)
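As a sketch, local and remote execution go through the same projects API. The URI below is MLflow's public example project; cluster-spec.json is a hypothetical file describing the target Databricks cluster:

import mlflow

# Run the project locally: MLflow clones the Git repo and sets up its Conda env.
mlflow.projects.run(
    uri="https://github.com/mlflow/mlflow-example",
    parameters={"alpha": 0.4},
)

# Run the same project remotely as a Databricks job by switching backends.
mlflow.projects.run(
    uri="https://github.com/mlflow/mlflow-example",
    parameters={"alpha": 0.4},
    backend="databricks",
    backend_config="cluster-spec.json",  # hypothetical cluster spec
)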
18. Model Format
[Diagram: runs and other sources produce an MLflow Model, which can contain multiple flavors (Flavor 1, Flavor 2); inference code, batch & stream scoring, and cloud serving tools each consume a flavor they understand.]
Simple model flavors usable by many tools
19. Example MLflow Model
my_model/
├── MLmodel
└── estimator/
    ├── saved_model.pb
    └── variables/
        ...

Contents of the MLmodel file:

run_id: 769915006efd4c4bbd662461
time_created: 2018-06-28T12:34
flavors:
  tensorflow:
    saved_model_dir: estimator
    signature_def_key: predict
  python_function:
    loader_module: mlflow.tensorflow

The tensorflow flavor is usable by tools that understand the TensorFlow model format; the python_function flavor is usable by any tool that can run Python (Docker, Spark, etc.).
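Because this model carries the python_function flavor, any Python tool can load and score it without knowing it is a TensorFlow model underneath. A minimal sketch (the runs:/ URI reuses the run id above, and the input columns are illustrative):

import mlflow.pyfunc
import pandas as pd

# Flavor-agnostic loading: pyfunc reads the loader_module from the MLmodel file.
model = mlflow.pyfunc.load_model("runs:/769915006efd4c4bbd662461/my_model")
predictions = model.predict(pd.DataFrame({"x": [1.0, 2.0]}))  # illustrative input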
20. Model Deployment with Managed MLflow
Quickly deploy models to any platform based on your needs, locally or in the cloud, from experimentation to production.
Supports:
● Databricks Jobs and Clusters for production model operations
● Batch inference on Databricks (Apache Spark), sketched below
● REST endpoints via Docker containers, Azure ML, or SageMaker
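For batch inference on Spark, a logged model can be wrapped as a UDF. The model URI, input path, and column names in this sketch are assumptions for illustration:

import mlflow.pyfunc
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Wrap the logged model as a Spark UDF and score a DataFrame in batch.
predict = mlflow.pyfunc.spark_udf(spark, "runs:/769915006efd4c4bbd662461/my_model")
df = spark.read.parquet("/data/to_score")  # hypothetical input data
scored = df.withColumn("prediction", predict("feature1", "feature2"))

# The same model can also be served as a local REST endpoint with the CLI:
#   mlflow models serve -m runs:/769915006efd4c4bbd662461/my_model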