Building an ML Platform with Ray and MLflow
Amog Kamsetty and Archit Kulkarni
Ray Team @ Anyscale
The Team
Archit Kulkarni Amog Kamsetty Dmitri Gekhtman Edward Oakes
Richard Liaw Kai Fricke Simon Mo
Kathryn Zhou
Overview of Talk
▪ What are ML Platforms?
▪ Ray and its libraries
▪ MLflow
▪ Demo: An ML Platform built with MLflow and Ray
What are ML Platforms?
Typical ML Process
(Figure: typical ML process; surviving labels: "Fuzzy search!", "NLP, DL …")
Typical ML Process -- Simplified
Execution
- Feature engineering
- Training
  - Including tuning
- Serving
  - Offline scoring, inference
  - Online serving
Management
- Tracking
  - Data, Code, Configurations
- Reproducing Results
- Deployment
  - Deploy in a variety of environments
Challenges with the ML Process
Data
• Data Preparation
• Data Analysis
• Feature Engineering
• Data Pipeline
• Data Management/Feature Store
• Manages big data clusters
Model
• ML Expertise
• Implement SOTA ML Research
• Experimentation
• Manage GPU infrastructure
• Scalable training & hyperparameter tuning
Production
• A/B Testing
• Model Evaluation
• Analysis of Predictions
• Deploy in a variety of environments
• CI/CD
• Highly Available prediction service
Data/Research Scientist
Software/Data/ML Engineer
ML Platform Abstraction
ML Platforms -- Scale
- LinkedIn:
  - 500+ “AI engineers” building models; 50+ ML platform engineers
  - > 50% of offline compute demand (12K servers, each with 256 GB RAM)
  - More than 2x a year
- Uber Michelangelo, Airbnb Bighead, Facebook FBLearner, etc.
- Globally, a few billion dollars now, growing 40%+ YoY
- Many companies are building ML platforms from the ground up
ML Platforms -- Landscape
(Source: Intel Capital)
Typical ML Process -- Simplified
Execution
- Feature engineering 🔪
- Training 🍳
  - Including tuning 🧂
- Serving 🍽
  - Offline scoring, inference
  - Online serving
Management
- Tracking 📝
  - Data, Code, Configurations
- Reproducing Results 📖
- Deployment 🚚 💻
  - Variety of environments
Ray and its Libraries
What is Ray?
• A simple/general library for distributed computing
• Single machine or 100s of nodes
• Agnostic to the type of work
• An ecosystem of libraries (for scaling ML and more)
• Native: Ray RLlib, Ray Tune, Ray Serve
• Third party: Modin, Dask, Horovod, XGBoost, PyTorch Lightning
• Tools for launching clusters on any cloud provider
Three key ideas
Execute remote functions as tasks, and
instantiate remote classes as actors
• Support both stateful and stateless computations
Asynchronous execution using futures
• Enable parallelism
Distributed (immutable) object store
• Efficient communication (send arguments by reference)
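The futures idea can be approximated on a single machine with the Python standard library. The sketch below is an analogy only, not Ray itself: `concurrent.futures` stands in for Ray tasks, and the in-memory `FILES` dict and the function names are hypothetical placeholders chosen for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory "files" so the sketch is self-contained.
FILES = {"/input1": [1, 2, 3], "/input2": [10, 20, 30]}

def read_array(path):
    """Stateless task: return the array stored under `path`."""
    return FILES[path]

def add(a, b):
    """Stateless task: element-wise sum of two equal-length lists."""
    return [x + y for x, y in zip(a, b)]

def run_pipeline():
    with ThreadPoolExecutor(max_workers=2) as pool:
        # submit() returns a future immediately, so the two reads can overlap.
        f1 = pool.submit(read_array, "/input1")
        f2 = pool.submit(read_array, "/input2")
        # Ray resolves futures passed as task arguments; the standard library
        # does not, so the results are fetched here before submitting add().
        f3 = pool.submit(add, f1.result(), f2.result())
        return f3.result()

result = run_pipeline()
print(result)
```

The main difference from Ray is scope: this pool lives in one process, whereas Ray schedules tasks across a cluster and keeps results in its distributed object store.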
Ray API
API
Functions -> Tasks
import numpy as np

def read_array(file):
    # Read array "a" from "file" (assumes a NumPy .npy file).
    a = np.load(file)
    return a

def add(a, b):
    return np.add(a, b)
API
Functions -> Tasks
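import numpy as np
import ray

@ray.remote
def read_array(file):
    # Read array "a" from "file" (assumes a NumPy .npy file).
    a = np.load(file)
    return a

@ray.remote
def add(a, b):
    return np.add(a, b)

id1 = read_array.remote("/input1")  # returns a future immediately
id2 = read_array.remote("/input2")
id3 = add.remote(id1, id2)          # futures are passed as arguments
ray.get(id3)                        # blocks until the result is ready

(Task graph: read_array -> id1, read_array -> id2; add(id1, id2) -> id3)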
API
Classes -> Actors

import ray

@ray.remote
class Counter(object):
    def __init__(self):
        self.value = 0

    def inc(self):
        self.value += 1
        return self.value

c = Counter.remote()   # start the actor
id4 = c.inc.remote()   # method calls also return futures
id5 = c.inc.remote()
ray.get([id4, id5])
API
Functions -> Tasks

import numpy as np
import ray

@ray.remote
def read_array(file):
    # Read array "a" from "file" (assumes a NumPy .npy file).
    a = np.load(file)
    return a

@ray.remote(num_gpus=1)  # ask Ray to reserve one GPU for each call
def add(a, b):
    return np.add(a, b)

id1 = read_array.remote("/input1")
id2 = read_array.remote("/input2")
id3 = add.remote(id1, id2)

Classes -> Actors

@ray.remote(num_gpus=1)  # ask Ray to reserve one GPU for the actor
class Counter(object):
    def __init__(self):
        self.value = 0

    def inc(self):
        self.value += 1
        return self.value

c = Counter.remote()
id4 = c.inc.remote()
id5 = c.inc.remote()
ray.get([id4, id5])
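To make the actor idea concrete without a Ray cluster, here is a minimal standard-library sketch. It is an analogy under stated assumptions, not how Ray implements actors: the hypothetical `CounterActor` class owns its state on one thread and handles method calls one at a time, returning a future for each call.

```python
import queue
import threading
from concurrent.futures import Future

class CounterActor:
    """Stateful worker: one thread owns `value` and handles calls one at a time."""

    def __init__(self):
        self.value = 0
        self._mailbox = queue.Queue()
        self._thread = threading.Thread(target=self._loop, daemon=True)
        self._thread.start()

    def _loop(self):
        # Process requests serially, in arrival order, until a None sentinel arrives.
        while True:
            fut = self._mailbox.get()
            if fut is None:
                break
            self.value += 1
            fut.set_result(self.value)

    def inc(self):
        """Enqueue an increment and return a future for the new value."""
        fut = Future()
        self._mailbox.put(fut)
        return fut

    def stop(self):
        """Ask the worker thread to exit and wait for it."""
        self._mailbox.put(None)
        self._thread.join()

c = CounterActor()
f4 = c.inc()
f5 = c.inc()
values = [f4.result(), f5.result()]
c.stop()
print(values)
```

Because a single thread drains the mailbox in order, the two increments cannot race, which is the property that makes stateful computation safe in the actor model.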
Ray Ecosystem
(Diagram: Ray, a universal framework for distributed computing, and its ecosystem: native libraries, 3rd party libraries, and "Your app here!")
Ray Tune
Ray Tune: Scalable Hyperparameter Tuning
• Wide variety of algorithms: HyperBand, PBT, Bayesian optimization
• Compatible with ML frameworks
Ray Tune focuses on simplifying execution
• Easily launch distributed multi-GPU tuning jobs
• Automatic fault tolerance to save 3x on GPU costs
$ ray up {cluster config}
ray.init(address="auto")
tune.run(func, num_samples=100)
Ray Tune interoperates with other HPO libraries
• Ax
• Optuna
• scikit-optimize
• …
# ConvNet and steps are placeholders defined elsewhere.
def train_model(config):
    model = ConvNet(config)
    for i in range(steps):
        current_loss = model.train()

from ray import tune

def train_model(config):
    model = ConvNet(config)
    for i in range(steps):
        current_loss = model.train()
        tune.report(loss=current_loss)  # report the metric to Tune each step
def train_model(config):
    model = ConvNet(config)
    for i in range(epochs):
        current_loss = model.train()
        tune.report(loss=current_loss)

# Run one trial with a fixed learning rate.
tune.run(train_model, config={"lr": 0.1})

# Sample 100 learning rates uniformly from [0.001, 0.1].
tune.run(
    train_model,
    config={"lr": tune.uniform(0.001, 0.1)},
    num_samples=100,
)
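Conceptually, a search space plus `num_samples` amounts to random search. The sketch below shows that idea in plain Python as a simplified illustration, not Ray Tune's implementation; the toy quadratic loss and the `random_search` helper are assumptions made up for the example.

```python
import random

def train_model(config):
    """Toy stand-in for training: loss is smallest when lr is near 0.05."""
    return (config["lr"] - 0.05) ** 2

def random_search(num_samples, low, high, seed=0):
    rng = random.Random(seed)
    trials = []
    for _ in range(num_samples):
        config = {"lr": rng.uniform(low, high)}  # one draw from the search space
        trials.append((train_model(config), config))
    # The best trial is the one with the lowest final loss.
    return min(trials, key=lambda t: t[0])

best_loss, best_config = random_search(num_samples=100, low=0.001, high=0.1)
print(best_config, best_loss)
```

Each trial is independent of the others, which is why this kind of search parallelizes well across a cluster.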
from ray.tune.schedulers import ASHAScheduler, PopulationBasedTraining

def train_model(config):
    model = ConvNet(config)
    for i in range(epochs):
        current_loss = model.train()
        tune.report(loss=current_loss)

# Stop unpromising trials early with ASHA.
tune.run(
    train_model,
    config={"lr": tune.uniform(0.001, 0.1)},
    num_samples=100,
    metric="loss",  # the scheduler needs a metric to rank trials by
    mode="min",
    scheduler=ASHAScheduler())

# Or swap in Population Based Training.
tune.run(
    train_model,
    config={"lr": tune.uniform(0.001, 0.1)},
    num_samples=100,
    scheduler=PopulationBasedTraining(...))
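Early-stopping schedulers save compute by giving more budget only to promising trials. The sketch below shows plain synchronous successive halving, a simplified relative of ASHA and not Ray Tune's asynchronous implementation. The toy `loss_after` curve and all parameter values are assumptions for illustration.

```python
import random

def loss_after(lr, epochs):
    """Toy loss curve: improves with more epochs; its floor depends on lr."""
    return (lr - 0.05) ** 2 + 1.0 / (1 + epochs)

def successive_halving(lrs, min_epochs=1, reduction_factor=2):
    survivors = list(lrs)
    epochs = min_epochs
    while len(survivors) > 1:
        # Evaluate every surviving trial at the current training budget.
        ranked = sorted(survivors, key=lambda lr: loss_after(lr, epochs))
        # Keep the best 1/reduction_factor of the trials (at least one) ...
        survivors = ranked[: max(1, len(ranked) // reduction_factor)]
        # ... and give the survivors a larger budget in the next round.
        epochs *= reduction_factor
    return survivors[0]

rng = random.Random(0)
candidates = [rng.uniform(0.001, 0.1) for _ in range(16)]
best_lr = successive_halving(candidates)
print(best_lr)
```

With 16 candidates and a reduction factor of 2, the rounds keep 8, 4, 2, and then 1 trial, so most of the budget goes to the few configurations that looked best early on.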
import os

def train_model(config, checkpoint_dir=None):
    model = ConvNet(config)
    if checkpoint_dir is not None:
        # Resume from the checkpoint that Tune passes in.
        model.load_checkpoint(os.path.join(checkpoint_dir, "model.pt"))
    for i in range(epochs):
        current_loss = model.train()
        with tune.checkpoint_dir(step=i) as ckpt_dir:
            model.save_checkpoint(os.path.join(ckpt_dir, "model.pt"))
        tune.report(loss=current_loss)
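The save-and-resume pattern behind checkpointing can be sketched without Ray. This is a minimal standard-library illustration under stated assumptions: the JSON "model" state, the file name `model.json`, and the halving loss are all hypothetical stand-ins for a real model checkpoint.

```python
import json
import os
import tempfile

def train_model(total_epochs, checkpoint_dir):
    """Train a toy 'model' and resume from a checkpoint if one exists."""
    path = os.path.join(checkpoint_dir, "model.json")
    state = {"epoch": 0, "loss": 1.0}
    if os.path.exists(path):
        with open(path) as f:
            state = json.load(f)  # resume instead of starting from scratch
    while state["epoch"] < total_epochs:
        state["epoch"] += 1
        state["loss"] *= 0.5  # pretend each epoch halves the loss
        with open(path, "w") as f:
            json.dump(state, f)  # save after every epoch
    return state

with tempfile.TemporaryDirectory() as ckpt_dir:
    first = train_model(total_epochs=3, checkpoint_dir=ckpt_dir)    # "interrupted" run
    resumed = train_model(total_epochs=5, checkpoint_dir=ckpt_dir)  # continues at epoch 3
print(first, resumed)
```

The second call repeats no work from the first, which is the property that lets a tuning job survive a lost node.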
Ray Serve
Ray Serve is a web framework built for model serving
Model Serving in Python
Ray Serve is high-performance and flexible
• Framework-agnostic
• Easily scales
• Supports batching
• Query your endpoints from HTTP and from Python
• Easily integrates with other tools
Ray Serve is built on top of Ray
Users do not need to think about:
• Interprocess communication
• Failure management
• Scheduling
Just tell Ray Serve to scale up your model.
Serve functions and stateful classes.
Ray Serve will use multiple replicas to parallelize across cores and across nodes in your cluster.
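The replica idea can be illustrated with a small standard-library sketch. It is not Ray Serve's API or its routing policy: the `ModelReplica` and `Router` classes are hypothetical, threads stand in for replicas, and round-robin dispatch is an assumption chosen for simplicity.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import cycle

class ModelReplica:
    """One copy of a toy model."""

    def __init__(self, replica_id):
        self.replica_id = replica_id

    def predict(self, x):
        return {"replica": self.replica_id, "prediction": 2 * x}

class Router:
    """Hands each request to the next replica in turn (round-robin)."""

    def __init__(self, num_replicas):
        self._replicas = cycle([ModelReplica(i) for i in range(num_replicas)])
        self._pool = ThreadPoolExecutor(max_workers=num_replicas)

    def remote(self, x):
        # Pick a replica and run the request in the background; return a future.
        replica = next(self._replicas)
        return self._pool.submit(replica.predict, x)

    def shutdown(self):
        self._pool.shutdown()

router = Router(num_replicas=2)
futures = [router.remote(x) for x in range(4)]
responses = [f.result() for f in futures]
router.shutdown()
print(responses)
```

Scaling up in this sketch means raising `num_replicas`; the caller's code does not change.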
Ray Serve API
Flexibility
Query your model from HTTP:
> curl "http://127.0.0.1:8000/my/route"
Or query from Python using ServeHandle:
MLflow
Challenges of ML in production
• It’s difficult to keep track of experiments.
• It’s difficult to reproduce code.
• There’s no standard way to package and deploy models.
• There’s no central store to manage models (their versions and stage transitions).
Source: mlflow.org
What is MLflow?
• Open-source ML lifecycle management tool
• Single solution for all of the above challenges
• Library-agnostic and language-agnostic
• (Works with your existing code)
Four key functions of MLflow
Source: MLflow
MLflow Tracking
MLflow Models
Ray + MLflow
Ray Tune + MLflow Tracking
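from ray.tune.integration.mlflow import MLflowLoggerCallback

def train_model(config):
    model = ConvNet(config)
    for i in range(epochs):
        current_loss = model.train()
        tune.report(loss=current_loss)

tune.run(
    train_model,
    config={"lr": tune.uniform(0.001, 0.1)},
    num_samples=100,
    # Pass the experiment name by keyword; the first positional
    # argument of the callback is the tracking URI.
    callbacks=[MLflowLoggerCallback(experiment_name="my_experiment")])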
Ray Tune + MLflow Tracking
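import mlflow
from ray.tune.integration.mlflow import mlflow_mixin

@mlflow_mixin
def train_model(config):
    mlflow.autolog()
    xgboost_results = xgb.train(config, ...)

tune.run(
    train_model,
    config={
        "lr": tune.uniform(0.001, 0.1),
        # mlflow_mixin reads its settings from the "mlflow" key.
        "mlflow": {"experiment_name": "my_experiment"},
    },
    num_samples=100)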
MLflow + Ray Serve
> pip install mlflow-ray-serve
> ray start --head
> serve start
MLflow deployments CLI
Create deployment
> mlflow deployments create -t ray-serve -m <model URI> \
    --name my_model -C num_replicas=100
Model URI:
• models:/MyModel/1
• runs:/93203689db9c4b50afb6869
• s3://<bucket>/<path>
• ...
MLflow deployments Python API
Create model
Integrating with Ray Serve is easy.
• Ray Serve endpoints can be called from Python.
• Clean conceptual separation:
• Ray Serve handles data plane (processing)
• MLflow handles control plane (metadata, configuration)
Demo: An ML Platform built with MLflow and Ray
Acknowledgements
Thanks to Jules Damji, Sid Murching, and Paul Ogilvie for
their help and guidance with MLflow.
Thanks to Dmitri Gekhtman, Kai Fricke, Simon Mo,
Edward Oakes, Richard Liaw, Kathryn Zhou and the rest
of the Ray team!
Feedback
Your feedback is important to us.
Don’t forget to rate and review the sessions.

👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...
 
Detecting Credit Card Fraud: A Machine Learning Approach
Detecting Credit Card Fraud: A Machine Learning ApproachDetecting Credit Card Fraud: A Machine Learning Approach
Detecting Credit Card Fraud: A Machine Learning Approach
 
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
 
Just Call Vip call girls kakinada Escorts ☎️9352988975 Two shot with one girl...
Just Call Vip call girls kakinada Escorts ☎️9352988975 Two shot with one girl...Just Call Vip call girls kakinada Escorts ☎️9352988975 Two shot with one girl...
Just Call Vip call girls kakinada Escorts ☎️9352988975 Two shot with one girl...
 
Just Call Vip call girls Bellary Escorts ☎️9352988975 Two shot with one girl ...
Just Call Vip call girls Bellary Escorts ☎️9352988975 Two shot with one girl ...Just Call Vip call girls Bellary Escorts ☎️9352988975 Two shot with one girl ...
Just Call Vip call girls Bellary Escorts ☎️9352988975 Two shot with one girl ...
 
Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -
Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -
Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -
 
➥🔝 7737669865 🔝▻ Ongole Call-girls in Women Seeking Men 🔝Ongole🔝 Escorts S...
➥🔝 7737669865 🔝▻ Ongole Call-girls in Women Seeking Men  🔝Ongole🔝   Escorts S...➥🔝 7737669865 🔝▻ Ongole Call-girls in Women Seeking Men  🔝Ongole🔝   Escorts S...
➥🔝 7737669865 🔝▻ Ongole Call-girls in Women Seeking Men 🔝Ongole🔝 Escorts S...
 
Just Call Vip call girls roorkee Escorts ☎️9352988975 Two shot with one girl ...
Just Call Vip call girls roorkee Escorts ☎️9352988975 Two shot with one girl ...Just Call Vip call girls roorkee Escorts ☎️9352988975 Two shot with one girl ...
Just Call Vip call girls roorkee Escorts ☎️9352988975 Two shot with one girl ...
 
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night StandCall Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
 
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night StandCall Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
 
Just Call Vip call girls Palakkad Escorts ☎️9352988975 Two shot with one girl...
Just Call Vip call girls Palakkad Escorts ☎️9352988975 Two shot with one girl...Just Call Vip call girls Palakkad Escorts ☎️9352988975 Two shot with one girl...
Just Call Vip call girls Palakkad Escorts ☎️9352988975 Two shot with one girl...
 
CHEAP Call Girls in Rabindra Nagar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Rabindra Nagar  (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Rabindra Nagar  (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Rabindra Nagar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
Just Call Vip call girls Erode Escorts ☎️9352988975 Two shot with one girl (E...
Just Call Vip call girls Erode Escorts ☎️9352988975 Two shot with one girl (E...Just Call Vip call girls Erode Escorts ☎️9352988975 Two shot with one girl (E...
Just Call Vip call girls Erode Escorts ☎️9352988975 Two shot with one girl (E...
 
Call Girls In Nandini Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Nandini Layout ☎ 7737669865 🥵 Book Your One night StandCall Girls In Nandini Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Nandini Layout ☎ 7737669865 🥵 Book Your One night Stand
 
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
 
Abortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Jeddah | +966572737505 | Get CytotecAbortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Jeddah | +966572737505 | Get Cytotec
 

Building an ML Platform with Ray and MLflow

  • 1. Building an ML Platform with Ray and MLflow Amog Kamsetty and Archit Kulkarni Ray Team @ Anyscale
  • 2. The Team Archit Kulkarni Amog Kamsetty Dmitri Gekhtman Edward Oakes Richard Liaw Kai Fricke Simon Mo Kathryn Zhou
  • 3. Overview of Talk ▪ What are ML Platforms? ▪ Ray and its libraries ▪ MLflow ▪ Demo: An ML Platform built with MLflow and Ray
  • 4. What are ML Platforms?
  • 6. Typical ML Process -- Simplified. Execution: feature engineering; training (including tuning); serving (offline scoring and inference, online serving). Management: tracking (data, code, configurations); reproducing results; deployment (deploy in a variety of environments).
  • 7. Challenges with the ML Process Data/Features • Data Preparation • Data Analysis • Feature Engineering • Data Pipeline • Data Management/Feature Store • Manages big data clusters Model • ML Expertise • Implement SOTA ML Research • Experimentation • Manage GPU infrastructure • Scalable training & hyperparameter tuning Production • A/B Testing • Model Evaluation • Analysis of Predictions • Deploy in variety of environments • CI/CD • Highly Available prediction service Data/Research Scientist Engineers
  • 8. Challenges with the ML Process Data • Data Preparation • Data Analysis • Feature Engineering • Data Pipeline • Data Management/Feature Store • Manages big data clusters Model • ML Expertise • Implement SOTA ML Research • Experimentation • Manage GPU infrastructure • Scalable training & hyperparameter tuning Production • A/B Testing • Model Evaluation • Analysis of Predictions • Deploy in variety of environments • CI/CD • Highly Available prediction service Data/Research Scientist Software/Data/ ML Engineer ML Platform Abstraction
  • 9. ML Platforms -- Scale - LinkedIn: - 500+ “AI engineers” building models; 50+ MLP engineers - > 50% offline compute demand (12K servers each with 256G RAM) - More than 2x a year - Uber Michelangelo, AirBnB Bighead, Facebook FBLearner, etc. - Globally, a few Billion $ now, growing 40%+ YoY - Many companies building ML Platforms from the ground up
  • 10. ML Platforms -- Landscape (Source: Intel Capital)
  • 11. ML Platforms -- Landscape (Source: Intel Capital)
  • 12. Typical ML Process -- Simplified. Execution: feature engineering 🔪; training 🍳 (including tuning 🧂); serving 🍽 (offline scoring and inference, online serving). Management: tracking 📝 (data, code, configurations); reproducing results 📖; deployment 🚚 💻 (deploy in a variety of environments).
  • 13. Typical ML Process -- Simplified. Execution: feature engineering 🔪; training 🍳 (including tuning 🧂); serving 🍽 (offline scoring and inference, online serving). Management: tracking 📝 (data, code, configurations); reproducing results 📖; deployment 🚚 💻 (variety of environments).
  • 14. Ray and its Libraries
  • 15. What is Ray? • A simple/general library for distributed computing • Single machine or 100s of nodes • Agnostic to the type of work • An ecosystem of libraries (for scaling ML and more) • Native: Ray RLlib, Ray Tune, Ray Serve • Third party: Modin, Dask, Horovod, XGBoost, Pytorch Lightning • Tools for launching clusters on any cloud provider
  • 16. Three key ideas: (1) Execute remote functions as tasks, and instantiate remote classes as actors: support both stateful and stateless computations. (2) Asynchronous execution using futures: enable parallelism. (3) Distributed (immutable) object store: efficient communication (send arguments by reference).
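The second idea, asynchronous execution using futures, can be sketched with Python's standard `concurrent.futures` module. This is only a local-thread analogy and not Ray's API; the `square` function and the worker count are made up for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def square(x):
    # Stand-in for any stateless unit of work (a "task").
    return x * x


with ThreadPoolExecutor(max_workers=4) as pool:
    # submit() returns a future immediately, so all five calls are
    # scheduled before any result is requested.
    futures = [pool.submit(square, n) for n in range(5)]
    # result() blocks until the corresponding computation has finished.
    results = [f.result() for f in futures]

print(results)
```

In Ray the same pattern appears as `f.remote(...)` returning an object reference and `ray.get(...)` blocking on it, as the API slides that follow show.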
  • 18. API Functions -> Tasks def read_array(file): # read array "a" from "file" return a def add(a, b): return np.add(a, b)
  • 19. API Functions -> Tasks @ray.remote def read_array(file): # read array "a" from "file" return a @ray.remote def add(a, b): return np.add(a, b)
  • 20. API Functions -> Tasks @ray.remote def read_array(file): # read array "a" from "file" return a @ray.remote def add(a, b): return np.add(a, b) id1 = read_array.remote("/input1") (Figure: task graph, read_array -> id1)
  • 21. API Functions -> Tasks @ray.remote def read_array(file): # read array "a" from "file" return a @ray.remote def add(a, b): return np.add(a, b) id1 = read_array.remote("/input1") id2 = read_array.remote("/input2") (Figure: task graph, read_array -> id1 and read_array -> id2)
  • 22. API Functions -> Tasks @ray.remote def read_array(file): # read array "a" from "file" return a @ray.remote def add(a, b): return np.add(a, b) id1 = read_array.remote("/input1") id2 = read_array.remote("/input2") id3 = add.remote(id1, id2) (Figure: task graph, read_array -> id1, read_array -> id2, add -> id3)
  • 23. API Functions -> Tasks @ray.remote def read_array(file): # read array "a" from "file" return a @ray.remote def add(a, b): return np.add(a, b) id1 = read_array.remote("/input1") id2 = read_array.remote("/input2") id3 = add.remote(id1, id2); ray.get(id3) (Figure: task graph, read_array -> id1, read_array -> id2, add -> id3)
  • 24. API Functions -> Tasks @ray.remote def read_array(file): # read array "a" from "file" return a @ray.remote def add(a, b): return np.add(a, b) id1 = read_array.remote("/input1") id2 = read_array.remote("/input2") id3 = add.remote(id1, id2) Classes -> Actors
  • 25. API Functions -> Tasks @ray.remote def read_array(file): # read array "a" from "file" return a @ray.remote def add(a, b): return np.add(a, b) id1 = read_array.remote("/input1") id2 = read_array.remote("/input2") id3 = add.remote(id1, id2) Classes -> Actors @ray.remote class Counter(object): def __init__(self): self.value = 0 def inc(self): self.value += 1 return self.value
  • 26. API Functions -> Tasks @ray.remote def read_array(file): # read array "a" from "file" return a @ray.remote def add(a, b): return np.add(a, b) id1 = read_array.remote("/input1") id2 = read_array.remote("/input2") id3 = add.remote(id1, id2) Classes -> Actors @ray.remote class Counter(object): def __init__(self): self.value = 0 def inc(self): self.value += 1 return self.value c = Counter.remote() id4 = c.inc.remote() id5 = c.inc.remote() ray.get([id4, id5])
  • 27. API Functions -> Tasks @ray.remote def read_array(file): # read array "a" from "file" return a @ray.remote(num_gpus=1) def add(a, b): return np.add(a, b) id1 = read_array.remote("/input1") id2 = read_array.remote("/input2") id3 = add.remote(id1, id2) Classes -> Actors @ray.remote(num_gpus=1) class Counter(object): def __init__(self): self.value = 0 def inc(self): self.value += 1 return self.value c = Counter.remote() id4 = c.inc.remote() id5 = c.inc.remote() ray.get([id4, id5])
  • 28. Ray Ecosystem: a universal framework for distributed computing, with native libraries, 3rd-party libraries, and your own app on top. (Figure: ecosystem diagram)
  • 30. Ray Tune: Scalable Hyperparameter Tuning. Wide variety of algorithms (HyperBand, PBT, Bayesian optimization). Compatible with ML frameworks.
  • 31. Ray Tune focuses on simplifying execution. Easily launch distributed multi-GPU tuning jobs. Automatic fault tolerance to save 3x on GPU costs. (Image: https://www.vecteezy.com/) $ ray up {cluster config} ray.init(address="auto") tune.run(func, num_samples=100)
  • 32. Ray Tune interoperates with other HPO libraries: Ax, Optuna, scikit-optimize, …
  • 33. def train_model(config={}): model = ConvNet(config) for i in range(steps): current_loss = model.train()
  • 34. from ray import tune def train_model(config={}): model = ConvNet(config) for i in range(steps): current_loss = model.train() tune.report(loss=current_loss)
  • 35. def train_model(config): model = ConvNet(config) for i in range(epochs): current_loss = model.train() tune.report(loss=current_loss) tune.run(train_model, config={"lr": 0.1})
  • 36. tune.run( train_model, config={"lr": tune.uniform(0.001, 0.1)}, num_samples=100 ) def train_model(config): model = ConvNet(config) for i in range(epochs): current_loss = model.train() tune.report(loss=current_loss)
  • 37. tune.run( train_model, config={"lr": tune.uniform(0.001, 0.1)}, num_samples=100, scheduler=ASHAScheduler()) def train_model(config): model = ConvNet(config) for i in range(epochs): current_loss = model.train() tune.report(loss=current_loss)
  • 38. import os tune.run( train_model, config={"lr": tune.uniform(0.001, 0.1)}, num_samples=100, scheduler=PopulationBasedTraining(...)) def train_model(config, checkpoint_dir=None): model = ConvNet(config) if checkpoint_dir is not None: model.load_checkpoint(os.path.join(checkpoint_dir, "model.pt")) for i in range(epochs): current_loss = model.train() with tune.checkpoint_dir(step=i) as dir: model.save_checkpoint(os.path.join(dir, "model.pt")) tune.report(loss=current_loss)
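Conceptually, `config={"lr": tune.uniform(0.001, 0.1)}` with `num_samples=100` asks for 100 trials, each with a learning rate drawn uniformly from that range. The sketch below shows that idea in plain Python. It is not Ray Tune's implementation: there is no parallelism, scheduler, or checkpointing, and `toy_loss` is a hypothetical objective standing in for `train_model`.

```python
import random


def toy_loss(lr):
    # Hypothetical objective with its minimum at lr = 0.03.
    return (lr - 0.03) ** 2


def random_search(objective, low, high, num_samples, seed=0):
    # Draw num_samples configurations uniformly from [low, high]
    # and keep the one with the lowest loss.
    rng = random.Random(seed)
    best_config, best_loss = None, float("inf")
    for _ in range(num_samples):
        config = {"lr": rng.uniform(low, high)}
        loss = objective(config["lr"])
        if loss < best_loss:
            best_config, best_loss = config, loss
    return best_config, best_loss


best_config, best_loss = random_search(toy_loss, 0.001, 0.1, num_samples=100)
print(best_config, best_loss)
```

Schedulers such as ASHA or PBT go further than this loop by stopping or mutating trials while they are still running, which is why the trainable reports its loss at every epoch.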
  • 40. Ray Serve is a Web Framework Built for Model Serving
  • 42. Ray Serve is high-performance and flexible • Framework-agnostic • Easily scales • Supports batching • Query your endpoints from HTTP and from Python • Easily integrate with other tools
  • 43. Ray Serve is built on top of Ray. Users do not need to think about: • Interprocess communication • Failure management • Scheduling. Just tell Ray Serve to scale up your model.
  • 44. Ray Serve API: Serve functions and stateful classes. Ray Serve will use multiple replicas to parallelize across cores and across nodes in your cluster.
  • 45. Flexibility Query your model from HTTP: > curl "http://127.0.0.1:8000/my/route" Or query from Python using ServeHandle:
  • 47. Challenges of ML in production • It’s difficult to keep track of experiments. • It’s difficult to reproduce code. • There’s no standard way to package and deploy models. • There’s no central store to manage models (their versions and stage transitions). Source: mlflow.org
  • 48. What is MLflow? • Open-source ML lifecycle management tool • Single solution for all of the above challenges • Library-agnostic and language-agnostic • (Works with your existing code)
  • 49. Four key functions of MLflow Source: MLflow
  • 53. Ray Tune + MLflow Tracking def train_model(config): model = ConvNet(config) for i in range(epochs): current_loss = model.train() tune.report(loss=current_loss) tune.run( train_model, config={"lr": tune.uniform(0.001, 0.1)}, num_samples=100, callbacks=[MLflowLoggerCallback("my_experiment")])
  • 54. Ray Tune + MLflow Tracking @mlflow_mixin def train_model(config): mlflow.autolog() xgboost_results = xgb.train(config, ...) tune.run( train_model, config={"lr": tune.uniform(0.001, 0.1)}, num_samples=100)
  • 55. + > pip install mlflow-ray-serve > ray start --head > serve start
  • 56. MLflow deployments CLI Create deployment > mlflow deployments create -t ray-serve -m <model URI> --name my_model -C num_replicas=100 Model URI: • models:/MyModel/1 • runs:/93203689db9c4b50afb6869 • s3://<bucket>/<path> • ...
  • 57. MLflow deployments Python API Create model
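The code for the Python API slide did not survive in this transcript. A hedged sketch of what creating a deployment might look like is shown below, assuming MLflow's generic deployments client (`mlflow.deployments.get_deploy_client`) with the `ray-serve` target from the `mlflow-ray-serve` plugin. The helper names `deploy_model` and `main` are made up here; the deployment name, model URI, and `num_replicas` value mirror the CLI slide.

```python
def deploy_model(client, name, model_uri, num_replicas):
    # `client` is any MLflow deployment client; for Ray Serve it would
    # come from get_deploy_client("ray-serve").
    return client.create_deployment(
        name=name,
        model_uri=model_uri,
        config={"num_replicas": num_replicas},
    )


def main():
    # Imported lazily so this file can be loaded without MLflow installed.
    from mlflow.deployments import get_deploy_client

    client = get_deploy_client("ray-serve")
    print(deploy_model(client, "my_model", "models:/MyModel/1", num_replicas=100))
```

Calling `main()` assumes that `ray start --head` and `serve start` have already been run and that the model URI exists in the MLflow registry.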
  • 58. Integrating with Ray Serve is easy. • Ray Serve endpoints can be called from Python. • Clean conceptual separation: • Ray Serve handles data plane (processing) • MLflow handles control plane (metadata, configuration)
  • 59. Demo: An ML Platform built with MLflow and Ray
  • 60. Acknowledgements Thanks to Jules Damji, Sid Murching, and Paul Ogilvie for their help and guidance with MLflow. Thanks to Dmitri Gekhtman, Kai Fricke, Simon Mo, Edward Oakes, Richard Liaw, Kathryn Zhou and the rest of the Ray team!
  • 62. Feedback Your feedback is important to us. Don’t forget to rate and review the sessions.