Introduction to Polyaxon
Yu Ishikawa
Agenda
- Why do we need Polyaxon?
- What is Polyaxon?
- How does Polyaxon work?
- Demo
- Summary
Objectives of introducing Polyaxon
- Make the lead time of experiments as short as possible.
- Make the financial cost to train models as cheap as possible.
- Make the experiments reproducible.
ML Workflow and Polyaxon's role
[Figure: ML workflow across three phases (Experiment Phase, Productionize Phase, Operating Phase), with the steps Problem Setting, Collecting Data, Experiments, Off-line Evaluation, Productionize ML system, Serving Models, On-line Evaluation, Retrain Model, and Off-line Evaluation. A label marks "Polyaxon's role".]
Why do we need Polyaxon?
- We are not able to manage experiments as a team today.
- Experiments are expensive in terms of both money and time. There is room to improve efficiency and productivity.
- Setting up experiment environments can be tough for ML engineers. Moreover, the environments tend not to be reproducible, so taking over another member's tasks can be expensive. We also cannot manage the training process as a team.
- Hyperparameter search takes a long time with Python ML libraries like sklearn, since they basically do not scale beyond a single machine.
What is Polyaxon?
- An open source platform for reproducible machine learning at scale.
  - https://polyaxon.com/
- Features
  - Notebook
  - Hyperparameter search
  - Powerful workspace
  - User management
  - Dynamic resources allocation
  - Dashboard
  - Versioning
Notebook environment with Jupyter
---
version: 1
kind: notebook
build:
  image: tensorflow/tensorflow:1.4.1-py3
  build_steps:
    - pip3 install jupyter
$ polyaxon notebook start -f polyaxon_notebook.yml
New version of CLI (0.3.5) is now available. To upgrade run:
pip install -U polyaxon-cli
Notebook is being deployed for project `quick-start`
It may take some time before you can access the notebook.
Your notebook will be available on:
http://35.184.217.84:80/notebook/root/quick-start/
- Polyaxon enables us to launch a Jupyter environment with one command. We can also define the environment with a Docker image and a few build commands in a YAML file.
- We can reproduce the notebook experiments easily.
Hyperparameter tuning with Polyaxon
- Polyaxon supports several hyperparameter tuning methods:
  - Grid search
  - Random search
  - Bayesian optimization
  - Early stopping
  - Hyperband
- We can control the concurrency of hyperparameter tuning with a YAML file.
- We can reproduce hyperparameter tuning jobs as well.
Hyperparameter tuning with high concurrency
polyaxon_gridsearch.yml:
---
version: 1
kind: group
hptuning:
  concurrency: 5
  matrix:
    learning_rate:
      linspace: 0.001:0.1:5
    dropout:
      values: [0.25, 0.3]
    activation:
      values: [relu, sigmoid]
declarations:
  batch_size: 128
  num_steps: 500
  num_epochs: 1
build:
  image: tensorflow/tensorflow:1.4.1-py3
  build_steps:
    - pip3 install --no-cache-dir -U polyaxon-helper
run:
  cmd: python3 model.py --batch_size={{ batch_size }}
                        --num_steps={{ num_steps }}
                        --learning_rate={{ learning_rate }}
                        --dropout={{ dropout }}
                        --num_epochs={{ num_epochs }}
                        --activation={{ activation }}
$ polyaxon run -f polyaxon_gridsearch.yml
New version of CLI (0.3.5) is now available. To upgrade run:
pip install -U polyaxon-cli
Creating an experiment group with the following definition:
----------------  -----------------
Search algorithm  grid
Concurrency       5 concurrent runs
Early stopping    deactivated
----------------  -----------------
Experiment group 1 was created
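To get a feel for how many experiments this grid search launches, the matrix can be expanded by hand. The following is a minimal Python sketch, not Polyaxon's own code. It assumes that `linspace: 0.001:0.1:5` means five evenly spaced values with both endpoints included; the helper `linspace` and the idea of "waves" (every run taking the same time) are illustrative assumptions.

```python
import itertools
import math


def linspace(start, stop, num):
    """Return `num` evenly spaced values from start to stop, endpoints included."""
    if num == 1:
        return [start]
    step = (stop - start) / (num - 1)
    return [start + i * step for i in range(num)]


# Search space copied from the `matrix` section of the YAML file.
matrix = {
    "learning_rate": linspace(0.001, 0.1, 5),  # assumed reading of 0.001:0.1:5
    "dropout": [0.25, 0.3],
    "activation": ["relu", "sigmoid"],
}
concurrency = 5  # from `hptuning.concurrency`

# Grid search tries every combination of the listed values.
grid = [dict(zip(matrix, combo)) for combo in itertools.product(*matrix.values())]

# Idealized: if all runs took equal time, they would finish in this many waves.
waves = math.ceil(len(grid) / concurrency)

print(f"{len(grid)} experiments in about {waves} waves of {concurrency}")
# prints "20 experiments in about 4 waves of 5"
```

Each entry of `grid` corresponds to one set of values that would be substituted into the `{{ ... }}` placeholders of the `cmd` line.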
Dashboard ~ Experiments
Dashboard ~ Metrics Visualization
Hyperparameter tuning with a single machine
[Figure: a machine with many CPU cores. The data is held in memory, and each CPU core trains with a different parameter set (A, B, C, ..., X).]
For instance, scikit-learn's GridSearchCV enables us to run experiments in parallel. However, the number of processes that actually run in parallel is bounded by the number of CPU cores: a 64-core machine can run up to 64 trainings concurrently.
Hyperparameter tuning of Polyaxon
[Figure: Polyaxon on k8s. The training code is uploaded and run; Polyaxon core builds it and schedules pods onto Node A, Node B, and Node C, one pod per parameter set (A through F).]
The more nodes the k8s cluster has, the more training processes can run in parallel. Parallelism is not capped by the core count of a single machine.
Even a single experiment can be shorter
[Figure: timeline (t) comparing "Experiments" followed by "Evaluation" on a single machine and on Polyaxon. The Polyaxon bar is shorter, labeled "Reduce training time".]
By leveraging multiple nodes on k8s, we can shorten the time of a single experiment by running it with high concurrency.
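The effect of concurrency on the duration of one hyperparameter search can be estimated with simple arithmetic. This is a rough sketch, assuming every parameter set takes the same time and ignoring scheduling and image build overhead. The function name `wall_clock_minutes` and the workload of 640 parameter sets are hypothetical; 64 cores, 10 nodes, and 20 minutes per run are figures used elsewhere in this deck.

```python
import math


def wall_clock_minutes(n_param_sets, minutes_per_set, parallel_slots):
    """Estimated duration when runs execute in rounds of `parallel_slots`."""
    rounds = math.ceil(n_param_sets / parallel_slots)
    return rounds * minutes_per_set


N_PARAM_SETS = 640    # hypothetical workload
MINUTES_PER_SET = 20  # per-run training time used later in the deck

# One 64-core machine: at most 64 runs at a time.
single_machine = wall_clock_minutes(N_PARAM_SETS, MINUTES_PER_SET, parallel_slots=64)

# Ten 64-core nodes on k8s: up to 640 runs at a time (one run per core assumed).
cluster = wall_clock_minutes(N_PARAM_SETS, MINUTES_PER_SET, parallel_slots=10 * 64)

print(f"single 64-core machine: {single_machine} min")  # 200 min
print(f"10 x 64-core nodes:     {cluster} min")         # 20 min
```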
Auto-scalable & preemptible node pool with polyaxon
[Figure sequence: Polyaxon on k8s with two node pools, a node pool for polyaxon core (three nodes) and a node pool for experiments. The frames show:]
- The node pool for experiments with no nodes drawn.
- The training code of experiment X is uploaded and run with concurrency 100 and requests of CPU: 1 and memory: 2GB.
- Three preemptible nodes appear in the node pool for experiments: "Automatically launch new preemptible instances".
- The training code of experiment Y is uploaded and run with concurrency 50 and requests of CPU: 1 and memory: 1GB, with three preemptible nodes present.
- Five preemptible nodes are present in the node pool for experiments.
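The concurrency and per-run requests of experiments X and Y determine how far the experiment node pool has to grow. A back-of-the-envelope sketch follows. The node size (8 vCPUs, 30 GB) is a hypothetical choice and the function names are made up for illustration. Real clusters reserve some capacity on each node for system components, so actual node counts may be somewhat higher.

```python
import math


def pods_per_node(node_cpu, node_mem_gb, cpu_request, mem_request_gb):
    """How many pods fit on one node, limited by the scarcer resource."""
    return min(node_cpu // cpu_request, node_mem_gb // mem_request_gb)


def nodes_needed(concurrency, node_cpu, node_mem_gb, cpu_request, mem_request_gb):
    """Nodes required to run `concurrency` pods at the same time."""
    per_node = pods_per_node(node_cpu, node_mem_gb, cpu_request, mem_request_gb)
    if per_node < 1:
        raise ValueError("a single pod does not fit on this node size")
    return math.ceil(concurrency / per_node)


NODE_CPU, NODE_MEM_GB = 8, 30  # hypothetical node size, not from the slides

# Experiment X: concurrency 100, CPU 1, memory 2GB per run.
nodes_x = nodes_needed(100, NODE_CPU, NODE_MEM_GB, cpu_request=1, mem_request_gb=2)

# Experiment Y: concurrency 50, CPU 1, memory 1GB per run.
nodes_y = nodes_needed(50, NODE_CPU, NODE_MEM_GB, cpu_request=1, mem_request_gb=1)

print(f"experiment X needs about {nodes_x} nodes")  # 13
print(f"experiment Y needs about {nodes_y} nodes")  # 7
```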
Preemptible instance/GPU/TPU pricing
[Figure panels: Preemptible instance, Preemptible GPU, Preemptible TPU.]
Regular instance cost vs Preemptible instance cost
[Figure: running cost over time (t) for a regular instance and a preemptible instance. The gap between the two is labeled "Reduced cost".]
- We can reduce the cost of training models by leveraging preemptible instances with Polyaxon.
- Polyaxon enables us to use a preemptible node pool for experiments.
- Since the node pool scales automatically on GKE when Polyaxon schedules experiments, we don't need to hold static instances for experiments.
Multiple Experiments with a single machine
It takes longer to run experiments sequentially on a single machine, because Python ML libraries like sklearn basically do not scale beyond one machine.
[Figure: timeline (t) with three "Experiments" then "Evaluation" pairs running one after another.]
Multiple Experiments with polyaxon
Polyaxon enables us to easily run multiple experiments in parallel on k8s. We don't need to wait for each experiment to finish before moving on to the next one.
[Figure: timeline (t) with three "Experiments" then "Evaluation" pairs running in parallel.]
We can shorten the total experiment time through Polyaxon's parallelism.
Sequential experiments cost vs parallel experiments cost
[Figure: running cost and labor cost over time (t) for sequential experiments and experiments in parallel. The difference is labeled "Reduced cost".]
- In principle, the instance costs should be the same, since the cost of CPU usage is linear in running time.
- However, we should not overlook labor costs during experiments. Waiting for experiments wastes time and money. Time is money!!
- We can reduce the total cost by shortening the total experiment time.
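The cost argument can be made concrete with a toy model. All rates and durations below are hypothetical placeholders, not figures from this deck, and the model assumes an engineer is blocked for the whole wall-clock time of the experiments.

```python
def total_cost(machine_hours, wall_clock_hours, instance_rate, labor_rate):
    """Instance cost scales with machine-hours, labor cost with wall-clock time."""
    instance_cost = machine_hours * instance_rate
    labor_cost = wall_clock_hours * labor_rate
    return instance_cost + labor_cost


INSTANCE_RATE = 1.0       # hypothetical $ per machine-hour
LABOR_RATE = 50.0         # hypothetical $ per engineer-hour
N_EXPERIMENTS = 3
HOURS_PER_EXPERIMENT = 2  # hypothetical duration of one experiment

# Both strategies consume the same number of machine-hours.
machine_hours = N_EXPERIMENTS * HOURS_PER_EXPERIMENT

# Sequential: wall-clock time is the sum of all experiments.
sequential = total_cost(machine_hours, machine_hours, INSTANCE_RATE, LABOR_RATE)

# Parallel: wall-clock time is that of a single experiment.
parallel = total_cost(machine_hours, HOURS_PER_EXPERIMENT, INSTANCE_RATE, LABOR_RATE)

print(f"sequential: ${sequential:.2f}")  # $306.00
print(f"parallel:   ${parallel:.2f}")    # $106.00
```

Under these assumptions the instance cost is identical in both cases, and the whole difference comes from the labor cost.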
Power of multiple preemptible nodes
- Consider the cost of 10 preemptible n1-standard-64 nodes running for 2 hours with a concurrency of 640. It should be cheap!
  - $12.80 = $0.64 per node-hour * 10 nodes * 2 hours
- Even if it takes about 20 minutes to run 1 parameter set of a training job, we can run about 3840 parameter sets in just 2 hours at such a low cost.
We can achieve the objectives:
- Make the lead time of experiments as short as possible.
- Make the financial cost to train models as cheap as possible.
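The arithmetic on this slide can be checked in a few lines of Python. The price, node count, duration, and per-run time are taken from the slide; the sketch assumes one training run per CPU core (64 per n1-standard-64 node) and no preemption or scheduling overhead. The variable names are illustrative only.

```python
PRICE_PER_NODE_HOUR = 0.64  # USD, preemptible n1-standard-64, as quoted on the slide
NODES = 10
CORES_PER_NODE = 64
HOURS = 2
MINUTES_PER_PARAM_SET = 20

# Total instance cost for the whole run.
cost = PRICE_PER_NODE_HOUR * NODES * HOURS

# One run per core is assumed, which gives the slide's concurrency of 640.
concurrency = NODES * CORES_PER_NODE

# Each slot can finish this many 20-minute runs within 2 hours.
runs_per_slot = (HOURS * 60) // MINUTES_PER_PARAM_SET

param_sets = concurrency * runs_per_slot

print(f"cost: ${cost:.2f}")                  # cost: $12.80
print(f"concurrency: {concurrency}")         # concurrency: 640
print(f"parameter sets: {param_sets}")       # parameter sets: 3840
```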
Demo
- Notebook
- Job
- Experiment
- Hyperparameter tuning at scale
Summary
- We can definitely achieve the objectives with Polyaxon on GKE.
  - Make the lead time of experiments as short as possible.
  - Make the financial cost to train models as cheap as possible.
  - Make the experiments reproducible.
- All that we ML engineers have to do is:
  - Write the training code in Python as usual, and
  - Define the YAML files to run experiments.
- What's next?
  - Supporting preemptible GPUs / TPUs.
Appendix A: Links
- Polyaxon
  - https://polyaxon.com/
- Documentation
  - https://docs.polyaxon.com/
- Examples
  - https://github.com/polyaxon/polyaxon-quick-start
  - https://github.com/polyaxon/deep-learning-with-python-notebooks-on-polyaxon
  - https://github.com/polyaxon/polyaxon-examples
Appendix B: Architecture of Polyaxon

Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night StandCall Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night Standamitlee9823
 
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteedamy56318795
 
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort ServiceBDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort ServiceDelhi Call girls
 

Introduction to Polyaxon

  • 2. Agenda - Why do we need Polyaxon? - What is Polyaxon? - How does Polyaxon work? - Demo - Summary
  • 3. Objectives of introducing Polyaxon - Make the lead time of experiments as short as possible. - Make the financial cost to train models as cheap as possible. - Make the experiments reproducible. [Figure: ML workflow across the Experiment, Productionize, and Operating phases (Problem Setting, Collecting Data, Experiments, Off-line Evaluation, Productionize ML system, Serving Models, On-line Evaluation, Retrain Model, Off-line Evaluation), with Polyaxon's role marked]
  • 4. Why do we need Polyaxon? - We are not able to manage experiments as a team today. - Experiments are expensive in both money and time. There is room to improve efficiency and productivity. - Setting up experiment environments can be tough for ML engineers. Moreover, the environments tend not to be reproducible, so taking over another member's tasks can be expensive. We also cannot manage the training process as a team. - Hyperparameter search takes a long time with Python ML libraries like sklearn, since they basically do not scale beyond a single machine.
  • 5. Agenda - Why do we need Polyaxon? - What is Polyaxon? - How does Polyaxon work? - Demo - Summary
  • 6. What is Polyaxon? - An open source platform for reproducible machine learning at scale. - https://polyaxon.com/ - Features - Notebook - Hyperparameter search - Powerful workspace - User management - Dynamic resources allocation - Dashboard - Versioning
  • 7. Notebook environment with Jupyter
      ---
      version: 1
      kind: notebook
      build:
        image: tensorflow/tensorflow:1.4.1-py3
        build_steps:
          - pip3 install jupyter

      $ polyaxon notebook start -f polyaxon_notebook.yml
      New version of CLI (0.3.5) is now available. To upgrade run:
          pip install -U polyaxon-cli
      Notebook is being deployed for project `quick-start`
      It may take some time before you can access the notebook.
      Your notebook will be available on:
          http://35.184.217.84:80/notebook/root/quick-start/

      - Polyaxon lets us launch a Jupyter environment with one command. We can also define the environment with a Docker image and a few commands in a YAML file.
      - We can reproduce the notebook experiments easily.
  • 8. Hyperparameter tuning with Polyaxon - Polyaxon supports several hyperparameter tuning methods: - Grid search - Random search - Bayesian optimization - Early stopping - Hyperband - We can control the concurrency of hyperparameter tuning with a YAML file. - We can reproduce hyperparameter tuning jobs as well.
  • 9. Hyperparameter tuning with high concurrency
      polyaxon_gridsearch.yml
      ---
      version: 1
      kind: group
      hptuning:
        concurrency: 5
        matrix:
          learning_rate:
            linspace: 0.001:0.1:5
          dropout:
            values: [0.25, 0.3]
          activation:
            values: [relu, sigmoid]
      declarations:
        batch_size: 128
        num_steps: 500
        num_epochs: 1
      build:
        image: tensorflow/tensorflow:1.4.1-py3
        build_steps:
          - pip3 install --no-cache-dir -U polyaxon-helper
      run:
        cmd: python3 model.py --batch_size={{ batch_size }} --num_steps={{ num_steps }} --learning_rate={{ learning_rate }} --dropout={{ dropout }} --num_epochs={{ num_epochs }} --activation={{ activation }}

      $ polyaxon run -f polyaxon_gridsearch.yml
      New version of CLI (0.3.5) is now available. To upgrade run:
          pip install -U polyaxon-cli
      Creating an experiment group with the following definition:
      ----------------  -----------------
      Search algorithm  grid
      Concurrency       5 concurrent runs
      Early stopping    deactivated
      ----------------  -----------------
      Experiment group 1 was created
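As a rough illustration of what the `matrix` section on this slide expands to, the sketch below enumerates the grid in plain Python. It assumes `linspace: 0.001:0.1:5` means start:stop:count in the NumPy sense (5 evenly spaced values, endpoints included). The helper names `linspace` and `expand_grid` are made up for this example and are not part of Polyaxon.

```python
import itertools
import math


def linspace(start, stop, num):
    """Return `num` evenly spaced values from start to stop, inclusive."""
    if num == 1:
        return [start]
    step = (stop - start) / (num - 1)
    return [start + i * step for i in range(num)]


def expand_grid(matrix):
    """Cartesian product of all parameter values, one dict per experiment."""
    names = list(matrix)
    return [dict(zip(names, combo))
            for combo in itertools.product(*(matrix[n] for n in names))]


# Values taken from the slide's matrix section.
matrix = {
    "learning_rate": linspace(0.001, 0.1, 5),  # assumed start:stop:count
    "dropout": [0.25, 0.3],
    "activation": ["relu", "sigmoid"],
}

grid = expand_grid(matrix)
concurrency = 5  # hptuning.concurrency on the slide
waves = math.ceil(len(grid) / concurrency)

print(f"{len(grid)} experiments, at most {concurrency} at a time, "
      f"so at least {waves} waves")
```

Under that reading the grid has 5 x 2 x 2 = 20 experiments, so a concurrency of 5 needs at least 4 waves to finish the group.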
  • 11. Dashboard ~ Metrics Visualization
  • 12. Agenda - Why do we need Polyaxon? - What is Polyaxon? - How does Polyaxon work? - Demo - Summary
  • 13. Hyperparameter tuning with a single machine [Diagram: a machine with many CPU cores and the data in memory; each core trains one parameter set (A, B, C, ..., X)] For instance, scikit-learn's GridSearchCV lets us run experiments in parallel. However, the number of processes is bounded by the number of CPU cores: a machine with 64 CPU cores can run at most 64 trainings concurrently.
  • 14. Hyperparameter tuning of Polyaxon [Diagram: training code is uploaded to Polyaxon core on k8s, which builds it and schedules one pod per parameter set (A to F) across Nodes A, B, and C] The more nodes the k8s cluster has, the more training processes can run at once. Parallelism is no longer constrained by a single machine.
  • 15. Even one experiment can be shorter [Diagram: timeline (t) of Experiments then Evaluation, single machine vs Polyaxon; Polyaxon reduces training time] By leveraging multiple nodes on k8s, we can shorten a single experiment by running it with high concurrency.
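To make the comparison between one many-core machine and a multi-node cluster concrete, here is a minimal back-of-the-envelope sketch. It assumes one training run per CPU core, runs executed in full waves, and no scheduling, image build, or node start-up overhead. The workload figures and the function name `wall_clock_minutes` are hypothetical.

```python
import math


def wall_clock_minutes(num_param_sets, minutes_per_run, parallel_slots):
    """Estimate elapsed time when runs are executed in full waves."""
    waves = math.ceil(num_param_sets / parallel_slots)
    return waves * minutes_per_run


# Hypothetical workload: 640 parameter sets, 20 minutes each.
num_param_sets = 640
minutes_per_run = 20

# One 64-core machine versus ten 64-core nodes (one run per core assumed).
single_machine = wall_clock_minutes(num_param_sets, minutes_per_run, 64)
ten_nodes = wall_clock_minutes(num_param_sets, minutes_per_run, 64 * 10)

print(f"one 64-core machine: {single_machine} min")
print(f"ten 64-core nodes:   {ten_nodes} min")
```

With these numbers the single machine needs 10 waves (200 minutes) while ten nodes finish in one wave (20 minutes).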
  • 16. Auto-scalable & preemptible node pool with polyaxon Polyaxon on k8s Node pool for polyaxon core Node Node Node Node pool for experiments
  • 17. Auto-scalable & preemptible node pool with polyaxon Polyaxon on k8s Node pool for polyaxon core Node Node Node Node pool for experiments Training Code of experiment X Upload & run Concurrency: - 100 Requests: - CPU: 1 - Memory: 2GB
  • 18. Auto-scalable & preemptible node pool with polyaxon Polyaxon on k8s Node pool for polyaxon core Node Node Node Node pool for experiments Preemptible node Preemptible node Preemptible node Training Code of experiment X Upload & run Concurrency: - 100 Requests: - CPU: 1 - Memory: 2GB Automatically launch new preemptible instances
  • 19. Auto-scalable & preemptible node pool with polyaxon Polyaxon on k8s Node pool for polyaxon core Node Node Node Node pool for experiments Training Code of experiment Y Upload & run Concurrency: - 50 Requests: - CPU: 1 - Memory: 1GB Preemptible node Preemptible node Preemptible node
  • 20. Auto-scalable & preemptible node pool with polyaxon Polyaxon on k8s Node pool for polyaxon core Node Node Node Node pool for experiments Preemptible node Preemptible node Preemptible node Preemptible node Preemptible node Training Code of experiment Y Upload & run Concurrency: - 50 Requests: - CPU: 1 - Memory: 1GB
  • 21. Preemptible instance/GPU/TPU pricing Preemptible instance Preemptible GPU Preemptible TPU
  • 22. Regular instance cost vs preemptible instance cost [Chart: running cost over time t, regular instance vs preemptible instance, showing the reduced cost] - We can reduce the cost of training models by leveraging preemptible instances with Polyaxon. - Polyaxon enables us to use a preemptible node pool for experiments. - Since Polyaxon on GKE scales the node pool automatically, we don't need to keep static instances for experiments.
  • 23. Multiple experiments with a single machine [Diagram: timeline (t) with three Experiments/Evaluation cycles run one after another] Running experiments sequentially on a single machine takes longer, because Python ML libraries like sklearn basically do not scale beyond one machine.
  • 24. Multiple experiments with Polyaxon [Diagram: timeline (t) with three Experiments/Evaluation cycles run side by side] Polyaxon enables us to easily run multiple experiments in parallel on k8s. We don't need to wait for each experiment to finish before moving on to the next one. We can shorten the total experiment time through Polyaxon's parallelism.
  • 25. Sequential experiments cost vs parallel experiments cost [Chart: running cost and labor cost over time t, sequential experiments vs experiments in parallel, showing the reduced cost] - In principle, the instance cost should be the same, since the cost of CPU usage is linear in running time. - However, we should not overlook labor costs during experiments. Waiting for experiments wastes time and money. Time is money! - We can reduce the total cost by shortening the total experiment time.
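The argument on this slide can be sketched as a small cost model. This is only an illustration: it assumes instances are billed only while they run an experiment and that labor cost grows linearly with wall-clock time. Every number below, and the function name `experiment_costs`, is hypothetical rather than taken from the slides.

```python
def experiment_costs(num_experiments, hours_each, machines, machine_price, labor_rate):
    """Return (instance_cost, labor_cost, wall_clock_hours)."""
    waves = -(-num_experiments // machines)  # ceiling division
    wall_clock_hours = waves * hours_each
    # Assumption: machines are billed only while busy, so the instance
    # cost depends on total machine-hours, not on how they are arranged.
    machine_hours = num_experiments * hours_each
    instance_cost = machine_hours * machine_price
    # Assumption: someone is waiting for results the whole time.
    labor_cost = wall_clock_hours * labor_rate
    return instance_cost, labor_cost, wall_clock_hours


# Hypothetical figures: 3 experiments of 2 hours each,
# 1.0 per machine-hour, 50.0 per hour of engineer time.
sequential = experiment_costs(3, 2, machines=1, machine_price=1.0, labor_rate=50.0)
parallel = experiment_costs(3, 2, machines=3, machine_price=1.0, labor_rate=50.0)

print("sequential (instance, labor, hours):", sequential)
print("parallel   (instance, labor, hours):", parallel)
```

In this model the instance cost is identical in both cases, while the labor cost shrinks in proportion to the shorter wall-clock time.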
  • 26. Power of multiple preemptible nodes - Cost of 10 preemptible n1-standard-64 nodes for 2 hours, giving a concurrency of 640. It should be cheap! - $12.8 = $0.64 * 10 * 2 - Even if one parameter set of a training job takes about 20 minutes, we can run about 3840 parameter sets in just 2 hours at this cost. We can achieve the objectives: - Make the lead time of experiments as short as possible. - Make the financial cost to train models as cheap as possible.
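The arithmetic on this slide can be reproduced in a few lines. The price, node count, duration, and run time are the figures quoted on the slide; the assumption that each of the 64 cores per node runs one training job at a time is implied by the stated concurrency of 640. Preempted or restarted runs are ignored.

```python
price_per_node_hour = 0.64  # USD, preemptible n1-standard-64, as quoted on the slide
nodes = 10
hours = 2
cores_per_node = 64         # n1-standard-64
minutes_per_run = 20        # one parameter set, as quoted on the slide

total_cost = price_per_node_hour * nodes * hours
concurrency = nodes * cores_per_node          # one run per core (assumption)
rounds = (hours * 60) // minutes_per_run      # back-to-back runs per core
param_sets = concurrency * rounds

print(f"cost: ${total_cost:.2f}")
print(f"concurrency: {concurrency}")
print(f"parameter sets in {hours} hours: {param_sets}")
```

This gives 6 rounds of 640 concurrent runs, which matches the slide's 3840 parameter sets for about $12.80.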
  • 27. Agenda - Why do we need Polyaxon? - What is Polyaxon? - How does Polyaxon work? - Demo - Summary
  • 28. Demo - Notebook - Job - Experiment - Hyperparameter tuning at scale
  • 29. Summary - We can definitely achieve the objectives with Polyaxon on GKE. - Make the lead time of experiments as short as possible. - Make the financial cost to train models as cheap as possible. - Make the experiments reproducible. - All we ML engineers have to do is: - Write the training code in Python as usual, and - Define the YAML files to run experiments. - What's next? - Supporting preemptible GPUs / TPUs.
  • 30. Appendix A: Links - Polyaxon - https://polyaxon.com/ - Documentation - https://docs.polyaxon.com/ - Examples - https://github.com/polyaxon/polyaxon-quick-start - https://github.com/polyaxon/deep-learning-with-python-notebooks-on-polyaxon - https://github.com/polyaxon/polyaxon-examples