SlideShare uma empresa Scribd logo
1 de 30
Baixar para ler offline
PIPELINE.AI: HIGH PERFORMANCE MODEL
TRAINING & SERVING WITH GPUS…
…AND AWS SAGEMAKER, GOOGLE CLOUD ML,
AZURE ML & KUBERNETES!
CHRIS FREGLY
FOUNDER @ PIPELINE.AI
RECENT PIPELINE.AI NEWS
Sept 2017
Dec 2017
INTRODUCTIONS: ME
§ Chris Fregly, Founder & Engineer @PipelineAI
§ Formerly Netflix, Databricks, IBM Spark Tech
§ Advanced Spark and TensorFlow Meetup
§ Please Join Our 60,000+ Global Members!!
Contact Me
chris@pipeline.ai
@cfregly
Global Locations
* San Francisco
* Chicago
* Austin
* Washington DC
* Dusseldorf
* London
INTRODUCTIONS: YOU
§ Software Engineer, Data Scientist, Data Engineer, Data Analyst
§ Interested in Optimizing and Deploying TF Models to Production
§ Nice to Have a Working Knowledge of TensorFlow (Not Required)
PIPELINE.AI IS 100% OPEN SOURCE
§ https://github.com/PipelineAI/pipeline/
§ Please Star 🌟 this GitHub Repo!
§ Some VC’s Value GitHub Stars @ $15,000 Each (?!)
PIPELINE.AI OVERVIEW
450,000 Docker Downloads
60,000 Users Registered for GA
60,000 Meetup Members
40,000 LinkedIn Followers
2,200 GitHub Stars
12 Enterprise Beta Users
WHY HEAVY FOCUS ON MODEL SERVING?
Model Training
Batch & Boring
Offline in Research Lab
Pipeline Ends at Training
No Insight into Live Production
Small Number of Data Scientists
Optimizations Very Well-Known
Real-Time & Exciting!!
Online in Live Production
Pipeline Extends into Production
Continuous Insight into Live Production
Huuuuuuge Number of Application Users
**Many Optimizations Not Yet Utilized
<<<
Model Serving
100’s Training Jobs per Day 1,000,000’s Predictions per Sec
AGENDA
§ Deploy and Tune Models + Runtimes Safely in Prod
§ Compare Models Both Offline and Online
§ Auto-Shift Traffic to Winning Model or Cloud
§ Live, Continuous Model Training in Production
PACKAGE MODEL + RUNTIME AS ONE
§ Build Model with Runtime into Immutable Docker Image
§ Emphasize Immutable Deployment and Infrastructure
§ Same Runtime Dependencies in All Environments
§ Local, Development, Staging, Production
§ No Library or Dependency Surprises
§ Deploy and Tune Model + Runtime Together
pipeline predict-server-build --model-type=tensorflow 
--model-name=mnist 
--model-tag=A 
--model-path=./models/tensorflow/mnist/
Build Local
Model Server A
LOAD TEST LOCAL MODEL + RUNTIME
§ Perform Mini-Load Test on Local Model Server
§ Immediate, Local Prediction Performance Metrics
§ Compare to Previous Model + Runtime Variations
pipeline predict-server-start --model-type=tensorflow 
--model-name=mnist 
--model-tag=A
pipeline predict --model-endpoint-url=http://localhost:8080 
--test-request-path=test_request.json 
--test-request-concurrency=1000
Load Test Local
Model Server A
Start Local
Model Server A
PUSH IMAGE TO DOCKER REGISTRY
§ Supports All Public + Private Docker Registries
§ DockerHub, Artifactory, Quay, AWS, Google, …
§ Or Self-Hosted, Private Docker Registry
pipeline predict-server-push --image-registry-url=<your-registry> 
--image-registry-repo=<your-repo> 
--model-type=tensorflow 
--model-name=mnist 
--model-tag=A
Push Image To
Docker Registry
CLOUD-BASED OPTIONS
§ AWS SageMaker
§ Released Nov 2017 @ Re-invent
§ Custom Docker Images for Training & Serving ie. PipelineAI Images
§ Distributed TensorFlow Training through Estimator API
§ Traffic Splitting for A/B Model Testing
§ Google Cloud ML Engine
§ Mostly Command-Line Based
§ Driving TensorFlow Open Source API (ie. Experiment API)
§ Azure ML
TUNE MODEL + RUNTIME AS SINGLE UNIT
§ Model Training Optimizations
§ Model Hyper-Parameters (ie. Learning Rate)
§ Reduced Precision (ie. FP16 Half Precision)
§ Post-Training Model Optimizations
§ Quantize Model Weights + Activations From 32-bit to 8-bit
§ Fuse Neural Network Layers Together
§ Model Runtime Optimizations
§ Runtime Configs (ie. Request Batch Size)
§ Different Runtimes (ie. TensorFlow Lite, Nvidia TensorRT)
POST-TRAINING OPTIMIZATIONS
§ Prepare Model for Serving
§ Simplify Network
§ Reduce Model Size
§ Quantize for Fast Matrix Math
§ Some Tools
§ Graph Transform Tool (GTT)
§ tfcompile
After Training
After
Optimizing!
pipeline optimize --optimization-list=[quantize_weights, tfcompile] 
--model-type=tensorflow 
--model-name=mnist 
--model-tag=A 
--model-path=./tensorflow/mnist/model 
--output-path=./tensorflow/mnist/optimized_model
Linear
Regression
RUNTIME OPTION: TENSORFLOW LITE
§ Post-Training Model Optimizations
§ Currently Supports iOS and Android
§ On-Device Prediction Runtime
§ Low-Latency, Fast Startup
§ Selective Operator Loading
§ 70KB Min - 300KB Max Runtime Footprint
§ Supports Accelerators (GPU, TPU)
§ Falls Back to CPU without Accelerator
§ Java and C++ APIs
RUNTIME OPTION: NVIDIA TENSOR-RT
§ Post-Training Model Optimizations
§ Specific to Nvidia GPU
§ GPU-Optimized Prediction Runtime
§ Alternative to TensorFlow Serving
§ PipelineAI Supports TensorRT!
DEPLOY MODELS SAFELY TO PROD
§ Deploy from CLI or Jupyter Notebook
§ Tear-Down or Rollback Models Quickly
§ Shadow Canary Deploy: ie.20% Live Traffic
§ Split Canary Deploy: ie. 97-2-1% Live Traffic
pipeline predict-cluster-start --model-runtime=tflite 
--model-type=tensorflow 
--model-name=mnist 
--model-tag=B 
--traffic-split=2
Start Production
Model Cluster B
pipeline predict-cluster-start --model-runtime=tensorrt 
--model-type=tensorflow 
--model-name=mnist 
--model-tag=C 
--traffic-split=1
Start Production
Model Cluster C
pipeline predict-cluster-start --model-runtime=tfserving_gpu 
--model-type=tensorflow 
--model-name=mnist 
--model-tag=A 
--traffic-split=97
Start Production
Model Cluster A
AGENDA
§ Deploy and Tune Models + Runtimes Safely in Prod
§ Compare Models Both Offline and Online
§ Auto-Shift Traffic to Winning Model or Cloud
§ Live, Continuous Model Training in Production
COMPARE MODELS OFFLINE & ONLINE
§ Offline, Batch Metrics
§ Validation + Training Accuracy
§ CPU + GPU Utilization
§ Live Prediction Values
§ Compare Relative Precision
§ Newly-Seen, Streaming Data
§ Online, Real-Time Metrics
§ Response Time, Throughput
§ Cost ($) Per Prediction
VIEW REAL-TIME PREDICTION STREAM
§ Visually Compare Real-Time Predictions
Prediction
Inputs
Prediction
Results &
Confidences
Model B Model CModel A
PREDICTION PROFILING AND TUNING
§ Pinpoint Performance Bottlenecks
§ Fine-Grained Prediction Metrics
§ 3 Steps in Real-Time Prediction
1. transform_request()
2. predict()
3. transform_response()
AGENDA
§ Deploy and Tune Models + Runtimes Safely in Prod
§ Compare Models Both Offline and Online
§ Auto-Shift Traffic to Winning Model or Cloud
§ Live, Continuous Model Training in Production
LIVE, ADAPTIVE TRAFFIC ROUTING
§ A/B Tests
§ Inflexible and Boring
§ Multi-Armed Bandits
§ Adaptive and Exciting!
pipeline traffic-router-split --model-type=tensorflow 
--model-name=mnist 
--model-tag-list=[A,B,C] 
--model-weight-list=[1,2,97]
Adjust
Traffic Routing
Dynamically
SHIFT TRAFFIC TO MAX(REVENUE)
§ Shift Traffic to Winning Model using AI Bandit Algos
SHIFT TRAFFIC TO MIN(CLOUD CO$T)
§ Based on Cost ($) Per Prediction
§ Cost Changes Throughout Day
§ Lose AWS Spot Instances
§ Google Cloud Becomes Cheaper
§ Shift Across Clouds & On-Prem
AGENDA
§ Deploy and Tune Models + Runtimes Safely in Prod
§ Compare Models Both Offline and Online
§ Auto-Shift Traffic to Winning Model or Cloud
§ Live, Continuous Model Training in Production
LIVE, CONTINUOUS MODEL TRAINING
§ The Holy Grail of Machine Learning
§ Q1 2018: PipelineAI Supports Continuous Model Training!
§ Kafka, Kinesis
§ Spark Streaming
PSEUDO-CONTINUOUS TRAINING
§ Identify and Fix Borderline Predictions (~50-50% Confidence)
§ Fix Along Class Boundaries
§ Retrain Newly-Labeled Data
§ Game-ify Labeling Process
§ Enable Crowd Sourcing
DEMO: TRAIN, DEPLOY, TEST MODEL
§ https://github.com/PipelineAI/pipeline/
§ Please Star 🌟 this GitHub Repo!
pipeline predict-server-build --model-type=tensorflow 
--model-name=mnist 
--model-tag=A 
--model-path=./models/tensorflow/mnist/
THANK YOU!!
§ https://github.com/PipelineAI/pipeline/
§ Please Star 🌟 this GitHub Repo!
§ Reminder: VC’s Value GitHub Stars @ $15,000 Each (!!)
Contact Me
chris@pipeline.ai
@cfregly

Mais conteúdo relacionado

Mais procurados

Optimize + Deploy Distributed Tensorflow, Spark, and Scikit-Learn Models on G...
Optimize + Deploy Distributed Tensorflow, Spark, and Scikit-Learn Models on G...Optimize + Deploy Distributed Tensorflow, Spark, and Scikit-Learn Models on G...
Optimize + Deploy Distributed Tensorflow, Spark, and Scikit-Learn Models on G...Chris Fregly
 
High Performance TensorFlow in Production -- Sydney ML / AI Train Workshop @ ...
High Performance TensorFlow in Production -- Sydney ML / AI Train Workshop @ ...High Performance TensorFlow in Production -- Sydney ML / AI Train Workshop @ ...
High Performance TensorFlow in Production -- Sydney ML / AI Train Workshop @ ...Chris Fregly
 
High Performance Distributed TensorFlow with GPUs - TensorFlow Chicago Meetup...
High Performance Distributed TensorFlow with GPUs - TensorFlow Chicago Meetup...High Performance Distributed TensorFlow with GPUs - TensorFlow Chicago Meetup...
High Performance Distributed TensorFlow with GPUs - TensorFlow Chicago Meetup...Chris Fregly
 
High Performance Distributed TensorFlow in Production with GPUs - NIPS 2017 -...
High Performance Distributed TensorFlow in Production with GPUs - NIPS 2017 -...High Performance Distributed TensorFlow in Production with GPUs - NIPS 2017 -...
High Performance Distributed TensorFlow in Production with GPUs - NIPS 2017 -...Chris Fregly
 
Optimizing, Profiling, and Deploying TensorFlow AI Models in Production with ...
Optimizing, Profiling, and Deploying TensorFlow AI Models in Production with ...Optimizing, Profiling, and Deploying TensorFlow AI Models in Production with ...
Optimizing, Profiling, and Deploying TensorFlow AI Models in Production with ...Chris Fregly
 
PipelineAI Real-Time Machine Learning - Global Artificial Intelligence Confer...
PipelineAI Real-Time Machine Learning - Global Artificial Intelligence Confer...PipelineAI Real-Time Machine Learning - Global Artificial Intelligence Confer...
PipelineAI Real-Time Machine Learning - Global Artificial Intelligence Confer...Chris Fregly
 
PipelineAI Continuous Machine Learning and AI - Rework Deep Learning Summit -...
PipelineAI Continuous Machine Learning and AI - Rework Deep Learning Summit -...PipelineAI Continuous Machine Learning and AI - Rework Deep Learning Summit -...
PipelineAI Continuous Machine Learning and AI - Rework Deep Learning Summit -...Chris Fregly
 
High Performance Distributed TensorFlow with GPUs - NYC Workshop - July 9 2017
High Performance Distributed TensorFlow with GPUs - NYC Workshop - July 9 2017High Performance Distributed TensorFlow with GPUs - NYC Workshop - July 9 2017
High Performance Distributed TensorFlow with GPUs - NYC Workshop - July 9 2017Chris Fregly
 
Spark SQL Catalyst Optimizer, Custom Expressions, UDFs - Advanced Spark and T...
Spark SQL Catalyst Optimizer, Custom Expressions, UDFs - Advanced Spark and T...Spark SQL Catalyst Optimizer, Custom Expressions, UDFs - Advanced Spark and T...
Spark SQL Catalyst Optimizer, Custom Expressions, UDFs - Advanced Spark and T...Chris Fregly
 
High Performance Distributed TensorFlow with GPUs - Nvidia GPU Tech Conferenc...
High Performance Distributed TensorFlow with GPUs - Nvidia GPU Tech Conferenc...High Performance Distributed TensorFlow with GPUs - Nvidia GPU Tech Conferenc...
High Performance Distributed TensorFlow with GPUs - Nvidia GPU Tech Conferenc...Chris Fregly
 
Optimize + Deploy Distributed Tensorflow, Spark, and Scikit-Learn Models on GPUs
Optimize + Deploy Distributed Tensorflow, Spark, and Scikit-Learn Models on GPUsOptimize + Deploy Distributed Tensorflow, Spark, and Scikit-Learn Models on GPUs
Optimize + Deploy Distributed Tensorflow, Spark, and Scikit-Learn Models on GPUsChris Fregly
 
Quest for the Perfect Workflow for McrFRED
Quest for the Perfect Workflow for McrFREDQuest for the Perfect Workflow for McrFRED
Quest for the Perfect Workflow for McrFREDAndi Smith
 
Deep Learning - Continuous Operations
Deep Learning - Continuous Operations Deep Learning - Continuous Operations
Deep Learning - Continuous Operations Haggai Philip Zagury
 
Migrating to a Bazel-based CI System: 6 Learnings - Or Shachar
Migrating to a Bazel-based CI System: 6 Learnings - Or ShacharMigrating to a Bazel-based CI System: 6 Learnings - Or Shachar
Migrating to a Bazel-based CI System: 6 Learnings - Or ShacharWix Engineering
 
Mitchell Hashimoto, HashiCorp
Mitchell Hashimoto, HashiCorpMitchell Hashimoto, HashiCorp
Mitchell Hashimoto, HashiCorpOntico
 
Ansible & Cumulus Networks - Simplify Network Automation
Ansible & Cumulus Networks - Simplify Network AutomationAnsible & Cumulus Networks - Simplify Network Automation
Ansible & Cumulus Networks - Simplify Network AutomationCumulus Networks
 
Optimizing Application Performance on Kubernetes
Optimizing Application Performance on KubernetesOptimizing Application Performance on Kubernetes
Optimizing Application Performance on KubernetesDinakar Guniguntala
 
Migrating to a bazel based CI system: 6 learnings
Migrating to a bazel based CI system: 6 learnings Migrating to a bazel based CI system: 6 learnings
Migrating to a bazel based CI system: 6 learnings Or Shachar
 
Apache Camel - FUSE community day London 2010 presentation
Apache Camel - FUSE community day London 2010 presentationApache Camel - FUSE community day London 2010 presentation
Apache Camel - FUSE community day London 2010 presentationClaus Ibsen
 
Using Apache Camel as AKKA
Using Apache Camel as AKKAUsing Apache Camel as AKKA
Using Apache Camel as AKKAJohan Edstrom
 

Mais procurados (20)

Optimize + Deploy Distributed Tensorflow, Spark, and Scikit-Learn Models on G...
Optimize + Deploy Distributed Tensorflow, Spark, and Scikit-Learn Models on G...Optimize + Deploy Distributed Tensorflow, Spark, and Scikit-Learn Models on G...
Optimize + Deploy Distributed Tensorflow, Spark, and Scikit-Learn Models on G...
 
High Performance TensorFlow in Production -- Sydney ML / AI Train Workshop @ ...
High Performance TensorFlow in Production -- Sydney ML / AI Train Workshop @ ...High Performance TensorFlow in Production -- Sydney ML / AI Train Workshop @ ...
High Performance TensorFlow in Production -- Sydney ML / AI Train Workshop @ ...
 
High Performance Distributed TensorFlow with GPUs - TensorFlow Chicago Meetup...
High Performance Distributed TensorFlow with GPUs - TensorFlow Chicago Meetup...High Performance Distributed TensorFlow with GPUs - TensorFlow Chicago Meetup...
High Performance Distributed TensorFlow with GPUs - TensorFlow Chicago Meetup...
 
High Performance Distributed TensorFlow in Production with GPUs - NIPS 2017 -...
High Performance Distributed TensorFlow in Production with GPUs - NIPS 2017 -...High Performance Distributed TensorFlow in Production with GPUs - NIPS 2017 -...
High Performance Distributed TensorFlow in Production with GPUs - NIPS 2017 -...
 
Optimizing, Profiling, and Deploying TensorFlow AI Models in Production with ...
Optimizing, Profiling, and Deploying TensorFlow AI Models in Production with ...Optimizing, Profiling, and Deploying TensorFlow AI Models in Production with ...
Optimizing, Profiling, and Deploying TensorFlow AI Models in Production with ...
 
PipelineAI Real-Time Machine Learning - Global Artificial Intelligence Confer...
PipelineAI Real-Time Machine Learning - Global Artificial Intelligence Confer...PipelineAI Real-Time Machine Learning - Global Artificial Intelligence Confer...
PipelineAI Real-Time Machine Learning - Global Artificial Intelligence Confer...
 
PipelineAI Continuous Machine Learning and AI - Rework Deep Learning Summit -...
PipelineAI Continuous Machine Learning and AI - Rework Deep Learning Summit -...PipelineAI Continuous Machine Learning and AI - Rework Deep Learning Summit -...
PipelineAI Continuous Machine Learning and AI - Rework Deep Learning Summit -...
 
High Performance Distributed TensorFlow with GPUs - NYC Workshop - July 9 2017
High Performance Distributed TensorFlow with GPUs - NYC Workshop - July 9 2017High Performance Distributed TensorFlow with GPUs - NYC Workshop - July 9 2017
High Performance Distributed TensorFlow with GPUs - NYC Workshop - July 9 2017
 
Spark SQL Catalyst Optimizer, Custom Expressions, UDFs - Advanced Spark and T...
Spark SQL Catalyst Optimizer, Custom Expressions, UDFs - Advanced Spark and T...Spark SQL Catalyst Optimizer, Custom Expressions, UDFs - Advanced Spark and T...
Spark SQL Catalyst Optimizer, Custom Expressions, UDFs - Advanced Spark and T...
 
High Performance Distributed TensorFlow with GPUs - Nvidia GPU Tech Conferenc...
High Performance Distributed TensorFlow with GPUs - Nvidia GPU Tech Conferenc...High Performance Distributed TensorFlow with GPUs - Nvidia GPU Tech Conferenc...
High Performance Distributed TensorFlow with GPUs - Nvidia GPU Tech Conferenc...
 
Optimize + Deploy Distributed Tensorflow, Spark, and Scikit-Learn Models on GPUs
Optimize + Deploy Distributed Tensorflow, Spark, and Scikit-Learn Models on GPUsOptimize + Deploy Distributed Tensorflow, Spark, and Scikit-Learn Models on GPUs
Optimize + Deploy Distributed Tensorflow, Spark, and Scikit-Learn Models on GPUs
 
Quest for the Perfect Workflow for McrFRED
Quest for the Perfect Workflow for McrFREDQuest for the Perfect Workflow for McrFRED
Quest for the Perfect Workflow for McrFRED
 
Deep Learning - Continuous Operations
Deep Learning - Continuous Operations Deep Learning - Continuous Operations
Deep Learning - Continuous Operations
 
Migrating to a Bazel-based CI System: 6 Learnings - Or Shachar
Migrating to a Bazel-based CI System: 6 Learnings - Or ShacharMigrating to a Bazel-based CI System: 6 Learnings - Or Shachar
Migrating to a Bazel-based CI System: 6 Learnings - Or Shachar
 
Mitchell Hashimoto, HashiCorp
Mitchell Hashimoto, HashiCorpMitchell Hashimoto, HashiCorp
Mitchell Hashimoto, HashiCorp
 
Ansible & Cumulus Networks - Simplify Network Automation
Ansible & Cumulus Networks - Simplify Network AutomationAnsible & Cumulus Networks - Simplify Network Automation
Ansible & Cumulus Networks - Simplify Network Automation
 
Optimizing Application Performance on Kubernetes
Optimizing Application Performance on KubernetesOptimizing Application Performance on Kubernetes
Optimizing Application Performance on Kubernetes
 
Migrating to a bazel based CI system: 6 learnings
Migrating to a bazel based CI system: 6 learnings Migrating to a bazel based CI system: 6 learnings
Migrating to a bazel based CI system: 6 learnings
 
Apache Camel - FUSE community day London 2010 presentation
Apache Camel - FUSE community day London 2010 presentationApache Camel - FUSE community day London 2010 presentation
Apache Camel - FUSE community day London 2010 presentation
 
Using Apache Camel as AKKA
Using Apache Camel as AKKAUsing Apache Camel as AKKA
Using Apache Camel as AKKA
 

Semelhante a PipelineAI + AWS SageMaker + Distributed TensorFlow + AI Model Training and Serving - December 2017 - NIPS Conference - LA Big Data and Python Meetups

High Performance Distributed TensorFlow with GPUs and Kubernetes
High Performance Distributed TensorFlow with GPUs and KubernetesHigh Performance Distributed TensorFlow with GPUs and Kubernetes
High Performance Distributed TensorFlow with GPUs and Kubernetesinside-BigData.com
 
TensorFlow meetup: Keras - Pytorch - TensorFlow.js
TensorFlow meetup: Keras - Pytorch - TensorFlow.jsTensorFlow meetup: Keras - Pytorch - TensorFlow.js
TensorFlow meetup: Keras - Pytorch - TensorFlow.jsStijn Decubber
 
Tooling for Machine Learning: AWS Products, Open Source Tools, and DevOps Pra...
Tooling for Machine Learning: AWS Products, Open Source Tools, and DevOps Pra...Tooling for Machine Learning: AWS Products, Open Source Tools, and DevOps Pra...
Tooling for Machine Learning: AWS Products, Open Source Tools, and DevOps Pra...SQUADEX
 
Cloud Native Applications on OpenShift
Cloud Native Applications on OpenShiftCloud Native Applications on OpenShift
Cloud Native Applications on OpenShiftSerhat Dirik
 
MongoDB World 2019: Why NBCUniversal Migrated to MongoDB Atlas
MongoDB World 2019: Why NBCUniversal Migrated to MongoDB AtlasMongoDB World 2019: Why NBCUniversal Migrated to MongoDB Atlas
MongoDB World 2019: Why NBCUniversal Migrated to MongoDB AtlasMongoDB
 
Apache Samza 1.0 - What's New, What's Next
Apache Samza 1.0 - What's New, What's NextApache Samza 1.0 - What's New, What's Next
Apache Samza 1.0 - What's New, What's NextPrateek Maheshwari
 
Using Databases and Containers From Development to Deployment
Using Databases and Containers  From Development to DeploymentUsing Databases and Containers  From Development to Deployment
Using Databases and Containers From Development to DeploymentAerospike, Inc.
 
A real-life account of moving 100% to a public cloud
A real-life account of moving 100% to a public cloudA real-life account of moving 100% to a public cloud
A real-life account of moving 100% to a public cloudJulien SIMON
 
AWS Webcast - AWS OpsWorks Continuous Integration Demo
AWS Webcast - AWS OpsWorks Continuous Integration Demo  AWS Webcast - AWS OpsWorks Continuous Integration Demo
AWS Webcast - AWS OpsWorks Continuous Integration Demo Amazon Web Services
 
OS for AI: Elastic Microservices & the Next Gen of ML
OS for AI: Elastic Microservices & the Next Gen of MLOS for AI: Elastic Microservices & the Next Gen of ML
OS for AI: Elastic Microservices & the Next Gen of MLNordic APIs
 
Integrating Infrastructure as Code into a Continuous Delivery Pipeline | AWS ...
Integrating Infrastructure as Code into a Continuous Delivery Pipeline | AWS ...Integrating Infrastructure as Code into a Continuous Delivery Pipeline | AWS ...
Integrating Infrastructure as Code into a Continuous Delivery Pipeline | AWS ...Amazon Web Services
 
DataTalks.Club - Building Scalable End-to-End Deep Learning Pipelines in the ...
DataTalks.Club - Building Scalable End-to-End Deep Learning Pipelines in the ...DataTalks.Club - Building Scalable End-to-End Deep Learning Pipelines in the ...
DataTalks.Club - Building Scalable End-to-End Deep Learning Pipelines in the ...Rustem Feyzkhanov
 
Red Hat Agile integration workshop - Atlanta
Red Hat Agile integration workshop - AtlantaRed Hat Agile integration workshop - Atlanta
Red Hat Agile integration workshop - AtlantaJudy Breedlove
 
Agile integration workshop Atlanta
Agile integration workshop   AtlantaAgile integration workshop   Atlanta
Agile integration workshop AtlantaJeremy Davis
 
Low Latency Polyglot Model Scoring using Apache Apex
Low Latency Polyglot Model Scoring using Apache ApexLow Latency Polyglot Model Scoring using Apache Apex
Low Latency Polyglot Model Scoring using Apache ApexApache Apex
 
OSDC 2015: Mitchell Hashimoto | Automating the Modern Datacenter, Development...
OSDC 2015: Mitchell Hashimoto | Automating the Modern Datacenter, Development...OSDC 2015: Mitchell Hashimoto | Automating the Modern Datacenter, Development...
OSDC 2015: Mitchell Hashimoto | Automating the Modern Datacenter, Development...NETWAYS
 

Semelhante a PipelineAI + AWS SageMaker + Distributed TensorFlow + AI Model Training and Serving - December 2017 - NIPS Conference - LA Big Data and Python Meetups (20)

High Performance Distributed TensorFlow with GPUs and Kubernetes
High Performance Distributed TensorFlow with GPUs and KubernetesHigh Performance Distributed TensorFlow with GPUs and Kubernetes
High Performance Distributed TensorFlow with GPUs and Kubernetes
 
TensorFlow meetup: Keras - Pytorch - TensorFlow.js
TensorFlow meetup: Keras - Pytorch - TensorFlow.jsTensorFlow meetup: Keras - Pytorch - TensorFlow.js
TensorFlow meetup: Keras - Pytorch - TensorFlow.js
 
Tooling for Machine Learning: AWS Products, Open Source Tools, and DevOps Pra...
Tooling for Machine Learning: AWS Products, Open Source Tools, and DevOps Pra...Tooling for Machine Learning: AWS Products, Open Source Tools, and DevOps Pra...
Tooling for Machine Learning: AWS Products, Open Source Tools, and DevOps Pra...
 
Cloud Native Applications on OpenShift
Cloud Native Applications on OpenShiftCloud Native Applications on OpenShift
Cloud Native Applications on OpenShift
 
MongoDB World 2019: Why NBCUniversal Migrated to MongoDB Atlas
MongoDB World 2019: Why NBCUniversal Migrated to MongoDB AtlasMongoDB World 2019: Why NBCUniversal Migrated to MongoDB Atlas
MongoDB World 2019: Why NBCUniversal Migrated to MongoDB Atlas
 
Apache Samza 1.0 - What's New, What's Next
Apache Samza 1.0 - What's New, What's NextApache Samza 1.0 - What's New, What's Next
Apache Samza 1.0 - What's New, What's Next
 
Using Databases and Containers From Development to Deployment
Using Databases and Containers  From Development to DeploymentUsing Databases and Containers  From Development to Deployment
Using Databases and Containers From Development to Deployment
 
Twelve Factor App
Twelve Factor AppTwelve Factor App
Twelve Factor App
 
A real-life account of moving 100% to a public cloud
A real-life account of moving 100% to a public cloudA real-life account of moving 100% to a public cloud
A real-life account of moving 100% to a public cloud
 
NoSQL and ACID
NoSQL and ACIDNoSQL and ACID
NoSQL and ACID
 
AWS Webcast - AWS OpsWorks Continuous Integration Demo
AWS Webcast - AWS OpsWorks Continuous Integration Demo  AWS Webcast - AWS OpsWorks Continuous Integration Demo
AWS Webcast - AWS OpsWorks Continuous Integration Demo
 
SD Times - Docker v2
SD Times - Docker v2SD Times - Docker v2
SD Times - Docker v2
 
OS for AI: Elastic Microservices & the Next Gen of ML
OS for AI: Elastic Microservices & the Next Gen of MLOS for AI: Elastic Microservices & the Next Gen of ML
OS for AI: Elastic Microservices & the Next Gen of ML
 
Integrating Infrastructure as Code into a Continuous Delivery Pipeline | AWS ...
Integrating Infrastructure as Code into a Continuous Delivery Pipeline | AWS ...Integrating Infrastructure as Code into a Continuous Delivery Pipeline | AWS ...
Integrating Infrastructure as Code into a Continuous Delivery Pipeline | AWS ...
 
DataTalks.Club - Building Scalable End-to-End Deep Learning Pipelines in the ...
DataTalks.Club - Building Scalable End-to-End Deep Learning Pipelines in the ...DataTalks.Club - Building Scalable End-to-End Deep Learning Pipelines in the ...
DataTalks.Club - Building Scalable End-to-End Deep Learning Pipelines in the ...
 
TIAD : Automating the modern datacenter
TIAD : Automating the modern datacenterTIAD : Automating the modern datacenter
TIAD : Automating the modern datacenter
 
Red Hat Agile integration workshop - Atlanta
Red Hat Agile integration workshop - AtlantaRed Hat Agile integration workshop - Atlanta
Red Hat Agile integration workshop - Atlanta
 
Agile integration workshop Atlanta
Agile integration workshop   AtlantaAgile integration workshop   Atlanta
Agile integration workshop Atlanta
 
Low Latency Polyglot Model Scoring using Apache Apex
Low Latency Polyglot Model Scoring using Apache ApexLow Latency Polyglot Model Scoring using Apache Apex
Low Latency Polyglot Model Scoring using Apache Apex
 
OSDC 2015: Mitchell Hashimoto | Automating the Modern Datacenter, Development...
OSDC 2015: Mitchell Hashimoto | Automating the Modern Datacenter, Development...OSDC 2015: Mitchell Hashimoto | Automating the Modern Datacenter, Development...
OSDC 2015: Mitchell Hashimoto | Automating the Modern Datacenter, Development...
 

Mais de Chris Fregly

AWS reInvent 2022 reCap AI/ML and Data
AWS reInvent 2022 reCap AI/ML and DataAWS reInvent 2022 reCap AI/ML and Data
AWS reInvent 2022 reCap AI/ML and DataChris Fregly
 
Pandas on AWS - Let me count the ways.pdf
Pandas on AWS - Let me count the ways.pdfPandas on AWS - Let me count the ways.pdf
Pandas on AWS - Let me count the ways.pdfChris Fregly
 
Ray AI Runtime (AIR) on AWS - Data Science On AWS Meetup
Ray AI Runtime (AIR) on AWS - Data Science On AWS MeetupRay AI Runtime (AIR) on AWS - Data Science On AWS Meetup
Ray AI Runtime (AIR) on AWS - Data Science On AWS MeetupChris Fregly
 
Smokey and the Multi-Armed Bandit Featuring BERT Reynolds Updated
Smokey and the Multi-Armed Bandit Featuring BERT Reynolds UpdatedSmokey and the Multi-Armed Bandit Featuring BERT Reynolds Updated
Smokey and the Multi-Armed Bandit Featuring BERT Reynolds UpdatedChris Fregly
 
Amazon reInvent 2020 Recap: AI and Machine Learning
Amazon reInvent 2020 Recap:  AI and Machine LearningAmazon reInvent 2020 Recap:  AI and Machine Learning
Amazon reInvent 2020 Recap: AI and Machine LearningChris Fregly
 
Waking the Data Scientist at 2am: Detect Model Degradation on Production Mod...
Waking the Data Scientist at 2am:  Detect Model Degradation on Production Mod...Waking the Data Scientist at 2am:  Detect Model Degradation on Production Mod...
Waking the Data Scientist at 2am: Detect Model Degradation on Production Mod...Chris Fregly
 
Quantum Computing with Amazon Braket
Quantum Computing with Amazon BraketQuantum Computing with Amazon Braket
Quantum Computing with Amazon BraketChris Fregly
 
15 Tips to Scale a Large AI/ML Workshop - Both Online and In-Person
15 Tips to Scale a Large AI/ML Workshop - Both Online and In-Person15 Tips to Scale a Large AI/ML Workshop - Both Online and In-Person
15 Tips to Scale a Large AI/ML Workshop - Both Online and In-PersonChris Fregly
 
AWS Re:Invent 2019 Re:Cap
AWS Re:Invent 2019 Re:CapAWS Re:Invent 2019 Re:Cap
AWS Re:Invent 2019 Re:CapChris Fregly
 
KubeFlow + GPU + Keras/TensorFlow 2.0 + TF Extended (TFX) + Kubernetes + PyTo...
KubeFlow + GPU + Keras/TensorFlow 2.0 + TF Extended (TFX) + Kubernetes + PyTo...KubeFlow + GPU + Keras/TensorFlow 2.0 + TF Extended (TFX) + Kubernetes + PyTo...
KubeFlow + GPU + Keras/TensorFlow 2.0 + TF Extended (TFX) + Kubernetes + PyTo...Chris Fregly
 
Swift for TensorFlow - Tanmay Bakshi - Advanced Spark and TensorFlow Meetup -...
Swift for TensorFlow - Tanmay Bakshi - Advanced Spark and TensorFlow Meetup -...Swift for TensorFlow - Tanmay Bakshi - Advanced Spark and TensorFlow Meetup -...
Swift for TensorFlow - Tanmay Bakshi - Advanced Spark and TensorFlow Meetup -...Chris Fregly
 
Hands-on Learning with KubeFlow + Keras/TensorFlow 2.0 + TF Extended (TFX) + ...
Hands-on Learning with KubeFlow + Keras/TensorFlow 2.0 + TF Extended (TFX) + ...Hands-on Learning with KubeFlow + Keras/TensorFlow 2.0 + TF Extended (TFX) + ...
Hands-on Learning with KubeFlow + Keras/TensorFlow 2.0 + TF Extended (TFX) + ...Chris Fregly
 
Advanced Spark and TensorFlow Meetup - Dec 12 2017 - Dong Meng, MapR + Kubern...
Advanced Spark and TensorFlow Meetup - Dec 12 2017 - Dong Meng, MapR + Kubern...Advanced Spark and TensorFlow Meetup - Dec 12 2017 - Dong Meng, MapR + Kubern...
Advanced Spark and TensorFlow Meetup - Dec 12 2017 - Dong Meng, MapR + Kubern...Chris Fregly
 

Mais de Chris Fregly (13)

AWS reInvent 2022 reCap AI/ML and Data
AWS reInvent 2022 reCap AI/ML and DataAWS reInvent 2022 reCap AI/ML and Data
AWS reInvent 2022 reCap AI/ML and Data
 
Pandas on AWS - Let me count the ways.pdf
Pandas on AWS - Let me count the ways.pdfPandas on AWS - Let me count the ways.pdf
Pandas on AWS - Let me count the ways.pdf
 
Ray AI Runtime (AIR) on AWS - Data Science On AWS Meetup
Ray AI Runtime (AIR) on AWS - Data Science On AWS MeetupRay AI Runtime (AIR) on AWS - Data Science On AWS Meetup
Ray AI Runtime (AIR) on AWS - Data Science On AWS Meetup
 
Smokey and the Multi-Armed Bandit Featuring BERT Reynolds Updated
Smokey and the Multi-Armed Bandit Featuring BERT Reynolds UpdatedSmokey and the Multi-Armed Bandit Featuring BERT Reynolds Updated
Smokey and the Multi-Armed Bandit Featuring BERT Reynolds Updated
 
Amazon reInvent 2020 Recap: AI and Machine Learning
Amazon reInvent 2020 Recap:  AI and Machine LearningAmazon reInvent 2020 Recap:  AI and Machine Learning
Amazon reInvent 2020 Recap: AI and Machine Learning
 
Waking the Data Scientist at 2am: Detect Model Degradation on Production Mod...
Waking the Data Scientist at 2am:  Detect Model Degradation on Production Mod...Waking the Data Scientist at 2am:  Detect Model Degradation on Production Mod...
Waking the Data Scientist at 2am: Detect Model Degradation on Production Mod...
 
Quantum Computing with Amazon Braket
Quantum Computing with Amazon BraketQuantum Computing with Amazon Braket
Quantum Computing with Amazon Braket
 
15 Tips to Scale a Large AI/ML Workshop - Both Online and In-Person
15 Tips to Scale a Large AI/ML Workshop - Both Online and In-Person15 Tips to Scale a Large AI/ML Workshop - Both Online and In-Person
15 Tips to Scale a Large AI/ML Workshop - Both Online and In-Person
 
AWS Re:Invent 2019 Re:Cap
AWS Re:Invent 2019 Re:CapAWS Re:Invent 2019 Re:Cap
AWS Re:Invent 2019 Re:Cap
 
KubeFlow + GPU + Keras/TensorFlow 2.0 + TF Extended (TFX) + Kubernetes + PyTo...
KubeFlow + GPU + Keras/TensorFlow 2.0 + TF Extended (TFX) + Kubernetes + PyTo...KubeFlow + GPU + Keras/TensorFlow 2.0 + TF Extended (TFX) + Kubernetes + PyTo...
KubeFlow + GPU + Keras/TensorFlow 2.0 + TF Extended (TFX) + Kubernetes + PyTo...
 
Swift for TensorFlow - Tanmay Bakshi - Advanced Spark and TensorFlow Meetup -...
Swift for TensorFlow - Tanmay Bakshi - Advanced Spark and TensorFlow Meetup -...Swift for TensorFlow - Tanmay Bakshi - Advanced Spark and TensorFlow Meetup -...
Swift for TensorFlow - Tanmay Bakshi - Advanced Spark and TensorFlow Meetup -...
 
Hands-on Learning with KubeFlow + Keras/TensorFlow 2.0 + TF Extended (TFX) + ...
Hands-on Learning with KubeFlow + Keras/TensorFlow 2.0 + TF Extended (TFX) + ...Hands-on Learning with KubeFlow + Keras/TensorFlow 2.0 + TF Extended (TFX) + ...
Hands-on Learning with KubeFlow + Keras/TensorFlow 2.0 + TF Extended (TFX) + ...
 
Advanced Spark and TensorFlow Meetup - Dec 12 2017 - Dong Meng, MapR + Kubern...
Advanced Spark and TensorFlow Meetup - Dec 12 2017 - Dong Meng, MapR + Kubern...Advanced Spark and TensorFlow Meetup - Dec 12 2017 - Dong Meng, MapR + Kubern...
Advanced Spark and TensorFlow Meetup - Dec 12 2017 - Dong Meng, MapR + Kubern...
 

Último

WSO2CON 2024 - Does Open Source Still Matter?
WSO2CON 2024 - Does Open Source Still Matter?WSO2CON 2024 - Does Open Source Still Matter?
WSO2CON 2024 - Does Open Source Still Matter?WSO2
 
WSO2CON 2024 - WSO2's Digital Transformation Journey with Choreo: A Platforml...
WSO2CON 2024 - WSO2's Digital Transformation Journey with Choreo: A Platforml...WSO2CON 2024 - WSO2's Digital Transformation Journey with Choreo: A Platforml...
WSO2CON 2024 - WSO2's Digital Transformation Journey with Choreo: A Platforml...WSO2
 
%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...masabamasaba
 
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...masabamasaba
 
WSO2Con204 - Hard Rock Presentation - Keynote
WSO2Con204 - Hard Rock Presentation - KeynoteWSO2Con204 - Hard Rock Presentation - Keynote
WSO2Con204 - Hard Rock Presentation - KeynoteWSO2
 
WSO2CON 2024 Slides - Open Source to SaaS
WSO2CON 2024 Slides - Open Source to SaaSWSO2CON 2024 Slides - Open Source to SaaS
WSO2CON 2024 Slides - Open Source to SaaSWSO2
 
BUS PASS MANGEMENT SYSTEM USING PHP.pptx
BUS PASS MANGEMENT SYSTEM USING PHP.pptxBUS PASS MANGEMENT SYSTEM USING PHP.pptx
BUS PASS MANGEMENT SYSTEM USING PHP.pptxalwaysnagaraju26
 
Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024
Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024
Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024VictoriaMetrics
 
%in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park %in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park masabamasaba
 
Announcing Codolex 2.0 from GDK Software
Announcing Codolex 2.0 from GDK SoftwareAnnouncing Codolex 2.0 from GDK Software
Announcing Codolex 2.0 from GDK SoftwareJim McKeeth
 
Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...
Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...
Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...SelfMade bd
 
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrainmasabamasaba
 
%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...masabamasaba
 
Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...
Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...
Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...chiefasafspells
 
WSO2CON 2024 - API Management Usage at La Poste and Its Impact on Business an...
WSO2CON 2024 - API Management Usage at La Poste and Its Impact on Business an...WSO2CON 2024 - API Management Usage at La Poste and Its Impact on Business an...
WSO2CON 2024 - API Management Usage at La Poste and Its Impact on Business an...WSO2
 
Direct Style Effect Systems - The Print[A] Example - A Comprehension Aid
Direct Style Effect Systems -The Print[A] Example- A Comprehension AidDirect Style Effect Systems -The Print[A] Example- A Comprehension Aid
Direct Style Effect Systems - The Print[A] Example - A Comprehension AidPhilip Schwarz
 
MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...
MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...
MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...Jittipong Loespradit
 
What Goes Wrong with Language Definitions and How to Improve the Situation
What Goes Wrong with Language Definitions and How to Improve the SituationWhat Goes Wrong with Language Definitions and How to Improve the Situation
What Goes Wrong with Language Definitions and How to Improve the SituationJuha-Pekka Tolvanen
 
AI & Machine Learning Presentation Template
AI & Machine Learning Presentation TemplateAI & Machine Learning Presentation Template
AI & Machine Learning Presentation TemplatePresentation.STUDIO
 

Último (20)

Abortion Pill Prices Boksburg [(+27832195400*)] 🏥 Women's Abortion Clinic in ...
Abortion Pill Prices Boksburg [(+27832195400*)] 🏥 Women's Abortion Clinic in ...Abortion Pill Prices Boksburg [(+27832195400*)] 🏥 Women's Abortion Clinic in ...
Abortion Pill Prices Boksburg [(+27832195400*)] 🏥 Women's Abortion Clinic in ...
 
WSO2CON 2024 - Does Open Source Still Matter?
WSO2CON 2024 - Does Open Source Still Matter?WSO2CON 2024 - Does Open Source Still Matter?
WSO2CON 2024 - Does Open Source Still Matter?
 
WSO2CON 2024 - WSO2's Digital Transformation Journey with Choreo: A Platforml...
WSO2CON 2024 - WSO2's Digital Transformation Journey with Choreo: A Platforml...WSO2CON 2024 - WSO2's Digital Transformation Journey with Choreo: A Platforml...
WSO2CON 2024 - WSO2's Digital Transformation Journey with Choreo: A Platforml...
 
%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...
 
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
 
WSO2Con204 - Hard Rock Presentation - Keynote
WSO2Con204 - Hard Rock Presentation - KeynoteWSO2Con204 - Hard Rock Presentation - Keynote
WSO2Con204 - Hard Rock Presentation - Keynote
 
WSO2CON 2024 Slides - Open Source to SaaS
WSO2CON 2024 Slides - Open Source to SaaSWSO2CON 2024 Slides - Open Source to SaaS
WSO2CON 2024 Slides - Open Source to SaaS
 
BUS PASS MANGEMENT SYSTEM USING PHP.pptx
BUS PASS MANGEMENT SYSTEM USING PHP.pptxBUS PASS MANGEMENT SYSTEM USING PHP.pptx
BUS PASS MANGEMENT SYSTEM USING PHP.pptx
 
Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024
Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024
Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024
 
%in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park %in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park
 
Announcing Codolex 2.0 from GDK Software
Announcing Codolex 2.0 from GDK SoftwareAnnouncing Codolex 2.0 from GDK Software
Announcing Codolex 2.0 from GDK Software
 
Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...
Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...
Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...
 
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
 
%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...
 
Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...
Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...
Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...
 
WSO2CON 2024 - API Management Usage at La Poste and Its Impact on Business an...
WSO2CON 2024 - API Management Usage at La Poste and Its Impact on Business an...WSO2CON 2024 - API Management Usage at La Poste and Its Impact on Business an...
WSO2CON 2024 - API Management Usage at La Poste and Its Impact on Business an...
 
Direct Style Effect Systems - The Print[A] Example - A Comprehension Aid
Direct Style Effect Systems -The Print[A] Example- A Comprehension AidDirect Style Effect Systems -The Print[A] Example- A Comprehension Aid
Direct Style Effect Systems - The Print[A] Example - A Comprehension Aid
 
MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...
MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...
MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...
 
What Goes Wrong with Language Definitions and How to Improve the Situation
What Goes Wrong with Language Definitions and How to Improve the SituationWhat Goes Wrong with Language Definitions and How to Improve the Situation
What Goes Wrong with Language Definitions and How to Improve the Situation
 
AI & Machine Learning Presentation Template
AI & Machine Learning Presentation TemplateAI & Machine Learning Presentation Template
AI & Machine Learning Presentation Template
 

PipelineAI + AWS SageMaker + Distributed TensorFlow + AI Model Training and Serving - December 2017 - NIPS Conference - LA Big Data and Python Meetups

  • 1. PIPELINE.AI: HIGH PERFORMANCE MODEL TRAINING & SERVING WITH GPUS… …AND AWS SAGEMAKER, GOOGLE CLOUD ML, AZURE ML & KUBERNETES! CHRIS FREGLY FOUNDER @ PIPELINE.AI
  • 3. INTRODUCTIONS: ME § Chris Fregly, Founder & Engineer @PipelineAI § Formerly Netflix, Databricks, IBM Spark Tech § Advanced Spark and TensorFlow Meetup § Please Join Our 60,000+ Global Members!! Contact Me chris@pipeline.ai @cfregly Global Locations * San Francisco * Chicago * Austin * Washington DC * Dusseldorf * London
  • 4. INTRODUCTIONS: YOU § Software Engineer, Data Scientist, Data Engineer, Data Analyst § Interested in Optimizing and Deploying TF Models to Production § Nice to Have a Working Knowledge of TensorFlow (Not Required)
  • 5. PIPELINE.AI IS 100% OPEN SOURCE § https://github.com/PipelineAI/pipeline/ § Please Star 🌟 this GitHub Repo! § Some VC’s Value GitHub Stars @ $15,000 Each (?!)
  • 6. PIPELINE.AI OVERVIEW 450,000 Docker Downloads 60,000 Users Registered for GA 60,000 Meetup Members 40,000 LinkedIn Followers 2,200 GitHub Stars 12 Enterprise Beta Users
  • 7. WHY HEAVY FOCUS ON MODEL SERVING? Model Training Batch & Boring Offline in Research Lab Pipeline Ends at Training No Insight into Live Production Small Number of Data Scientists Optimizations Very Well-Known Real-Time & Exciting!! Online in Live Production Pipeline Extends into Production Continuous Insight into Live Production Huuuuuuge Number of Application Users **Many Optimizations Not Yet Utilized <<< Model Serving 100’s Training Jobs per Day 1,000,000’s Predictions per Sec
  • 8. AGENDA § Deploy and Tune Models + Runtimes Safely in Prod § Compare Models Both Offline and Online § Auto-Shift Traffic to Winning Model or Cloud § Live, Continuous Model Training in Production
  • 9. PACKAGE MODEL + RUNTIME AS ONE § Build Model with Runtime into Immutable Docker Image § Emphasize Immutable Deployment and Infrastructure § Same Runtime Dependencies in All Environments § Local, Development, Staging, Production § No Library or Dependency Surprises § Deploy and Tune Model + Runtime Together pipeline predict-server-build --model-type=tensorflow --model-name=mnist --model-tag=A --model-path=./models/tensorflow/mnist/ Build Local Model Server A
  • 10. LOAD TEST LOCAL MODEL + RUNTIME § Perform Mini-Load Test on Local Model Server § Immediate, Local Prediction Performance Metrics § Compare to Previous Model + Runtime Variations pipeline predict-server-start --model-type=tensorflow --model-name=mnist --model-tag=A pipeline predict --model-endpoint-url=http://localhost:8080 --test-request-path=test_request.json --test-request-concurrency=1000 Load Test Local Model Server A Start Local Model Server A
  • 11. PUSH IMAGE TO DOCKER REGISTRY § Supports All Public + Private Docker Registries § DockerHub, Artifactory, Quay, AWS, Google, … § Or Self-Hosted, Private Docker Registry pipeline predict-server-push --image-registry-url=<your-registry> --image-registry-repo=<your-repo> --model-type=tensorflow --model-name=mnist --model-tag=A Push Image To Docker Registry
  • 12. CLOUD-BASED OPTIONS § AWS SageMaker § Released Nov 2017 @ Re-invent § Custom Docker Images for Training & Serving ie. PipelineAI Images § Distributed TensorFlow Training through Estimator API § Traffic Splitting for A/B Model Testing § Google Cloud ML Engine § Mostly Command-Line Based § Driving TensorFlow Open Source API (ie. Experiment API) § Azure ML
  • 13. TUNE MODEL + RUNTIME AS SINGLE UNIT § Model Training Optimizations § Model Hyper-Parameters (ie. Learning Rate) § Reduced Precision (ie. FP16 Half Precision) § Post-Training Model Optimizations § Quantize Model Weights + Activations From 32-bit to 8-bit § Fuse Neural Network Layers Together § Model Runtime Optimizations § Runtime Configs (ie. Request Batch Size) § Different Runtimes (ie. TensorFlow Lite, Nvidia TensorRT)
  • 14. POST-TRAINING OPTIMIZATIONS § Prepare Model for Serving § Simplify Network § Reduce Model Size § Quantize for Fast Matrix Math § Some Tools § Graph Transform Tool (GTT) § tfcompile After Training After Optimizing! pipeline optimize --optimization-list=[quantize_weights, tfcompile] --model-type=tensorflow --model-name=mnist --model-tag=A --model-path=./tensorflow/mnist/model --output-path=./tensorflow/mnist/optimized_model Linear Regression
  • 15. RUNTIME OPTION: TENSORFLOW LITE § Post-Training Model Optimizations § Currently Supports iOS and Android § On-Device Prediction Runtime § Low-Latency, Fast Startup § Selective Operator Loading § 70KB Min - 300KB Max Runtime Footprint § Supports Accelerators (GPU, TPU) § Falls Back to CPU without Accelerator § Java and C++ APIs
  • 16. RUNTIME OPTION: NVIDIA TENSOR-RT § Post-Training Model Optimizations § Specific to Nvidia GPU § GPU-Optimized Prediction Runtime § Alternative to TensorFlow Serving § PipelineAI Supports TensorRT!
  • 17. DEPLOY MODELS SAFELY TO PROD § Deploy from CLI or Jupyter Notebook § Tear-Down or Rollback Models Quickly § Shadow Canary Deploy: ie.20% Live Traffic § Split Canary Deploy: ie. 97-2-1% Live Traffic pipeline predict-cluster-start --model-runtime=tflite --model-type=tensorflow --model-name=mnist --model-tag=B --traffic-split=2 Start Production Model Cluster B pipeline predict-cluster-start --model-runtime=tensorrt --model-type=tensorflow --model-name=mnist --model-tag=C --traffic-split=1 Start Production Model Cluster C pipeline predict-cluster-start --model-runtime=tfserving_gpu --model-type=tensorflow --model-name=mnist --model-tag=A --traffic-split=97 Start Production Model Cluster A
  • 18. AGENDA § Deploy and Tune Models + Runtimes Safely in Prod § Compare Models Both Offline and Online § Auto-Shift Traffic to Winning Model or Cloud § Live, Continuous Model Training in Production
  • 19. COMPARE MODELS OFFLINE & ONLINE § Offline, Batch Metrics § Validation + Training Accuracy § CPU + GPU Utilization § Live Prediction Values § Compare Relative Precision § Newly-Seen, Streaming Data § Online, Real-Time Metrics § Response Time, Throughput § Cost ($) Per Prediction
  • 20. VIEW REAL-TIME PREDICTION STREAM § Visually Compare Real-Time Predictions Prediction Inputs Prediction Results & Confidences Model B Model CModel A
  • 21. PREDICTION PROFILING AND TUNING § Pinpoint Performance Bottlenecks § Fine-Grained Prediction Metrics § 3 Steps in Real-Time Prediction 1. transform_request() 2. predict() 3. transform_response()
  • 22. AGENDA § Deploy and Tune Models + Runtimes Safely in Prod § Compare Models Both Offline and Online § Auto-Shift Traffic to Winning Model or Cloud § Live, Continuous Model Training in Production
  • 23. LIVE, ADAPTIVE TRAFFIC ROUTING § A/B Tests § Inflexible and Boring § Multi-Armed Bandits § Adaptive and Exciting! pipeline traffic-router-split --model-type=tensorflow --model-name=mnist --model-tag-list=[A,B,C] --model-weight-list=[1,2,97] Adjust Traffic Routing Dynamically
  • 24. SHIFT TRAFFIC TO MAX(REVENUE) § Shift Traffic to Winning Model using AI Bandit Algos
  • 25. SHIFT TRAFFIC TO MIN(CLOUD CO$T) § Based on Cost ($) Per Prediction § Cost Changes Throughout Day § Lose AWS Spot Instances § Google Cloud Becomes Cheaper § Shift Across Clouds & On-Prem
  • 26. AGENDA § Deploy and Tune Models + Runtimes Safely in Prod § Compare Models Both Offline and Online § Auto-Shift Traffic to Winning Model or Cloud § Live, Continuous Model Training in Production
  • 27. LIVE, CONTINUOUS MODEL TRAINING § The Holy Grail of Machine Learning § Q1 2018: PipelineAI Supports Continuous Model Training! § Kafka, Kinesis § Spark Streaming
  • 28. PSEUDO-CONTINUOUS TRAINING § Identify and Fix Borderline Predictions (~50-50% Confidence) § Fix Along Class Boundaries § Retrain Newly-Labeled Data § Game-ify Labeling Process § Enable Crowd Sourcing
  • 29. DEMO: TRAIN, DEPLOY, TEST MODEL § https://github.com/PipelineAI/pipeline/ § Please Star 🌟 this GitHub Repo! pipeline predict-server-build --model-type=tensorflow --model-name=mnist --model-tag=A --model-path=./models/tensorflow/mnist/
  • 30. THANK YOU!! § https://github.com/PipelineAI/pipeline/ § Please Star 🌟 this GitHub Repo! § Reminder: VC’s Value GitHub Stars @ $15,000 Each (!!) Contact Me chris@pipeline.ai @cfregly