PyTorch + TensorFlow + RedisAI
Chris Fregly
Founder @
Acknowledgements
• [tensor]werk
  • Luca Antiga, Sherin Thomas, Rick Izzo, Pietro Rota
• RedisLabs
  • Guy Korland, Itamar Haber, Pieter Cailliau, Meir Shpilraien, Mark Nunberg, Ariel Madar
• Orobix
  • Manuela Bazzana, Daniele Ciriello, Lisa Lozza, Simone Manini, Alessandro Re
• Everyone!
PRESENTED BY
• Founder @ PipelineAI
Real-time Machine Learning and AI in Production
• Former Databricks, Netflix
• Apache Spark Contributor
• O’Reilly Author
High Performance TensorFlow in Production
Who Am I? (@cfregly)
Who are you?
1 New to the Bay Area?
2 Familiar with Redis Modules?
3 Familiar with TensorFlow? PyTorch? Others?
4 Familiar with Kubernetes? KubeFlow?
5 Used to work at MapR?!
6 Wants to hire Antje Barth?
Agenda:
1 Challenges of Deep Learning in Production
Deep learning and challenges for production
2 Introducing RedisAI
API, architecture, operation
3 Roadmap
What’s next
4 Demo
Live notebook
End-to-End Deep Learning in Production
Model Servers
• Python code (Flask, Tornado, …)
• Execution service from cloud provider
  • AWS SageMaker
  • Google Cloud ML Server
• Runtime
  • PipelineAI / Kubernetes
  • KubeFlow
  • TensorFlow Serving
  • NVIDIA TensorRT
  • MXNet Model Server
• Bespoke solutions (C++, Java, …)
Model Serving Infrastructure
• Must fit the technology stack
  • Not just about languages, but about semantics, scalability, guarantees
• Run anywhere, any size
• Composable building blocks
• Limit the amount of moving parts
• Limit the amount of moving data
• Must make best use of resources
Agenda:
1 Challenges of Deep Learning in Production
Deep learning and challenges for production
2 Introducing RedisAI
API, architecture, ops, client libs
3 Roadmap
What’s next
4 Demo
Live notebook
RedisAI
• Model Server inside of Redis
• Implemented as a Redis Module
• New data type: Tensor
[Figure: Tensor In, a TorchScript function running on CPU, GPU0, GPU1, …, Tensor Out]
TorchScript:
def addsq(a, b):
    return (a + b)**2
Advantages of RedisAI Today
• Keeps the data local
• Runs everywhere Redis runs
• Runs multi-backend
• Stays language-independent
• Optimizes use of resources
• Keeps models hot
• HA with sentinel, clustering
https://github.com/RedisAI/redisai-examples
API: Tensor
• AI.TENSORSET
• AI.TENSORGET
AI.TENSORSET foo FLOAT 2 2 VALUES 1 2 3 4
AI.TENSORSET foo FLOAT 2 2 BLOB < buffer.raw
AI.TENSORGET foo BLOB
AI.TENSORGET foo VALUES
AI.TENSORGET foo META
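As a rough illustration of the two AI.TENSORSET forms above, the sketch below builds the argument list that a generic Redis client would send. The helper names `tensorset_values` and `tensorset_blob` are hypothetical, and the blob layout (little-endian 32-bit floats in row-major order) is an assumption that the slides do not state.

```python
import struct
from math import prod

def tensorset_values(key, dtype, shape, values):
    """Build args for: AI.TENSORSET key dtype shape... VALUES v1 v2 ..."""
    if len(values) != prod(shape):
        raise ValueError("number of values must match the tensor shape")
    return ["AI.TENSORSET", key, dtype, *shape, "VALUES", *values]

def tensorset_blob(key, shape, values):
    """Build args for the BLOB form, packing FLOAT data as raw bytes.

    Assumption: FLOAT means 32-bit floats, little-endian, row-major.
    """
    if len(values) != prod(shape):
        raise ValueError("number of values must match the tensor shape")
    blob = struct.pack(f"<{len(values)}f", *values)
    return ["AI.TENSORSET", key, "FLOAT", *shape, "BLOB", blob]

args = tensorset_values("foo", "FLOAT", (2, 2), [1, 2, 3, 4])
blob_args = tensorset_blob("foo", (2, 2), [1, 2, 3, 4])
print(args)
print(len(blob_args[-1]))  # 4 values, 4 bytes each
```

The VALUES form is convenient for small tensors typed by hand; the BLOB form avoids parsing numbers one by one when the data already sits in a buffer.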
API: Tensor
• Based on dlpack https://github.com/dmlc/dlpack
• Framework-independent
typedef struct {
    void* data;
    DLContext ctx;
    int ndim;
    DLDataType dtype;
    int64_t* shape;
    int64_t* strides;
    uint64_t byte_offset;
} DLTensor;
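To make the role of `shape`, `strides` and `byte_offset` concrete, here is a small Python sketch that locates one element inside the buffer behind `data`. `TensorMeta` and both helper functions are hypothetical names used only for illustration; the sketch assumes that strides are counted in elements and that missing strides mean a compact row-major layout, as the DLPack header describes.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class TensorMeta:
    """Python mirror of the shape/strides/byte_offset fields of DLTensor."""
    shape: Tuple[int, ...]
    strides: Optional[Tuple[int, ...]] = None  # in elements; None = compact row-major
    byte_offset: int = 0

def row_major_strides(shape):
    """Strides (in elements) of a compact row-major tensor."""
    strides = []
    step = 1
    for dim in reversed(shape):
        strides.append(step)
        step *= dim
    return tuple(reversed(strides))

def element_byte_offset(meta, index, itemsize):
    """Byte position of tensor[index] relative to the data pointer."""
    if len(index) != len(meta.shape):
        raise ValueError("index rank must match tensor rank")
    strides = meta.strides
    if strides is None:
        strides = row_major_strides(meta.shape)
    elements = sum(i * s for i, s in zip(index, strides))
    return meta.byte_offset + elements * itemsize

meta = TensorMeta(shape=(2, 2))
print(row_major_strides((1, 3, 224, 224)))
print(element_byte_offset(meta, (1, 0), itemsize=4))
```

Because the metadata alone describes the layout, any framework can read the same buffer without copying it, which is the point of a framework-independent tensor.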
API: Model
• AI.MODELSET
AI.MODELSET resnet50 TORCH GPU < foo.pt
AI.MODELSET resnet50 TF CPU INPUTS in1 OUTPUTS linear4 < foo.pb
• AI.MODELRUN
AI.MODELRUN resnet50 INPUTS foo OUTPUTS bar
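The sketch below shows how the model commands might be issued from Python through any client that can send raw commands. `set_and_run_model` is a hypothetical helper, and `execute` stands for whatever send function the client offers (for example the `execute_command` method of a redis-py connection, which is an assumption on our side). A recording stub replaces the real connection so the example runs without a server, and only the TORCH form of AI.MODELSET is covered.

```python
def set_and_run_model(execute, model_key, backend, device, model_blob,
                      input_key, output_key):
    """Store a model, then run it on one input tensor key.

    `execute` is any callable that sends a single Redis command and
    returns its reply.
    """
    execute("AI.MODELSET", model_key, backend, device, model_blob)
    return execute("AI.MODELRUN", model_key,
                   "INPUTS", input_key, "OUTPUTS", output_key)

# Stand-in for a real connection: records commands instead of sending them.
sent = []

def fake_execute(*args):
    sent.append(args)
    return "OK"

reply = set_and_run_model(fake_execute, "resnet50", "TORCH", "GPU",
                          b"<bytes of foo.pt>", "foo", "bar")
print(sent[1])
```

Passing the send function in as a parameter keeps the helper independent of any particular client library.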
Exporting models
• TensorFlow (+ Keras): freeze graph
import tensorflow as tf
# TF 1.x graph-mode API; assumes the default graph already holds the model
# and that its output node is named 'output'.
var_converter = tf.compat.v1.graph_util.convert_variables_to_constants
with tf.Session() as sess:
    sess.run([tf.global_variables_initializer()])
    # Replace variables with constants so the graph file is self-contained.
    frozen_graph = var_converter(sess, sess.graph_def, ['output'])
    tf.train.write_graph(frozen_graph, '.', 'resnet50.pb', as_text=False)
https://github.com/RedisAI/redisai-examples/blob/master/models/imagenet/tensorflow/model_saver.py
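Exporting models
• PyTorch: JIT model
import torch
# `model` is assumed to be an already-loaded torch.nn.Module (e.g. a ResNet-50).
model.eval()  # trace in inference mode
batch = torch.randn((1, 3, 224, 224))  # example input used to record the operations
traced_model = torch.jit.trace(model, batch)
torch.jit.save(traced_model, 'resnet50.pt')
https://github.com/RedisAI/redisai-examples/blob/master/models/imagenet/pytorch/model_saver.py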
API: Script
• AI.SCRIPTSET
• AI.SCRIPTRUN
AI.SCRIPTSET myadd2 GPU < addtwo.txt
AI.SCRIPTRUN myadd2 addtwo INPUTS foo OUTPUTS bar
addtwo.txt:
def addtwo(a, b):
    return a + b
Scripts?
• SCRIPT is a TorchScript interpreter
• Python-like syntax for tensor ops
• Works with CPU and GPU
• Vast library of tensor operations
• Lets you prescribe computations directly (without exporting from a Python env, etc.)
• Pre-processing, post-processing (but not only)
Ref: https://pytorch.org/docs/stable/jit.html
[Figure: data flow from in_key to out_key through SCRIPT, PT MODEL, TF MODEL and SCRIPT stages]
Architecture
• Tensors: framework-agnostic
• Queue + processing thread
  • Backpressure
  • Redis stays responsive
• Models are kept hot in memory
• Client blocks
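The queue-plus-processing-thread idea can be sketched in plain Python. This is only an analogy under our own assumptions, not RedisAI's implementation: `ModelWorker` is a hypothetical class, the bounded queue stands in for backpressure, and the `Future` stands in for the blocked client while everything else stays free to do other work.

```python
import queue
import threading
from concurrent.futures import Future

class ModelWorker:
    """One bounded queue drained by one background thread."""

    def __init__(self, run_fn, max_pending=8):
        self._run_fn = run_fn
        # A bounded queue gives backpressure: put() blocks when it is full.
        self._queue = queue.Queue(maxsize=max_pending)
        self._thread = threading.Thread(target=self._loop, daemon=True)
        self._thread.start()

    def _loop(self):
        while True:
            item = self._queue.get()
            if item is None:          # sentinel: stop the worker
                break
            payload, future = item
            try:
                future.set_result(self._run_fn(payload))
            except Exception as exc:  # report failures to the waiting client
                future.set_exception(exc)

    def submit(self, payload):
        """Enqueue a request; only the calling client waits on the Future."""
        future = Future()
        self._queue.put((payload, future))
        return future

    def close(self):
        self._queue.put(None)
        self._thread.join()

worker = ModelWorker(lambda x: x * 2)
futures = [worker.submit(n) for n in range(3)]
results = [f.result(timeout=5) for f in futures]
worker.close()
print(results)
```

Submitting is cheap and returns at once, so the thread that accepts requests is never tied up by a long model run.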
Persistence
• RDB supported
• Append-Only File (AOF) coming soon
• Tensors are serialized (meta + blob)
• Models are serialized back into protobuf
• Scripts are serialized as strings
Replication
• Master-replica supported for all data types
• Right now, run commands are replicated too (post-conference: replication of the results of computations where appropriate)
• Cluster supported, caveat: sharding models and scripts
• For the moment, use hash tags so that a model and its inputs land on the same shard:
{foo}resnet50 {foo}input1234
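Why hash tags help can be checked with a few lines of Python that follow the Redis Cluster key-slot rule: CRC16 (XMODEM variant) of the key modulo 16384, where only the text inside the first non-empty `{...}` is hashed. `hash_slot` is a hypothetical helper name; the rule itself comes from the Redis Cluster specification rather than from these slides.

```python
import binascii

def hash_slot(key: str) -> int:
    """Redis Cluster slot for a key, honoring {hash tags}."""
    data = key.encode()
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        # Only a non-empty tag between the first '{' and the next '}' counts.
        if end != -1 and end > start + 1:
            data = data[start + 1:end]
    # CRC16/XMODEM (polynomial 0x1021, initial value 0), modulo 16384 slots.
    return binascii.crc_hqx(data, 0) % 16384

model_slot = hash_slot("{foo}resnet50")
input_slot = hash_slot("{foo}input1234")
print(model_slot == input_slot)
```

Both keys hash only the tag `foo`, so the model and its input tensor map to the same slot and a run command can find both on one node.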
RedisAI Client Libs
• NOTE: any Redis client works right now
• JRedisAI https://github.com/RedisAI/JRedisAI
• redisai-py https://github.com/RedisAI/redisai-py
• Coming up: NodeJS, Go, … (community?)
RedisAI from NodeJS
Roadmap: DAGRUN
AI.DAGRUN
    SCRIPTRUN preproc normalize img ~in~
    MODELRUN resnet18 INPUTS ~in~ OUTPUTS ~out~
    SCRIPTRUN postproc probstolabel INPUTS ~out~ OUTPUTS label
• DAG = Directed Acyclic Graph
• Atomic operation
• Volatile keys (~x~): command-local, don’t touch keyspace
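A toy Python model of the proposed semantics may help: intermediate results written to `~x~` keys live in a scratch dictionary that disappears when the call returns, and only ordinary keys reach the keyspace. `run_dag`, `is_volatile` and the three stand-in functions are hypothetical, and treating "atomic" as "commit all keyspace writes at the end" is our simplifying assumption, not a statement about how the roadmap feature works.

```python
def is_volatile(key):
    """Volatile keys are written as ~name~ in the slides."""
    return len(key) > 2 and key.startswith("~") and key.endswith("~")

def run_dag(keyspace, steps):
    """Run (function, input_keys, output_key) steps in order."""
    scratch = {}   # volatile keys: command-local, never touch the keyspace
    staged = {}    # keyspace writes, committed only if every step succeeds

    def read(key):
        if is_volatile(key):
            return scratch[key]
        return staged[key] if key in staged else keyspace[key]

    for fn, inputs, output in steps:
        result = fn(*(read(k) for k in inputs))
        (scratch if is_volatile(output) else staged)[output] = result

    keyspace.update(staged)  # all-or-nothing from the caller's point of view
    return keyspace

# Toy stand-ins for preproc.normalize, resnet18 and postproc.probstolabel.
def normalize(img):
    return [p / 255 for p in img]

def model(x):
    return [sum(x), 1 - sum(x)]

def probstolabel(probs):
    return "cat" if probs[0] >= probs[1] else "dog"

keyspace = {"img": [51, 102]}
run_dag(keyspace, [
    (normalize, ["img"], "~in~"),
    (model, ["~in~"], "~out~"),
    (probstolabel, ["~out~"], "label"),
])
print(sorted(keyspace))
```

After the run the keyspace holds only `img` and `label`; the `~in~` and `~out~` intermediates were never stored.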
Roadmap: DAGRUNRO
AI.DAGRUNRO
    TENSORSET ~img~ FLOAT 1 3 224 224 BLOB ...
    SCRIPTRUN preproc normalize ~img~ ~in~
    MODELRUN resnet18 INPUTS ~in~ OUTPUTS ~out~
    SCRIPTRUN postproc probstolabel INPUTS ~out~ OUTPUTS ~label~
    TENSORGET ~label~ VALUES
• AI.DAGRUNRO:
  • if no key is written, replicas can execute in parallel
  • errors if commands try to write to non-volatile keys
[Figure: one DAGRUN box alongside three DAGRUNRO boxes]
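The read-only rule can be expressed as a simple check over the steps of a DAG: every key a step writes must be volatile. The sketch below is a hypothetical validator (`check_read_only` and `ReadOnlyDagError` are our names), written only to illustrate the rule stated on the slide.

```python
class ReadOnlyDagError(Exception):
    """Raised when a read-only DAG would write to the keyspace."""

def is_volatile(key):
    return len(key) > 2 and key.startswith("~") and key.endswith("~")

def check_read_only(steps):
    """`steps` is a list of (command, input_keys, output_keys) tuples."""
    for command, _inputs, outputs in steps:
        for key in outputs:
            if not is_volatile(key):
                raise ReadOnlyDagError(
                    f"{command} writes non-volatile key {key!r}")

read_only_dag = [
    ("TENSORSET", [], ["~img~"]),
    ("SCRIPTRUN", ["~img~"], ["~in~"]),
    ("MODELRUN", ["~in~"], ["~out~"]),
    ("SCRIPTRUN", ["~out~"], ["~label~"]),
    ("TENSORGET", ["~label~"], []),
]
check_read_only(read_only_dag)  # passes: every written key is volatile

try:
    check_read_only([("SCRIPTRUN", ["~out~"], ["label"])])
    rejected = False
except ReadOnlyDagError:
    rejected = True
print(rejected)
```

Since a DAG that passes this check leaves the keyspace untouched, there is nothing to replicate, which is what lets replicas serve such requests independently.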
Roadmap: Multi-Device Execution
AI.MODELSET resnet50a TF GPU0 ...
AI.MODELSET resnet50b TORCH GPU1 ...
AI.TENSORSET img ...
AI.DAGRUN
    SCRIPTRUN preproc normalize img ~in~
    MODELRUN resnet50a INPUTS ~in~ OUTPUTS ~out1~
    MODELRUN resnet50b INPUTS ~in~ OUTPUTS ~out2~
    SCRIPTRUN postproc ensamble INPUTS ~out1~ ~out2~ OUTPUTS ~out~
    SCRIPTRUN postproc probstolabel INPUTS ~out~ OUTPUTS label
• Parallel multi-device execution
• One queue per device (cpu, gpu)
[Figure: normalize, then GPU0 and GPU1 in parallel, then ensamble, then probstolabels]
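One queue per device can be imitated in Python with one single-worker executor per device. This is a sketch under our own assumptions: the two model functions are toy stand-ins, `ensemble` plays the role of the slide's `ensamble` script function, and threads merely simulate devices.

```python
from concurrent.futures import ThreadPoolExecutor

# One single-worker executor per device acts as that device's queue.
device_queues = {
    "GPU0": ThreadPoolExecutor(max_workers=1),
    "GPU1": ThreadPoolExecutor(max_workers=1),
}

def resnet50a(x):   # toy stand-in for the TF model placed on GPU0
    return [v * 0.5 for v in x]

def resnet50b(x):   # toy stand-in for the TORCH model placed on GPU1
    return [v * 1.5 for v in x]

def ensemble(out1, out2):
    """Average the two model outputs element-wise."""
    return [(a + b) / 2 for a, b in zip(out1, out2)]

normalized = [2.0, 4.0]
# Both submissions return immediately, so the two models can run in parallel.
future1 = device_queues["GPU0"].submit(resnet50a, normalized)
future2 = device_queues["GPU1"].submit(resnet50b, normalized)
combined = ensemble(future1.result(), future2.result())

for executor in device_queues.values():
    executor.shutdown()
print(combined)
```

Work queued for the same device still runs in order, while work for different devices overlaps, which is the behavior the two bullets describe.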
Roadmap: DAGRUNASYNC
AI.DAGRUNASYNC
    SCRIPTRUN preproc normalize img ~in~
    MODELRUN resnet50 INPUTS ~in~ OUTPUTS ~out~
    SCRIPTRUN postproc probstolabel INPUTS ~out~ OUTPUTS label
=> 12634
AI.DAGRUNINFO 12634
=> RUNNING [... more details …]
AI.DAGRUNINFO 12634
=> DONE
• AI.DAGRUNASYNC:
  • Non-blocking, returns an ID
  • ID can be used to later retrieve status + keyspace notification
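The submit-then-poll pattern can be sketched in Python as follows. `AsyncRunner` and its methods are hypothetical names and use a thread pool in place of the server; the sketch only mirrors the idea of returning an ID at once and asking for status later.

```python
import itertools
from concurrent.futures import ThreadPoolExecutor

class AsyncRunner:
    """Submit work without blocking and poll its status by ID."""

    def __init__(self):
        self._pool = ThreadPoolExecutor(max_workers=1)
        self._ids = itertools.count(1)
        self._runs = {}

    def run_async(self, fn, *args):
        run_id = next(self._ids)
        self._runs[run_id] = self._pool.submit(fn, *args)
        return run_id                       # returned immediately

    def info(self, run_id):
        future = self._runs[run_id]
        return "DONE" if future.done() else "RUNNING"

    def result(self, run_id, timeout=None):
        return self._runs[run_id].result(timeout=timeout)

    def close(self):
        self._pool.shutdown()

runner = AsyncRunner()
run_id = runner.run_async(sum, [1, 2, 3])
value = runner.result(run_id, timeout=5)    # wait here only to keep the demo deterministic
status = runner.info(run_id)
runner.close()
print(run_id, status, value)
```

A real client would call `info` periodically or react to a notification instead of waiting on `result`.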
Roadmap: Streams
• I stream, you stream, we all stream for Redis Streams!
AI.DAGRUN
    SCRIPTRUN preproc normalize instream ~in~
    MODELRUN resnet18 INPUTS ~in~ OUTPUTS ~out~
    SCRIPTRUN postproc probstolabel INPUTS ~out~ OUTPUTS outstream
AI.DAGRUNASYNC
    SCRIPTRUN preproc normalize instream ~in~
    MODELRUN resnet18 INPUTS ~in~ OUTPUTS ~out~
    SCRIPTRUN postproc probstolabel INPUTS ~out~ OUTPUTS outstream
Roadmap: ONNX Runtime
• Runtime from Microsoft
• ONNX
  • exchange format for NN
  • export from many frameworks (MXNet, CNTK, …)
• ONNX-ML
  • ONNX for machine learning models (RandomForest, SVM, K-means, etc.)
  • export from scikit-learn
Roadmap: Auto-batching
• Auto-batching
  • Transparent batching of requests
  • Queue gets rearranged according to other analogous requests in the queue, time to response
  • Only if different clients or async
• Time-to-response
  • ~Predictable in DL graphs
  • Can reject request if TTR > estimated time of queue + run
[Figure: analyze queue, concat along 0-th dim, run batch; requests A and B labeled "Run if TTR < 800ms", one marked OK and one marked Reject]
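The two ideas on this slide can be sketched with plain Python lists. `run_batched` concatenates several requests along the first dimension, runs the model once and splits the outputs back per request. `admit` follows the figure's "Run if TTR < 800ms" reading, taking TTR to be the estimated queue time plus run time compared against a limit; that interpretation, like both function names, is an assumption.

```python
def run_batched(model, requests):
    """Concatenate requests along dim 0, run once, split results back.

    Each request is a list of rows; `model` maps a list of rows to a
    list with one output per row.
    """
    batch = [row for rows in requests for row in rows]   # concat along dim 0
    outputs = model(batch)                                # single model run
    results, start = [], 0
    for rows in requests:
        results.append(outputs[start:start + len(rows)])
        start += len(rows)
    return results

def admit(limit_ms, estimated_queue_ms, estimated_run_ms):
    """Run a request only if its estimated time-to-response is under the limit."""
    return estimated_queue_ms + estimated_run_ms < limit_ms

def double(batch):   # toy stand-in for a model
    return [[2 * v for v in row] for row in batch]

results = run_batched(double, [[[1, 2]], [[3, 4], [5, 6]]])
print(results)
print(admit(800, 300, 200), admit(800, 700, 200))
```

Each caller gets back exactly the rows it sent, so batching stays invisible to clients apart from the latency effect.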
Roadmap: Lazy Loading
• Lazy loading of backends:
  • Only load what you need (especially on GPU)
AI.CONFIG LOADBACKEND TF CPU
AI.CONFIG LOADBACKEND TF GPU
AI.CONFIG LOADBACKEND TORCH CPU
AI.CONFIG LOADBACKEND TORCH GPU
...
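Lazy loading in general can be sketched as a registry that runs a loader the first time a backend is requested and reuses the result afterwards. `BackendRegistry` and `make_loader` are hypothetical, and the loaders here only log a name instead of loading a real runtime.

```python
class BackendRegistry:
    """Load a backend the first time it is requested, then reuse it."""

    def __init__(self, loaders):
        self._loaders = loaders   # (backend, device) -> zero-argument loader
        self._loaded = {}

    def get(self, backend, device):
        key = (backend, device)
        if key not in self._loaded:
            self._loaded[key] = self._loaders[key]()   # pay the cost on first use
        return self._loaded[key]

    def loaded(self):
        return sorted(self._loaded)

load_log = []

def make_loader(name):
    def load():
        load_log.append(name)
        return f"<{name} runtime>"
    return load

registry = BackendRegistry({
    ("TF", "CPU"): make_loader("TF CPU"),
    ("TORCH", "GPU"): make_loader("TORCH GPU"),
})
registry.get("TORCH", "GPU")
registry.get("TORCH", "GPU")      # second call reuses the loaded backend
print(load_log)
```

The TF backend is registered but never loaded, so it costs no memory, which matters most for GPU runtimes.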
Roadmap: Monitoring
• Monitoring with AI.INFO:
  • more granular information on running operations, commands, memory
AI.INFO MODEL key
AI.INFO SCRIPT key
AI.INFO ...
Demo: PipelineAI + Redis + PyTorch + TensorFlow
• End-to-End Deep Learning from Research to Production
• Any Cloud, Any Framework, Any Hardware
• Free Community Edition: https://community.pipeline.ai
GitHub and DockerHub Repos
• https://github.com/RedisAI/RedisAI
• https://hub.docker.com/u/redisai
docker run -p 6379:6379 -it --rm redisai/redisai
Thank you!
https://pipeline.ai
@cfregly (PipelineAI)

PyTorch + TensorFlow + RedisAI + Streams -- Advanced Spark and TensorFlow Meetup -- May 25 2019

  • 1. PyTorch + TensorFlow + RedisAI Chris Fregly Founder @
  • 2. • [tensor]werk • Luca Antiga, Sherin Thomas, Rick Izzo, Pietro Rota • RedisLabs • Guy Korland, Itamar Haber, Pieter Cailliau, Meir Shpilraien, Mark Nunberg, Ariel Madar • Orobix • Everyone! • Manuela Bazzana, Daniele Ciriello, Lisa Lozza, Simone Manini, Alessandro Re PRESENTED BY Acknowledgements
  • 3. • Founder @ PipelineAI Real-time Machine Learning and AI in Production • Former Databricks, Netflix • Apache Spark Contributor • O’Reilly Author High Performance TensorFlow in Production Who Am I? (@cfregly)
  • 4. 1 New to the Bay Area? 2 Familiar with Redis Modules? 3 Familiar with TensorFlow? PyTorch? Others? Who are you? 4 Familiar with Kubernetes? KubeFlow? 5 Used to work at MapR?! Wants to hire Antje Barth?6
  • 5. 1 Challenges of Deep Learning in Production Deep learning and challenges for production 2 Introducing RedisAI API, architecture, operation 3 Roadmap What’s next Agenda: 4 Demo Live notebook
  • 6. End-to-End Deep Learning in Production
  • 7. • Python code (Flask, Tornado, …) • Execution service from cloud provider • AWS SageMaker • Google Cloud ML Server • Runtime • PipelineAI / Kubernetes • KubeFlow • TensorFlow Serving • NVIDIA TensorRT • MXNet Model Server • Bespoke solutions (C++, Java, …) Model Servers
  • 8. Model Serving Infrastructure • Must fit the technology stack • Not just about languages, but about semantics, scalability, guarantees • Run anywhere, any size • Composable building blocks • Limit the amount of moving parts • Limit the amount of moving data • Must make best use of resources
  • 9. Agenda: 1 Challenges of Deep Learning in Production: deep learning and challenges for production 2 Introducing RedisAI: API, architecture, ops, client libs 3 Roadmap: what’s next 4 Demo: live notebook
  • 10. RedisAI • Model Server inside of Redis • Implemented as a Redis Module • New data type: Tensor [Diagram: Tensor In; RedisAI with TorchScript "def addsq(a, b): return (a + b)**2"; devices CPU, GPU0, GPU1, …; Tensor Out]
  • 11. Advantages of RedisAI Today • Keeps the data local • Runs everywhere Redis runs • Runs multi-backend • Stays language-independent • Optimizes use of resources • Keeps models hot • HA with sentinel, clustering https://github.com/RedisAI/redisai-examples
  • 12. API: Tensor • AI.TENSORSET • AI.TENSORGET
      AI.TENSORSET foo FLOAT 2 2 VALUES 1 2 3 4
      AI.TENSORSET foo FLOAT 2 2 BLOB < buffer.raw
      AI.TENSORGET foo BLOB
      AI.TENSORGET foo VALUES
      AI.TENSORGET foo META
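The tensor commands above are plain Redis commands, so their argument lists can be assembled by any client. The following is a minimal sketch of that assembly in Python. The helper names `tensorset_values` and `tensorset_blob` are hypothetical, and the blob layout (little-endian float32, row-major) is an assumption made for illustration, not something the slides specify.

```python
import struct


def tensorset_values(key, dtype, shape, values):
    """Build the argument list for AI.TENSORSET ... VALUES (hypothetical helper)."""
    expected = 1
    for dim in shape:
        expected *= dim
    if len(values) != expected:
        raise ValueError(f"shape {shape} needs {expected} values, got {len(values)}")
    return ["AI.TENSORSET", key, dtype, *shape, "VALUES", *values]


def tensorset_blob(key, shape, values):
    """Build AI.TENSORSET ... BLOB with values packed as float32.

    Assumption: the blob is little-endian float32 in row-major order.
    """
    blob = struct.pack(f"<{len(values)}f", *values)
    return ["AI.TENSORSET", key, "FLOAT", *shape, "BLOB", blob]


cmd = tensorset_values("foo", "FLOAT", (2, 2), [1, 2, 3, 4])
blob_cmd = tensorset_blob("foo", (2, 2), [1, 2, 3, 4])
print(cmd)
```

With a generic client such as redis-py, a list like `cmd` could then be sent with `execute_command(*cmd)` against a server that has the RedisAI module loaded.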
  • 13. API: Tensor • Based on dlpack https://github.com/dmlc/dlpack • Framework-independent
      typedef struct {
        void* data;
        DLContext ctx;
        int ndim;
        DLDataType dtype;
        int64_t* shape;
        int64_t* strides;
        uint64_t byte_offset;
      } DLTensor;
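To make the DLTensor layout concrete, here is a sketch that mirrors the struct with Python's `ctypes` and fills it in for a 2x2 float32 tensor on the CPU. The `DLContext` and `DLDataType` field layouts and the constants `1` (kDLCPU) and `2` (kDLFloat) follow the dlpack header as of the time of the talk. The snippet only illustrates the memory layout and is not how RedisAI itself builds tensors.

```python
import ctypes


class DLContext(ctypes.Structure):
    _fields_ = [("device_type", ctypes.c_int), ("device_id", ctypes.c_int)]


class DLDataType(ctypes.Structure):
    _fields_ = [
        ("code", ctypes.c_uint8),    # type family (int, uint, float)
        ("bits", ctypes.c_uint8),    # bits per element
        ("lanes", ctypes.c_uint16),  # vector lanes, 1 for scalars
    ]


class DLTensor(ctypes.Structure):
    _fields_ = [
        ("data", ctypes.c_void_p),
        ("ctx", DLContext),
        ("ndim", ctypes.c_int),
        ("dtype", DLDataType),
        ("shape", ctypes.POINTER(ctypes.c_int64)),
        ("strides", ctypes.POINTER(ctypes.c_int64)),
        ("byte_offset", ctypes.c_uint64),
    ]


# Backing storage and shape must stay alive as long as the struct is used.
values = (ctypes.c_float * 4)(1, 2, 3, 4)
shape = (ctypes.c_int64 * 2)(2, 2)

tensor = DLTensor(
    data=ctypes.addressof(values),
    ctx=DLContext(device_type=1, device_id=0),   # 1 = kDLCPU
    ndim=2,
    dtype=DLDataType(code=2, bits=32, lanes=1),  # 2 = kDLFloat
    shape=shape,
    strides=None,  # NULL strides mean a compact row-major layout
    byte_offset=0,
)

# Read the last element back through the raw data pointer.
floats = ctypes.cast(ctypes.c_void_p(tensor.data), ctypes.POINTER(ctypes.c_float))
last = floats[3]
```

Because the struct carries only a pointer, a shape, and a dtype, any framework that can produce or consume those fields can exchange tensors without copying.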
  • 14. API: Model • AI.MODELSET
      AI.MODELSET resnet50 TORCH GPU < foo.pt
      AI.MODELSET resnet50 TF CPU INPUTS in1 OUTPUTS linear4 < foo.pt
  • 15. API: Model • AI.MODELRUN
      AI.MODELRUN resnet50 INPUTS foo OUTPUTS bar
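The semantics of AI.MODELRUN, as the slide presents them, are: read the tensors stored at the input keys, run the stored model, and write the results to the output keys. The toy sketch below imitates that flow with plain dictionaries standing in for the Redis keyspace and a Python function standing in for a real model. The functions `modelset`, `tensorset`, and `modelrun` are hypothetical illustrations, not RedisAI client calls.

```python
keyspace = {}  # stand-in for the Redis keyspace (tensors by key)
models = {}    # stand-in for models stored with AI.MODELSET


def modelset(key, fn):
    """Store a callable under a key, like AI.MODELSET stores a model blob."""
    models[key] = fn


def tensorset(key, values):
    keyspace[key] = list(values)


def modelrun(model_key, inputs, outputs):
    """Run a stored model on tensors read by key and write results by key."""
    args = [keyspace[k] for k in inputs]
    results = models[model_key](*args)
    if len(results) != len(outputs):
        raise ValueError("model returned a different number of outputs")
    for key, value in zip(outputs, results):
        keyspace[key] = value


# A trivial "model" that doubles every element of its single input.
modelset("resnet50", lambda x: ([2 * v for v in x],))
tensorset("foo", [1, 2, 3, 4])
modelrun("resnet50", inputs=["foo"], outputs=["bar"])
print(keyspace["bar"])
```

The point of the design is that the input and output tensors never leave the server: the client only names keys.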
  • 16. Exporting models • TensorFlow (+ Keras): freeze graph
      import tensorflow as tf
      var_converter = tf.compat.v1.graph_util.convert_variables_to_constants
      with tf.Session() as sess:
          sess.run([tf.global_variables_initializer()])
          frozen_graph = var_converter(sess, sess.graph_def, ['output'])
          tf.train.write_graph(frozen_graph, '.', 'resnet50.pb', as_text=False)
      https://github.com/RedisAI/redisai-examples/blob/master/models/imagenet/tensorflow/model_saver.py
  • 17. Exporting models • PyTorch: JIT model
      import torch
      batch = torch.randn((1, 3, 224, 224))
      traced_model = torch.jit.trace(model, batch)
      torch.jit.save(traced_model, 'resnet50.pt')
      https://github.com/RedisAI/redisai-examples/blob/master/models/imagenet/pytorch/model_saver.py
  • 18. API: Script • AI.SCRIPTSET • AI.SCRIPTRUN
      AI.SCRIPTSET myadd2 GPU < addtwo.txt
      AI.SCRIPTRUN myadd2 addtwo INPUTS foo OUTPUTS bar
      addtwo.txt:
      def addtwo(a, b):
          return a + b
  • 19. Scripts? • SCRIPT is a TorchScript interpreter • Python-like syntax for tensor ops • Works with CPU and GPU • Vast library of tensor operations • Lets you prescribe computations directly (without exporting from a Python env, etc.) • Pre-processing, post-processing (but not only) Ref: https://pytorch.org/docs/stable/jit.html [Diagram: in_key, SCRIPT, PT MODEL, TF MODEL, SCRIPT, out_key]
  • 20. Architecture • Tensors: framework-agnostic • Queue + processing thread • Backpressure • Redis stays responsive • Models are kept hot in memory • Client blocks
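The "queue + processing thread" idea on the architecture slide can be sketched with the Python standard library: a bounded queue provides backpressure, a single worker thread runs the jobs, and only the submitting caller blocks while waiting for its own result. The class name `RunQueue` and the error raised when the queue is full are assumptions made for this illustration. RedisAI's actual implementation is in C inside the Redis module.

```python
import queue
import threading
from concurrent.futures import Future


class RunQueue:
    """A bounded job queue drained by one background worker thread."""

    def __init__(self, maxsize=8):
        self._jobs = queue.Queue(maxsize=maxsize)
        threading.Thread(target=self._worker, daemon=True).start()

    def _worker(self):
        while True:
            fn, args, future = self._jobs.get()
            try:
                future.set_result(fn(*args))
            except Exception as exc:  # report failures to the waiting caller
                future.set_exception(exc)

    def submit(self, fn, *args):
        """Enqueue a job; refuse immediately if the queue is full (backpressure)."""
        future = Future()
        try:
            self._jobs.put_nowait((fn, args, future))
        except queue.Full:
            raise RuntimeError("run queue is full, retry later") from None
        return future


runs = RunQueue()
future = runs.submit(lambda a, b: (a + b) ** 2, 2, 3)
result = future.result(timeout=5)  # only this caller blocks
print(result)
```

The main thread, which plays the role of the Redis event loop here, never executes the model itself, which is what keeps it responsive to other clients.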
  • 21. Persistence • RDB supported • Append-Only File (AOF) coming soon • Tensors are serialized (meta + blob) • Models are serialized back into protobuf • Scripts are serialized as strings
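"Meta + blob" serialization can be pictured as a small header (data type and shape) followed by the raw bytes. The sketch below shows one possible encoding with `struct`. The byte layout and the function names `dump_tensor` and `load_tensor` are invented for illustration and are not RedisAI's actual RDB format.

```python
import struct


def dump_tensor(dtype, shape, blob):
    """Serialize a tensor as meta (dtype, shape) followed by its raw blob.

    Hypothetical layout: 1-byte dtype length, dtype name, 1-byte ndim,
    int64 dims, uint64 blob length, blob bytes (all little-endian).
    """
    name = dtype.encode("ascii")
    meta = struct.pack("<B", len(name)) + name
    meta += struct.pack("<B", len(shape)) + struct.pack(f"<{len(shape)}q", *shape)
    return meta + struct.pack("<Q", len(blob)) + blob


def load_tensor(buf):
    """Inverse of dump_tensor: return (dtype, shape, blob)."""
    offset = 0
    (name_len,) = struct.unpack_from("<B", buf, offset)
    offset += 1
    dtype = buf[offset:offset + name_len].decode("ascii")
    offset += name_len
    (ndim,) = struct.unpack_from("<B", buf, offset)
    offset += 1
    shape = struct.unpack_from(f"<{ndim}q", buf, offset)
    offset += 8 * ndim
    (size,) = struct.unpack_from("<Q", buf, offset)
    offset += 8
    return dtype, tuple(shape), buf[offset:offset + size]


blob = struct.pack("<4f", 1, 2, 3, 4)
payload = dump_tensor("FLOAT", (2, 2), blob)
restored = load_tensor(payload)
```

Keeping the blob opaque means the persistence layer needs no knowledge of the framework that produced the tensor.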
  • 22. Replication • Master-replica supported for all data types • Right now, run commands are replicated too (post-conf: replicate the results of computations where appropriate) • Cluster supported, caveat: sharding models and scripts • For the moment, use hash tags: {foo}resnet50 {foo}input1234
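Hash tags work because Redis Cluster hashes only the text between the first `{` and the following `}` when that text is non-empty, so keys sharing a tag land in the same slot and therefore on the same shard. The sketch below computes the slot using the CRC16 (XMODEM) variant defined by the Redis Cluster specification, which `binascii.crc_hqx` with an initial value of 0 implements. The function name `hash_slot` is chosen for this example.

```python
import binascii


def hash_slot(key: str) -> int:
    """Return the Redis Cluster hash slot (0..16383) for a key."""
    data = key.encode()
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        if end > start + 1:  # only a non-empty tag is used
            data = data[start + 1:end]
    return binascii.crc_hqx(data, 0) % 16384


model_slot = hash_slot("{foo}resnet50")
input_slot = hash_slot("{foo}input1234")
print(model_slot, input_slot)
```

Since the model key and the tensor key hash to the same slot, a run command that touches both is guaranteed to find them on one node.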
  • 23. RedisAI Client Libs • NOTE: any Redis client works right now • JRedisAI https://github.com/RedisAI/JRedisAI • redisai-py https://github.com/RedisAI/redisai-py • Coming up: NodeJS, Go, … (community?)
  • 25. Agenda: 1 Challenges of Deep Learning in Production: deep learning and challenges for production 2 Introducing RedisAI: API, architecture, operation 3 Roadmap: what’s next 4 Demo: live notebook
  • 26. Roadmap: DAGRUN
      AI.DAGRUN
        SCRIPTRUN preproc normalize img ~in~
        MODELRUN resnet18 INPUTS ~in~ OUTPUTS ~out~
        SCRIPTRUN postproc probstolabel INPUTS ~out~ OUTPUTS label
      • DAG = Directed Acyclic Graph • Atomic operation • Volatile keys (~x~): command-local, don’t touch keyspace
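DAGRUN is a roadmap item, so the following is only a conceptual sketch of the two properties listed on the slide: volatile keys (`~x~`) live in a command-local scratch space, and writes reach the keyspace only once every step has succeeded. The function `dagrun`, the step format, and the stand-in "model" are all assumptions made for illustration.

```python
def is_volatile(key):
    """Volatile keys are written as ~name~ on the slide."""
    return len(key) > 2 and key.startswith("~") and key.endswith("~")


def dagrun(keyspace, steps):
    """Run steps in order; commit non-volatile outputs only at the end."""
    scratch = {}  # volatile keys plus pending writes

    def read(key):
        return scratch[key] if key in scratch else keyspace[key]

    for fn, inputs, outputs in steps:
        results = fn(*(read(k) for k in inputs))
        scratch.update(zip(outputs, results))

    # All steps succeeded: publish everything except the volatile keys.
    keyspace.update({k: v for k, v in scratch.items() if not is_volatile(k)})


def normalize(img):
    return ([p / 255 for p in img],)


def toy_model(x):  # stand-in for resnet18
    mean = sum(x) / len(x)
    return ([1 - mean, mean],)


def probstolabel(probs):
    return (["cat", "dog"][probs.index(max(probs))],)


keyspace = {"img": [255, 255, 255, 0]}
dagrun(keyspace, [
    (normalize, ["img"], ["~in~"]),
    (toy_model, ["~in~"], ["~out~"]),
    (probstolabel, ["~out~"], ["label"]),
])
print(keyspace)
```

If any step raises, the function exits before the final update, so the keyspace is left exactly as it was.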
  • 27. Roadmap: DAGRUNRO
      AI.DAGRUNRO
        TENSORSET ~img~ FLOAT 1 3 224 224 BLOB ...
        SCRIPTRUN preproc normalize ~img~ ~in~
        MODELRUN resnet18 INPUTS ~in~ OUTPUTS ~out~
        SCRIPTRUN postproc probstolabel INPUTS ~out~ OUTPUTS ~label~
        TENSORGET ~label~ VALUES
      • AI.DAGRUNRO: • if no key is written, replicas can execute in parallel • errors if commands try to write to non-volatile keys [Diagram: one DAGRUN node and three DAGRUNRO nodes]
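The read-only rule described for DAGRUNRO amounts to a check that every output key of every step is volatile. A minimal sketch of such a check follows. The step tuples mirror the commands on the slide, and the function name `check_read_only` and the error type are hypothetical.

```python
def is_volatile(key):
    """Volatile keys are written as ~name~ on the slide."""
    return len(key) > 2 and key.startswith("~") and key.endswith("~")


def check_read_only(steps):
    """Raise if any step would write to a non-volatile key."""
    for command, inputs, outputs in steps:
        for key in outputs:
            if not is_volatile(key):
                raise ValueError(f"{command} writes non-volatile key {key!r}")


# (command, input keys, output keys), following the DAGRUNRO example.
read_only_dag = [
    ("TENSORSET", [], ["~img~"]),
    ("SCRIPTRUN", ["~img~"], ["~in~"]),
    ("MODELRUN", ["~in~"], ["~out~"]),
    ("SCRIPTRUN", ["~out~"], ["~label~"]),
    ("TENSORGET", ["~label~"], []),
]
check_read_only(read_only_dag)  # passes: nothing touches the keyspace
```

A DAG that passes this check has no side effects on the keyspace, which is why it could safely run on any replica.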
  • 28. Roadmap: Multi-Device Execution
      AI.MODELSET resnet50a TF GPU0 ...
      AI.MODELSET resnet50b TORCH GPU1 ...
      AI.TENSORSET img ...
      AI.DAGRUN
        SCRIPTRUN preproc normalize img ~in~
        MODELRUN resnet50a INPUTS ~in~ OUTPUTS ~out1~
        MODELRUN resnet50b INPUTS ~in~ OUTPUTS ~out2~
        SCRIPTRUN postproc ensamble INPUTS ~out1~ ~out2~ OUTPUTS ~out~
        SCRIPTRUN postproc probstolabel INPUTS ~out~ OUTPUTS label
      • Parallel multi-device execution • One queue per device (cpu, gpu) [Diagram: normalize; GPU0 and GPU1 in parallel; ensamble; probstolabels]
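"One queue per device" can be approximated in Python with one single-worker executor per device: jobs for the same device run in order, while jobs for different devices run in parallel. The model functions below return fixed probabilities and are stand-ins, and the device names are just dictionary keys. No GPU is involved in this sketch.

```python
from concurrent.futures import ThreadPoolExecutor

# One single-threaded executor per device acts as that device's queue.
devices = {name: ThreadPoolExecutor(max_workers=1) for name in ("GPU0", "GPU1")}


def resnet50a(x):  # stand-in for the TF model placed on GPU0
    return [0.25, 0.75]


def resnet50b(x):  # stand-in for the Torch model placed on GPU1
    return [0.5, 0.5]


def ensamble(p1, p2):  # name as spelled on the slide; averages the outputs
    return [(a + b) / 2 for a, b in zip(p1, p2)]


x = [0.1, 0.2, 0.3]
future_a = devices["GPU0"].submit(resnet50a, x)  # both models start
future_b = devices["GPU1"].submit(resnet50b, x)  # without waiting on each other
out = ensamble(future_a.result(), future_b.result())

for executor in devices.values():
    executor.shutdown()
print(out)
```

The ensemble step is the join point of the graph: it cannot start until both device queues have produced their outputs.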
  • 29. Roadmap: DAGRUNASYNC
      AI.DAGRUNASYNC
        SCRIPTRUN preproc normalize img ~in~
        MODELRUN resnet50 INPUTS ~in~ OUTPUTS ~out~
        SCRIPTRUN postproc probstolabel INPUTS ~out~ OUTPUTS label
      => 12634
      AI.DAGRUNINFO 1264
      => RUNNING [... more details …]
      AI.DAGRUNINFO 1264
      => DONE
      • AI.DAGRUNASYNC: • Non-blocking, returns an ID • ID can be used to later retrieve status + keyspace notification
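The asynchronous pattern described here (submit, get an ID back immediately, poll for status later) can be sketched with a thread pool and a table of futures keyed by ID. The class `AsyncRuns` and its method names are hypothetical. The status strings follow the slide, and keyspace notifications are not modeled.

```python
import itertools
import threading
from concurrent.futures import ThreadPoolExecutor


class AsyncRuns:
    """Submit work without blocking and query it later by ID."""

    def __init__(self):
        self._pool = ThreadPoolExecutor(max_workers=1)
        self._ids = itertools.count(1)
        self._runs = {}

    def run_async(self, fn, *args):
        run_id = next(self._ids)
        self._runs[run_id] = self._pool.submit(fn, *args)
        return run_id  # returned immediately, before the work finishes

    def info(self, run_id):
        return "DONE" if self._runs[run_id].done() else "RUNNING"

    def wait(self, run_id):
        return self._runs[run_id].result()

    def close(self):
        self._pool.shutdown()


gate = threading.Event()  # holds the job open so the RUNNING state is visible


def slow_double(x):
    gate.wait()
    return 2 * x


runs = AsyncRuns()
run_id = runs.run_async(slow_double, 21)
status_before = runs.info(run_id)  # job is still held by the gate
gate.set()
result = runs.wait(run_id)
status_after = runs.info(run_id)
runs.close()
print(run_id, status_before, result, status_after)
```

Polling is shown for simplicity; a notification mechanism would spare the client from asking repeatedly.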
  • 30. Roadmap: Streams • I stream, you stream, we all stream for Redis Streams!
      AI.DAGRUN
        SCRIPTRUN preproc normalize instream ~in~
        MODELRUN resnet18 INPUTS ~in~ OUTPUTS ~out~
        SCRIPTRUN postproc probstolabel INPUTS ~out~ OUTPUTS outstream
      AI.DAGRUNASYNC
        SCRIPTRUN preproc normalize instream ~in~
        MODELRUN resnet18 INPUTS ~in~ OUTPUTS ~out~
        SCRIPTRUN postproc probstolabel INPUTS ~out~ OUTPUTS outstream
  • 31. Roadmap: ONNX Runtime • Runtime from Microsoft • ONNX • exchange format for NN • export from many frameworks (MXNet, CNTK, …) • ONNX-ML • ONNX for machine learning models (RandomForest, SVM, K-means, etc.) • export from scikit-learn
  • 32. Roadmap: Auto-batching • Auto-batching • Transparent batching of requests • Queue gets rearranged according to other analogous requests in the queue, time to response • Only if different clients or async • Time-to-response • ~Predictable in DL graphs • Can reject request if TTR > estimated time of queue + run [Diagram: requests A and B; analyze queue; concat along 0-th dim; run batch; labels "Run if TTR OK", "Reject", "< 800ms"]
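The core of auto-batching, as the diagram describes it, is to concatenate queued inputs along dimension 0, run the model once, and split the output back per request. The sketch below shows that with nested lists standing in for tensors. The function `run_batched` and the toy model are assumptions, and the time-to-response logic is deliberately left out.

```python
def run_batched(model, requests):
    """Run several requests in one model call and split the results back.

    Each request is a list of rows, that is, a tensor whose 0-th
    dimension is the batch dimension.
    """
    sizes = [len(request) for request in requests]
    batch = [row for request in requests for row in request]  # concat on dim 0
    outputs = model(batch)  # a single run for all queued requests

    results, start = [], 0
    for size in sizes:
        results.append(outputs[start:start + size])
        start += size
    return results


calls = []


def toy_model(batch):
    calls.append(len(batch))  # record batch sizes to show there is one run
    return [sum(row) for row in batch]


request_a = [[1, 2]]          # from client A, batch size 1
request_b = [[3, 4], [5, 6]]  # from client B, batch size 2
result_a, result_b = run_batched(toy_model, [request_a, request_b])
print(result_a, result_b, calls)
```

Batching pays off on devices where one run over three rows costs little more than one run over a single row.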
  • 33. Roadmap: Lazy Loading • Lazy loading of backends: • Only load what you need (especially on GPU)
      AI.CONFIG LOADBACKEND TF CPU
      AI.CONFIG LOADBACKEND TF GPU
      AI.CONFIG LOADBACKEND TORCH CPU
      AI.CONFIG LOADBACKEND TORCH GPU
      ...
  • 34. Roadmap: Monitoring • Monitoring with AI.INFO: • more granular information on running operations, commands, memory
      AI.INFO MODEL key
      AI.INFO SCRIPT key
      AI.INFO ...
  • 35. Agenda: 1 Challenges of Deep Learning in Production: deep learning and challenges for production 2 Introducing RedisAI: API, architecture, operation 3 Roadmap: what’s next 4 Demo: live notebook
  • 36. Demo: PipelineAI + Redis + PyTorch + TensorFlow • End-to-End Deep Learning from Research to Production • Any Cloud, Any Framework, Any Hardware • Free Community Edition: https://community.pipeline.ai
  • 37. GitHub and DockerHub Repos • https://github.com/RedisAI/RedisAI • https://hub.docker.com/u/redisai
      docker run -p 6379:6379 -it --rm redisai/redisai