TITLE: "Creating and Deploying Models with Jupyter, Keras/TensorFlow 2.0 & RedisAI"
SPEAKER: Chris Fregly, Founder and CEO, PipelineAI, a Real-Time Machine Learning and Artificial Intelligence Startup based in San Francisco.
Linkedin: https://linkedin.com/in/cfregly/
ABSTRACT:
In this session, you will learn how to create a TensorFlow 2.0 model using Keras in Jupyter notebooks. You will also learn how to access models and interact with them using RedisAI.
Meetup: https://www.meetup.com/Advanced-Spark-and-TensorFlow-Meetup/events/261460514/
GitHub: https://github.com/PipelineAI/notebooks
Slides: https://www.slideshare.net/cfregly/pytorch-tensorflow-redisai-streams-advanced-spark-and-tensorflow-meetup-may-25-2019
Video: https://youtu.be/6RYdyU71Fl8
2. Acknowledgements
• [tensor]werk: Luca Antiga, Sherin Thomas, Rick Izzo, Pietro Rota
• RedisLabs: Guy Korland, Itamar Haber, Pieter Cailliau, Meir Shpilraien, Mark Nunberg, Ariel Madar
• Orobix: Manuela Bazzana, Daniele Ciriello, Lisa Lozza, Simone Manini, Alessandro Re
• Everyone!
3. Who Am I? (@cfregly)
• Founder @ PipelineAI: real-time machine learning and AI in production
• Former Databricks, Netflix
• Apache Spark contributor
• O’Reilly author: High Performance TensorFlow in Production
4. Who are you?
1. New to the Bay Area?
2. Familiar with Redis Modules?
3. Familiar with TensorFlow? PyTorch? Others?
4. Familiar with Kubernetes? KubeFlow?
5. Used to work at MapR?!
6. Wants to hire Antje Barth?
5. Agenda
1. Challenges of Deep Learning in Production: deep learning and challenges for production
2. Introducing RedisAI: API, architecture, operation
3. Roadmap: what’s next
4. Demo: live notebook
7. Model Servers
• Python code (Flask, Tornado, …)
• Execution services from cloud providers
  • AWS SageMaker
  • Google Cloud ML Server
• Runtimes
  • PipelineAI / Kubernetes
  • KubeFlow
  • TensorFlow Serving
  • NVIDIA TensorRT
  • MXNet Model Server
• Bespoke solutions (C++, Java, …)
8. Model Serving Infrastructure
• Must fit the technology stack: not just about languages, but about semantics, scalability, guarantees
• Run anywhere, at any size
• Composable building blocks
• Limit the number of moving parts
• Limit the amount of moving data
• Must make the best use of resources
9. Agenda
1. Challenges of Deep Learning in Production: deep learning and challenges for production
2. Introducing RedisAI: API, architecture, ops, client libs
3. Roadmap: what’s next
4. Demo: live notebook
10. RedisAI
• A model server inside of Redis
• Implemented as a Redis Module
• New data type: Tensor

[Diagram: Tensor In → TorchScript (e.g. def addsq(a, b): return (a + b)**2) dispatched to CPU / GPU0 / GPU1 / … → Tensor Out]
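The "Tensor In / Tensor Out" flow happens through RedisAI's tensor commands. As a minimal sketch, the helper below builds the argument list for an AI.TENSORSET call in its VALUES form (the helper name `tensorset_args` is an illustration, not part of any client library):

```python
def tensorset_args(key, dtype, shape, values):
    """Build the argument list for AI.TENSORSET using the VALUES form.

    RedisAI also accepts a BLOB form (raw bytes), which is preferable
    for large tensors; VALUES is shown here for readability.
    """
    return ["AI.TENSORSET", key, dtype,
            *[str(d) for d in shape],        # tensor shape, e.g. 2 2
            "VALUES", *[str(v) for v in values]]

# A 2x2 FLOAT tensor:
print(tensorset_args("foo", "FLOAT", (2, 2), [1, 2, 3, 4]))
# ['AI.TENSORSET', 'foo', 'FLOAT', '2', '2', 'VALUES', '1', '2', '3', '4']
```

The resulting tensor can then be read back with AI.TENSORGET, or fed to a model or script by key.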
11. Advantages of RedisAI Today
• Keeps the data local
• Runs everywhere Redis runs
• Runs multi-backend
• Stays language-independent
• Optimizes use of resources
• Keeps models hot
• HA with Sentinel, clustering

Examples: https://github.com/RedisAI/redisai-examples
18. API: Script
• AI.SCRIPTSET
• AI.SCRIPTRUN

AI.SCRIPTSET myadd2 GPU < addtwo.txt
AI.SCRIPTRUN myadd2 addtwo INPUTS foo OUTPUTS bar

addtwo.txt:
def addtwo(a, b):
    return a + b
19. Scripts?
• SCRIPT is a TorchScript interpreter
• Python-like syntax for tensor ops
• Works on CPU and GPU
• Vast library of tensor operations
• Allows computations to be prescribed directly (without exporting from a Python env, etc.)
• Pre-processing, post-processing (but not only)

Ref: https://pytorch.org/docs/stable/jit.html

[Diagram: in_key → SCRIPT → PT MODEL / TF MODEL → SCRIPT → out_key]
20. Architecture
• Tensors: framework-agnostic
• Queue + processing thread
• Backpressure
• Redis stays responsive
• Models are kept hot in memory
• Client blocks
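The queue + processing thread + backpressure design can be sketched in a few lines of Python. This is an illustration of the pattern, not RedisAI's actual implementation (which is in C); all names here are hypothetical:

```python
import queue
import threading

# A bounded queue provides backpressure: when the worker falls behind,
# producers block instead of exhausting memory.
requests = queue.Queue(maxsize=4)
results = {}

def worker():
    # Single processing thread: the event loop (Redis) stays responsive
    # while model runs happen here.
    while True:
        item = requests.get()
        if item is None:  # shutdown sentinel
            break
        req_id, payload = item
        results[req_id] = payload * 2  # stand-in for a MODELRUN
        requests.task_done()

t = threading.Thread(target=worker, daemon=True)
t.start()

for i in range(8):
    requests.put((i, i))  # blocks whenever more than 4 requests are pending
requests.join()           # the client blocks until its work is done
requests.put(None)
t.join()
print(results[3])  # 6
```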
21. Persistence
• RDB supported
• Append-Only File (AOF) coming soon
• Tensors are serialized (meta + blob)
• Models are serialized back into protobuf
• Scripts are serialized as strings
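The "meta + blob" idea for tensors can be illustrated with a hypothetical layout: a small header carrying dtype and shape, followed by the raw bytes. The actual RDB encoding differs; this only sketches the concept:

```python
import struct

def serialize_tensor(dtype_code, shape, blob):
    # Hypothetical meta + blob layout: dtype code, ndim, each dimension
    # as a little-endian int64, then the raw tensor bytes.
    meta = struct.pack(f"<BB{len(shape)}q", dtype_code, len(shape), *shape)
    return meta + blob

def deserialize_tensor(buf):
    dtype_code, ndim = struct.unpack_from("<BB", buf)
    shape = struct.unpack_from(f"<{ndim}q", buf, 2)
    blob = buf[2 + 8 * ndim:]
    return dtype_code, shape, blob

payload = struct.pack("<4f", 1.0, 2.0, 3.0, 4.0)      # a 2x2 FLOAT blob
buf = serialize_tensor(0, (2, 2), payload)
print(deserialize_tensor(buf) == (0, (2, 2), payload))  # True
```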
22. Replication
• Master-replica supported for all data types
• Right now, run commands are replicated too (post-conf: replication of results of computations where appropriate)
• Cluster supported; caveat: sharding models and scripts
• For the moment, use hash tags: {foo}resnet50 {foo}input1234
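Hash tags work because, per the Redis Cluster specification, only the non-empty substring between the first `{` and the following `}` is hashed when computing a key's slot, so {foo}resnet50 and {foo}input1234 both land on the shard for "foo". A sketch of that rule (the helper name `hash_tag` is illustrative):

```python
def hash_tag(key):
    # Redis Cluster rule: if the key contains a non-empty substring between
    # the first '{' and the first '}' after it, only that substring is
    # hashed when choosing the slot; otherwise the whole key is hashed.
    start = key.find("{")
    if start != -1:
        end = key.find("}", start + 1)
        if end > start + 1:  # tag must be non-empty
            return key[start + 1:end]
    return key

# Both keys hash on "foo", so model and input stay on the same shard:
print(hash_tag("{foo}resnet50"), hash_tag("{foo}input1234"))  # foo foo
```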
23. RedisAI Client Libs
• NOTE: any Redis client works right now
• JRedisAI: https://github.com/RedisAI/JRedisAI
• redisai-py: https://github.com/RedisAI/redisai-py
• Coming up: NodeJS, Go, … (community?)
25. Agenda
1. Challenges of Deep Learning in Production: deep learning and challenges for production
2. Introducing RedisAI: API, architecture, operation
3. Roadmap: what’s next
4. Demo: live notebook
29. Roadmap: DAGRUNASYNC

AI.DAGRUNASYNC
    SCRIPTRUN preproc normalize img ~in~
    MODELRUN resnet50 INPUTS ~in~ OUTPUTS ~out~
    SCRIPTRUN postproc probstolabel INPUTS ~out~ OUTPUTS label
=> 12634

AI.DAGRUNINFO 12634
=> RUNNING [... more details …]

AI.DAGRUNINFO 12634
=> DONE

• AI.DAGRUNASYNC is non-blocking and returns an ID
• The ID can be used to later retrieve status + keyspace notification
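The submit-then-poll pattern behind AI.DAGRUNASYNC / AI.DAGRUNINFO can be sketched in-process. This is only an illustration of the client-visible behavior (submit returns an ID immediately, status flips to DONE later); all names and the ID scheme are hypothetical:

```python
import itertools
import threading

_ids = itertools.count(12634)   # run IDs, starting at the slide's example
_status = {}
_threads = {}

def dagrun_async(dag):
    # Non-blocking: register the run, kick off a background worker,
    # and return the ID immediately.
    run_id = next(_ids)
    _status[run_id] = "RUNNING"

    def run():
        for step in dag:   # stand-ins for SCRIPTRUN / MODELRUN steps
            step()
        _status[run_id] = "DONE"

    t = threading.Thread(target=run)
    _threads[run_id] = t
    t.start()
    return run_id

def dagrun_info(run_id):
    return _status[run_id]

run_id = dagrun_async([lambda: None])
_threads[run_id].join()  # a real client would poll AI.DAGRUNINFO instead,
                         # or subscribe to a keyspace notification
print(run_id, dagrun_info(run_id))  # 12634 DONE
```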
30. Roadmap: Streams
• I stream, you stream, we all stream for Redis Streams!

AI.DAGRUN
    SCRIPTRUN preproc normalize instream ~in~
    MODELRUN resnet18 INPUTS ~in~ OUTPUTS ~out~
    SCRIPTRUN postproc probstolabel INPUTS ~out~ OUTPUTS outstream

AI.DAGRUNASYNC
    SCRIPTRUN preproc normalize instream ~in~
    MODELRUN resnet18 INPUTS ~in~ OUTPUTS ~out~
    SCRIPTRUN postproc probstolabel INPUTS ~out~ OUTPUTS outstream
31. Roadmap: ONNX Runtime
• Runtime from Microsoft
• ONNX
  • Exchange format for neural networks
  • Export from many frameworks (MXNet, CNTK, …)
• ONNX-ML
  • ONNX for machine learning models (RandomForest, SVM, k-means, etc.)
  • Export from scikit-learn
32. Roadmap: Auto-batching
• Transparent batching of requests
  • Queue gets rearranged according to other analogous requests in the queue and time to response
  • Only if different clients or async
• Time-to-response (TTR)
  • ~Predictable in DL graphs
  • Can reject a request if TTR > estimated time of queue + run

[Diagram: analyze queue → concat requests A and B along 0-th dim → run batch; run if TTR < 800ms, reject otherwise]
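The admission logic sketched on this slide, batch requests while the estimated time-to-response stays under the budget and reject the rest, could look like the following. This is a hypothetical sketch; `schedule` and its parameters are assumptions, not RedisAI's actual scheduler:

```python
def schedule(pending, run_estimate_ms, ttr_budget_ms):
    """Concatenate compatible requests along the 0-th dimension while the
    estimated time-to-response (queue so far + one run) fits the budget;
    reject the rest. `run_estimate_ms` is the (assumed ~predictable)
    per-run model time."""
    batch, rejected = [], []
    queued_ms = 0
    for request in pending:         # each request: a list of rows (0-th dim)
        ttr = queued_ms + run_estimate_ms
        if ttr <= ttr_budget_ms:
            batch.extend(request)   # concat along the 0-th dim
            queued_ms += run_estimate_ms
        else:
            rejected.append(request)
    return batch, rejected

a = [[1.0, 2.0]]
b = [[3.0, 4.0]]
c = [[5.0, 6.0]]
batch, rejected = schedule([a, b, c], run_estimate_ms=300, ttr_budget_ms=800)
print(batch)     # [[1.0, 2.0], [3.0, 4.0]]  -- a and b fit under 800ms
print(rejected)  # [[[5.0, 6.0]]]            -- c's estimated TTR is 900ms
```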
33. Roadmap: Lazy Loading
• Lazy loading of backends: only load what you need (especially on GPU)

AI.CONFIG LOADBACKEND TF CPU
AI.CONFIG LOADBACKEND TF GPU
AI.CONFIG LOADBACKEND TORCH CPU
AI.CONFIG LOADBACKEND TORCH GPU
...
34. Roadmap: Monitoring
• Monitoring with AI.INFO: more granular information on running operations, commands, memory

AI.INFO
AI.INFO MODEL key
AI.INFO SCRIPT key
...
35. Agenda
1. Challenges of Deep Learning in Production: deep learning and challenges for production
2. Introducing RedisAI: API, architecture, operation
3. Roadmap: what’s next
4. Demo: live notebook
36. Demo: PipelineAI + Redis + PyTorch + TensorFlow
• End-to-End Deep Learning from Research to Production
• Any Cloud, Any Framework, Any Hardware
• Free Community Edition: https://community.pipeline.ai