© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Model Serving for Deep Learning
©2018 Amazon Web Services, Inc. or its affiliates, All rights reserved
Adrian Hornsby, Technical Evangelist
@adhorn
What are we talking about?
AI
Machine Learning
Deep Learning
What is a Neural Net?
Predicting the price of a house with humans
Price
City
ZipCode
Life Quality
Parking
Size
# Room
Accessibility
Family Friendly
Predicting the price of a house with a neural network
Price
City
ZipCode
Life Quality
Parking
Size
# Room
Accessibility
Family Friendly
Input
Output
Discovered by the neural network
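A minimal sketch of the idea on this slide, assuming one reading of the diagram: raw inputs (such as Size, # Room, Parking) feed a hidden layer whose values stand in for the features "discovered by the neural network", and those feed a single Price output. The weights, biases and feature values below are made up for illustration; a real network would learn them from data.

```python
# Hypothetical two-layer network: raw features -> hidden features -> price.
# All numbers are illustrative; nothing here is learned from data.

def relu(x):
    """Rectified linear unit: pass positive values, clamp negatives to 0."""
    return max(0.0, x)

def dense(inputs, weights, biases, activation=None):
    """One fully connected layer: a weighted sum per output unit, plus a bias."""
    outputs = []
    for w_row, b in zip(weights, biases):
        total = sum(w * x for w, x in zip(w_row, inputs)) + b
        outputs.append(activation(total) if activation else total)
    return outputs

def predict_price(features, hidden_w, hidden_b, out_w, out_b):
    """Forward pass: the hidden layer plays the role of the discovered features."""
    hidden = dense(features, hidden_w, hidden_b, activation=relu)
    return dense(hidden, out_w, out_b)[0]

# Made-up inputs in the order [size, rooms, parking].
features = [2.0, 3.0, 1.0]
hidden_w = [[1.0, 0.0, 0.0],   # hidden unit 1 looks only at size
            [0.0, 1.0, -1.0]]  # hidden unit 2 combines rooms and parking
hidden_b = [0.0, 0.0]
out_w = [[10.0, 5.0]]
out_b = [1.0]

price = predict_price(features, hidden_w, hidden_b, out_w, out_b)
print(price)
```

Training would adjust `hidden_w`, `hidden_b`, `out_w` and `out_b` so that the predicted price matches known sale prices; the forward pass itself stays the same.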
Deep Learning – Neural Networks
Input Layer
Hidden Layers
Output Layer
Many More…
Deep Learning is a Big Deal
On some tasks it can outperform other ML techniques, and even humans
https://github.com/precedenceguo/mx-rcnn https://github.com/zhreshold/mxnet-yolo
CNN: Object Detection
https://github.com/tornadomeet/mxnet-face
CNN: Face Detection
PredNet: Prediction Networks
What comes next
https://coxlab.github.io/prednet/
CapsNet: Capsule Networks
Spatial Memory
https://arxiv.org/pdf/1710.09829v1.pdf
Long Short Term Memory Networks (LSTM)
https://github.com/awslabs/sockeye
Generative Adversarial Networks (GAN)
The future at work (already) today
Generating new "celebrity" faces
https://github.com/tkarras/progressive_growing_of_gans
Deep Learning at Amazon
Personalization
Logistics
Voice
Autonomous Vehicles
How do people "build" Neural Nets?
Model Zoos & Transfer Learning
• Full implementations of many state-of-the-art models
reported in the academic literature.
• Complete models, with scripts, pre-trained weights and
instructions on how to build and fine tune these models.
https://www.youtube.com/watch?v=qGotULKg8e0
• Over 10 million images from 300,000 hotels
• Fine-tuned a pre-trained Convolutional Neural Network
using 100,000 images
• Hotel descriptions now automatically feature the best
available images
Expedia
Ranking hotel images using deep learning
https://news.developer.nvidia.com/expedia-ranking-hotel-images-with-deep-learning/
So what does a deployed model look like?
Model
Model Server
Mobile
Desktop
IoT
Internet
Performance
Availability
Networking
Monitoring
Model Decoupling
Cross Framework
Cross Platform
The Undifferentiated Heavy Lifting of Model Serving
Model Serving Systems for Deep Learning
TensorFlow Serving
Model Server for MXNet
UC Berkeley Clipper
Model Archive
REST and OpenAPI
Containerized
ONNX Support
Operational Metrics
Model Archive
Trained Network
Model Signature
Custom Code
Auxiliary Assets
Model Export CLI
Model Archive
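The packaging step on this slide can be sketched as follows. This is an illustration only, assuming the archive is a single zip-style bundle; it is not the Model Export CLI, and every file name except `signature.json` (mentioned on the REST slide) is hypothetical.

```python
import io
import json
import zipfile

def build_archive(files):
    """Bundle a {name: bytes} mapping into an in-memory zip and return its bytes."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for name, data in files.items():
            zf.writestr(name, data)
    return buf.getvalue()

def list_archive(blob):
    """Return the sorted file names stored in an archive built above."""
    with zipfile.ZipFile(io.BytesIO(blob)) as zf:
        return sorted(zf.namelist())

# Hypothetical contents, one entry per box on the slide.
files = {
    "network.params": b"\x00\x01\x02",                       # trained network
    "signature.json": json.dumps({"inputs": []}).encode(),   # model signature
    "custom_service.py": b"# custom pre/post-processing\n",  # custom code
    "labels.txt": b"cat\ndog\n",                              # auxiliary assets
}

blob = build_archive(files)
print(list_archive(blob))
```

Shipping one file keeps the network, its signature and any custom code versioned together, which is what lets the server be decoupled from the model.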
REST and OpenAPI
REST-like endpoint: <model-name>/predict
Endpoint auto-generated from the model’s signature.json
JSON encoding by default
Binary input via request payload
OpenAPI support – client code-gen and tooling
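As a rough client-side sketch of the `<model-name>/predict` endpoint described above: the host, port, model name and input field name are assumptions for illustration, and the request is only constructed here, not sent.

```python
import json
import urllib.request

def build_predict_request(base_url, model_name, payload):
    """Build a JSON POST request for <base_url>/<model_name>/predict."""
    url = f"{base_url.rstrip('/')}/{model_name}/predict"
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Hypothetical server address, model name and input name.
req = build_predict_request(
    "http://localhost:8080/", "my-model", {"input0": [1.0, 2.0]}
)
print(req.get_method(), req.full_url)
```

Passing `req` to `urllib.request.urlopen` would send it to a running server. The actual input names and shapes come from the model's `signature.json`, so a real client would read them from there or from the generated OpenAPI description.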
Operational Metrics
Metrics
• Requests
• Latencies
• Resources
Dimensions
• Model Name
• Host Name
Target
• Log / CSV
• AWS CloudWatch
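To make the metric / dimension / target split concrete, here is a small sketch that counts requests, times them, tags each record with model and host name, and writes CSV. The class, column names and units are hypothetical and do not reflect how MMS emits metrics.

```python
import csv
import io
import time

class MetricsLog:
    """Collect request and latency metrics tagged with model and host name."""

    FIELDS = ["metric", "value", "unit", "model_name", "host_name"]

    def __init__(self, model_name, host_name):
        self.dimensions = {"model_name": model_name, "host_name": host_name}
        self.rows = []

    def record(self, metric, value, unit):
        """Store one data point together with the shared dimensions."""
        self.rows.append(
            {"metric": metric, "value": value, "unit": unit, **self.dimensions}
        )

    def timed(self, fn, *args):
        """Call fn(*args), recording one request and its latency in ms."""
        start = time.perf_counter()
        result = fn(*args)
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        self.record("Requests", 1, "count")
        self.record("Latency", elapsed_ms, "ms")
        return result

    def to_csv(self):
        """Render all recorded rows as CSV text (the 'Log / CSV' target)."""
        buf = io.StringIO()
        writer = csv.DictWriter(buf, fieldnames=self.FIELDS)
        writer.writeheader()
        writer.writerows(self.rows)
        return buf.getvalue()

# Hypothetical model and host names; the lambda stands in for inference.
log = MetricsLog("my-model", "host-1")
answer = log.timed(lambda x: x * 2, 21)
print(log.to_csv())
```

The same rows could be forwarded to a monitoring service instead of CSV; the dimensions are what allow filtering by model or by host afterwards.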
MMS
Dockerfile
Build
Push
Launch
Containerization
Container Cluster
MMS Container
MMS Container
MMS Container
MXNet NGINX
MXNet Model Server
Lightweight virtualization, isolation, runs anywhere
ONNX Support
(initiative driven by AWS, Facebook and Microsoft)
Many Frameworks: MXNet, Caffe2, PyTorch, TF, CNTK
Many Platforms: CoreML, TensorRT, NGraph, SNPE
O(n²) Pairs
ONNX: Common IR
Supported in MMS v0.2
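The arithmetic behind the O(n²) remark can be worked through with the names listed on the slide. This is a counting sketch only: without a common intermediate representation every framework/platform pair needs its own converter, while with one each side needs a single exporter or importer.

```python
def direct_converters(frameworks, platforms):
    """One converter per (framework, platform) pair."""
    return len(frameworks) * len(platforms)

def converters_via_ir(frameworks, platforms):
    """One exporter per framework plus one importer per platform."""
    return len(frameworks) + len(platforms)

# Names as listed on the slide.
frameworks = ["MXNet", "Caffe2", "PyTorch", "TF", "CNTK"]
platforms = ["CoreML", "TensorRT", "NGraph", "SNPE"]

pairwise = direct_converters(frameworks, platforms)
with_ir = converters_via_ir(frameworks, platforms)
print(pairwise, with_ir)
```

With 5 frameworks and 4 platforms that is 20 pairwise converters against 9 with a common IR, and the gap widens as either list grows.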
It’s Demo Time!
Open source – try it out and file issues
github.com/awslabs/mxnet-model-server
adhorn@amazon.com

Mais conteúdo relacionado

Mais procurados

MBL206_Building Conversational Bot Interfaces with Amazon Lex and AWS Mobile Hub
MBL206_Building Conversational Bot Interfaces with Amazon Lex and AWS Mobile HubMBL206_Building Conversational Bot Interfaces with Amazon Lex and AWS Mobile Hub
MBL206_Building Conversational Bot Interfaces with Amazon Lex and AWS Mobile HubAmazon Web Services
 
Introduction to AI services for Developers - Builders Day Israel
Introduction to AI services for Developers - Builders Day IsraelIntroduction to AI services for Developers - Builders Day Israel
Introduction to AI services for Developers - Builders Day IsraelAmazon Web Services
 
GPSBUS216-GPS Applying AI-ML to Find Security Needles in the Haystack
GPSBUS216-GPS Applying AI-ML to Find Security Needles in the HaystackGPSBUS216-GPS Applying AI-ML to Find Security Needles in the Haystack
GPSBUS216-GPS Applying AI-ML to Find Security Needles in the HaystackAmazon Web Services
 
NEW LAUNCH! Amazon Neptune Overview and Customer Use Cases - DAT319 - re:Inve...
NEW LAUNCH! Amazon Neptune Overview and Customer Use Cases - DAT319 - re:Inve...NEW LAUNCH! Amazon Neptune Overview and Customer Use Cases - DAT319 - re:Inve...
NEW LAUNCH! Amazon Neptune Overview and Customer Use Cases - DAT319 - re:Inve...Amazon Web Services
 
WPS204-Effective Emergency Response in AWS.pdf
WPS204-Effective Emergency Response in AWS.pdfWPS204-Effective Emergency Response in AWS.pdf
WPS204-Effective Emergency Response in AWS.pdfAmazon Web Services
 
IOT311_Customer Stories of Things, Cloud, and Analytics on AWS
IOT311_Customer Stories of Things, Cloud, and Analytics on AWSIOT311_Customer Stories of Things, Cloud, and Analytics on AWS
IOT311_Customer Stories of Things, Cloud, and Analytics on AWSAmazon Web Services
 
Enabling Big Data Computing at Pfizer with AWS Service Catalog and AWS Lambda...
Enabling Big Data Computing at Pfizer with AWS Service Catalog and AWS Lambda...Enabling Big Data Computing at Pfizer with AWS Service Catalog and AWS Lambda...
Enabling Big Data Computing at Pfizer with AWS Service Catalog and AWS Lambda...Amazon Web Services
 
MCL303-Deep Learning with Apache MXNet and Gluon
MCL303-Deep Learning with Apache MXNet and GluonMCL303-Deep Learning with Apache MXNet and Gluon
MCL303-Deep Learning with Apache MXNet and GluonAmazon Web Services
 
AI and IoT innovation - an industry focus
AI and IoT innovation - an industry focusAI and IoT innovation - an industry focus
AI and IoT innovation - an industry focusAmazon Web Services
 
EUT303_Modernizing the Energy and Utilities Industry with IoT Moving SCADA to...
EUT303_Modernizing the Energy and Utilities Industry with IoT Moving SCADA to...EUT303_Modernizing the Energy and Utilities Industry with IoT Moving SCADA to...
EUT303_Modernizing the Energy and Utilities Industry with IoT Moving SCADA to...Amazon Web Services
 
NEW LAUNCH! Realtime and Offline application development using GraphQL with A...
NEW LAUNCH! Realtime and Offline application development using GraphQL with A...NEW LAUNCH! Realtime and Offline application development using GraphQL with A...
NEW LAUNCH! Realtime and Offline application development using GraphQL with A...Amazon Web Services
 
How can your business benefit from going Serverless
How can your business benefit from going ServerlessHow can your business benefit from going Serverless
How can your business benefit from going ServerlessAmazon Web Services
 
GPSBUS223-Starting Out with the AWS Partner Network
GPSBUS223-Starting Out with the AWS Partner NetworkGPSBUS223-Starting Out with the AWS Partner Network
GPSBUS223-Starting Out with the AWS Partner NetworkAmazon Web Services
 
BAP307_Use Amazon Lex to Build a Customer Service Chatbot in Your Amazon Conn...
BAP307_Use Amazon Lex to Build a Customer Service Chatbot in Your Amazon Conn...BAP307_Use Amazon Lex to Build a Customer Service Chatbot in Your Amazon Conn...
BAP307_Use Amazon Lex to Build a Customer Service Chatbot in Your Amazon Conn...Amazon Web Services
 
GPSBUS213-Success in the Public Sector Market
GPSBUS213-Success in the Public Sector MarketGPSBUS213-Success in the Public Sector Market
GPSBUS213-Success in the Public Sector MarketAmazon Web Services
 
AWS reInvent Recap 線上研討會
AWS reInvent Recap 線上研討會AWS reInvent Recap 線上研討會
AWS reInvent Recap 線上研討會Amazon Web Services
 

Mais procurados (20)

MBL206_Building Conversational Bot Interfaces with Amazon Lex and AWS Mobile Hub
MBL206_Building Conversational Bot Interfaces with Amazon Lex and AWS Mobile HubMBL206_Building Conversational Bot Interfaces with Amazon Lex and AWS Mobile Hub
MBL206_Building Conversational Bot Interfaces with Amazon Lex and AWS Mobile Hub
 
IOT315_AWS IoT Rules Engine
IOT315_AWS IoT Rules EngineIOT315_AWS IoT Rules Engine
IOT315_AWS IoT Rules Engine
 
Introduction to AI services for Developers - Builders Day Israel
Introduction to AI services for Developers - Builders Day IsraelIntroduction to AI services for Developers - Builders Day Israel
Introduction to AI services for Developers - Builders Day Israel
 
GPSBUS216-GPS Applying AI-ML to Find Security Needles in the Haystack
GPSBUS216-GPS Applying AI-ML to Find Security Needles in the HaystackGPSBUS216-GPS Applying AI-ML to Find Security Needles in the Haystack
GPSBUS216-GPS Applying AI-ML to Find Security Needles in the Haystack
 
NEW LAUNCH! Amazon Neptune Overview and Customer Use Cases - DAT319 - re:Inve...
NEW LAUNCH! Amazon Neptune Overview and Customer Use Cases - DAT319 - re:Inve...NEW LAUNCH! Amazon Neptune Overview and Customer Use Cases - DAT319 - re:Inve...
NEW LAUNCH! Amazon Neptune Overview and Customer Use Cases - DAT319 - re:Inve...
 
MCL335_Rhythm
MCL335_RhythmMCL335_Rhythm
MCL335_Rhythm
 
WPS204-Effective Emergency Response in AWS.pdf
WPS204-Effective Emergency Response in AWS.pdfWPS204-Effective Emergency Response in AWS.pdf
WPS204-Effective Emergency Response in AWS.pdf
 
IOT311_Customer Stories of Things, Cloud, and Analytics on AWS
IOT311_Customer Stories of Things, Cloud, and Analytics on AWSIOT311_Customer Stories of Things, Cloud, and Analytics on AWS
IOT311_Customer Stories of Things, Cloud, and Analytics on AWS
 
ALX328_Smart Devices Everywhere
ALX328_Smart Devices EverywhereALX328_Smart Devices Everywhere
ALX328_Smart Devices Everywhere
 
Enabling Big Data Computing at Pfizer with AWS Service Catalog and AWS Lambda...
Enabling Big Data Computing at Pfizer with AWS Service Catalog and AWS Lambda...Enabling Big Data Computing at Pfizer with AWS Service Catalog and AWS Lambda...
Enabling Big Data Computing at Pfizer with AWS Service Catalog and AWS Lambda...
 
MCL303-Deep Learning with Apache MXNet and Gluon
MCL303-Deep Learning with Apache MXNet and GluonMCL303-Deep Learning with Apache MXNet and Gluon
MCL303-Deep Learning with Apache MXNet and Gluon
 
AI and IoT innovation - an industry focus
AI and IoT innovation - an industry focusAI and IoT innovation - an industry focus
AI and IoT innovation - an industry focus
 
EUT303_Modernizing the Energy and Utilities Industry with IoT Moving SCADA to...
EUT303_Modernizing the Energy and Utilities Industry with IoT Moving SCADA to...EUT303_Modernizing the Energy and Utilities Industry with IoT Moving SCADA to...
EUT303_Modernizing the Energy and Utilities Industry with IoT Moving SCADA to...
 
NEW LAUNCH! Realtime and Offline application development using GraphQL with A...
NEW LAUNCH! Realtime and Offline application development using GraphQL with A...NEW LAUNCH! Realtime and Offline application development using GraphQL with A...
NEW LAUNCH! Realtime and Offline application development using GraphQL with A...
 
How can your business benefit from going Serverless
How can your business benefit from going ServerlessHow can your business benefit from going Serverless
How can your business benefit from going Serverless
 
GPSBUS223-Starting Out with the AWS Partner Network
GPSBUS223-Starting Out with the AWS Partner NetworkGPSBUS223-Starting Out with the AWS Partner Network
GPSBUS223-Starting Out with the AWS Partner Network
 
BAP307_Use Amazon Lex to Build a Customer Service Chatbot in Your Amazon Conn...
BAP307_Use Amazon Lex to Build a Customer Service Chatbot in Your Amazon Conn...BAP307_Use Amazon Lex to Build a Customer Service Chatbot in Your Amazon Conn...
BAP307_Use Amazon Lex to Build a Customer Service Chatbot in Your Amazon Conn...
 
GPSBUS213-Success in the Public Sector Market
GPSBUS213-Success in the Public Sector MarketGPSBUS213-Success in the Public Sector Market
GPSBUS213-Success in the Public Sector Market
 
MAE301_Boom for your Buck
MAE301_Boom for your BuckMAE301_Boom for your Buck
MAE301_Boom for your Buck
 
AWS reInvent Recap 線上研討會
AWS reInvent Recap 線上研討會AWS reInvent Recap 線上研討會
AWS reInvent Recap 線上研討會
 

Semelhante a Model Serving for Deep Learning

Deep learning systems model serving
Deep learning systems   model servingDeep learning systems   model serving
Deep learning systems model servingHagay Lupesko
 
Innovations fueled by IoT and the Cloud
Innovations fueled by IoT and the CloudInnovations fueled by IoT and the Cloud
Innovations fueled by IoT and the CloudAdrian Hornsby
 
Devoxx: Building AI-powered applications on AWS
Devoxx: Building AI-powered applications on AWSDevoxx: Building AI-powered applications on AWS
Devoxx: Building AI-powered applications on AWSAdrian Hornsby
 
Model Serving for Deep Learning with MXNet Model Server
Model Serving for Deep Learning with MXNet Model ServerModel Serving for Deep Learning with MXNet Model Server
Model Serving for Deep Learning with MXNet Model ServerAmazon Web Services
 
Technological Accelerants for Organizational Transformation - DVC303 - re:Inv...
Technological Accelerants for Organizational Transformation - DVC303 - re:Inv...Technological Accelerants for Organizational Transformation - DVC303 - re:Inv...
Technological Accelerants for Organizational Transformation - DVC303 - re:Inv...Amazon Web Services
 
DVC303-Technological Accelerants for Organizational Transformation
DVC303-Technological Accelerants for Organizational TransformationDVC303-Technological Accelerants for Organizational Transformation
DVC303-Technological Accelerants for Organizational TransformationAmazon Web Services
 
Maschinelles Lernen auf AWS für Entwickler, Data Scientists und Experten
Maschinelles Lernen auf AWS für Entwickler, Data Scientists und ExpertenMaschinelles Lernen auf AWS für Entwickler, Data Scientists und Experten
Maschinelles Lernen auf AWS für Entwickler, Data Scientists und ExpertenAWS Germany
 
Artificial Intelligence (Machine Learning) on AWS: How to Start
Artificial Intelligence (Machine Learning) on AWS: How to StartArtificial Intelligence (Machine Learning) on AWS: How to Start
Artificial Intelligence (Machine Learning) on AWS: How to StartVladimir Simek
 
Artificial Intelligence (Machine Learning) on AWS: How to Start
Artificial Intelligence (Machine Learning) on AWS: How to StartArtificial Intelligence (Machine Learning) on AWS: How to Start
Artificial Intelligence (Machine Learning) on AWS: How to StartVladimir Simek
 
AI / ML Services - re:Invent Comes to London 2.0
AI / ML Services - re:Invent Comes to London 2.0AI / ML Services - re:Invent Comes to London 2.0
AI / ML Services - re:Invent Comes to London 2.0Amazon Web Services
 
GPSTEC305-Machine Learning in Capital Markets
GPSTEC305-Machine Learning in Capital MarketsGPSTEC305-Machine Learning in Capital Markets
GPSTEC305-Machine Learning in Capital MarketsAmazon Web Services
 
NEW LAUNCH! Push Intelligence to the edge with Greengrass - IOT209 - re:Inven...
NEW LAUNCH! Push Intelligence to the edge with Greengrass - IOT209 - re:Inven...NEW LAUNCH! Push Intelligence to the edge with Greengrass - IOT209 - re:Inven...
NEW LAUNCH! Push Intelligence to the edge with Greengrass - IOT209 - re:Inven...Amazon Web Services
 
What is deep learning (and why you should care) - Talk at SJSU Oct 2018
What is deep learning (and why you should care) - Talk at SJSU Oct 2018What is deep learning (and why you should care) - Talk at SJSU Oct 2018
What is deep learning (and why you should care) - Talk at SJSU Oct 2018Hagay Lupesko
 
Accelerating Apache MXNet Models on Apple Platforms Using Core ML - MCL311 - ...
Accelerating Apache MXNet Models on Apple Platforms Using Core ML - MCL311 - ...Accelerating Apache MXNet Models on Apple Platforms Using Core ML - MCL311 - ...
Accelerating Apache MXNet Models on Apple Platforms Using Core ML - MCL311 - ...Amazon Web Services
 
GPSTEC201_Building an Artificial Intelligence Practice for Consulting Partners
GPSTEC201_Building an Artificial Intelligence Practice for Consulting PartnersGPSTEC201_Building an Artificial Intelligence Practice for Consulting Partners
GPSTEC201_Building an Artificial Intelligence Practice for Consulting PartnersAmazon Web Services
 
Reactive Architectures with Microservices
Reactive Architectures with MicroservicesReactive Architectures with Microservices
Reactive Architectures with MicroservicesAWS Germany
 
CON203_Driving Innovation with Containers
CON203_Driving Innovation with ContainersCON203_Driving Innovation with Containers
CON203_Driving Innovation with ContainersAmazon Web Services
 
Driving Innovation with Containers - CON203 - re:Invent 2017
Driving Innovation with Containers - CON203 - re:Invent 2017Driving Innovation with Containers - CON203 - re:Invent 2017
Driving Innovation with Containers - CON203 - re:Invent 2017Amazon Web Services
 
CMP314_Bringing Deep Learning to the Cloud with Amazon EC2
CMP314_Bringing Deep Learning to the Cloud with Amazon EC2CMP314_Bringing Deep Learning to the Cloud with Amazon EC2
CMP314_Bringing Deep Learning to the Cloud with Amazon EC2Amazon Web Services
 

Semelhante a Model Serving for Deep Learning (20)

Deep learning systems model serving
Deep learning systems   model servingDeep learning systems   model serving
Deep learning systems model serving
 
Innovations fueled by IoT and the Cloud
Innovations fueled by IoT and the CloudInnovations fueled by IoT and the Cloud
Innovations fueled by IoT and the Cloud
 
Devoxx: Building AI-powered applications on AWS
Devoxx: Building AI-powered applications on AWSDevoxx: Building AI-powered applications on AWS
Devoxx: Building AI-powered applications on AWS
 
Model Serving for Deep Learning with MXNet Model Server
Model Serving for Deep Learning with MXNet Model ServerModel Serving for Deep Learning with MXNet Model Server
Model Serving for Deep Learning with MXNet Model Server
 
Technological Accelerants for Organizational Transformation - DVC303 - re:Inv...
Technological Accelerants for Organizational Transformation - DVC303 - re:Inv...Technological Accelerants for Organizational Transformation - DVC303 - re:Inv...
Technological Accelerants for Organizational Transformation - DVC303 - re:Inv...
 
DVC303-Technological Accelerants for Organizational Transformation
DVC303-Technological Accelerants for Organizational TransformationDVC303-Technological Accelerants for Organizational Transformation
DVC303-Technological Accelerants for Organizational Transformation
 
Moving Forward with AI
Moving Forward with AIMoving Forward with AI
Moving Forward with AI
 
Maschinelles Lernen auf AWS für Entwickler, Data Scientists und Experten
Maschinelles Lernen auf AWS für Entwickler, Data Scientists und ExpertenMaschinelles Lernen auf AWS für Entwickler, Data Scientists und Experten
Maschinelles Lernen auf AWS für Entwickler, Data Scientists und Experten
 
Artificial Intelligence (Machine Learning) on AWS: How to Start
Artificial Intelligence (Machine Learning) on AWS: How to StartArtificial Intelligence (Machine Learning) on AWS: How to Start
Artificial Intelligence (Machine Learning) on AWS: How to Start
 
Artificial Intelligence (Machine Learning) on AWS: How to Start
Artificial Intelligence (Machine Learning) on AWS: How to StartArtificial Intelligence (Machine Learning) on AWS: How to Start
Artificial Intelligence (Machine Learning) on AWS: How to Start
 
AI / ML Services - re:Invent Comes to London 2.0
AI / ML Services - re:Invent Comes to London 2.0AI / ML Services - re:Invent Comes to London 2.0
AI / ML Services - re:Invent Comes to London 2.0
 
GPSTEC305-Machine Learning in Capital Markets
GPSTEC305-Machine Learning in Capital MarketsGPSTEC305-Machine Learning in Capital Markets
GPSTEC305-Machine Learning in Capital Markets
 
NEW LAUNCH! Push Intelligence to the edge with Greengrass - IOT209 - re:Inven...
NEW LAUNCH! Push Intelligence to the edge with Greengrass - IOT209 - re:Inven...NEW LAUNCH! Push Intelligence to the edge with Greengrass - IOT209 - re:Inven...
NEW LAUNCH! Push Intelligence to the edge with Greengrass - IOT209 - re:Inven...
 
What is deep learning (and why you should care) - Talk at SJSU Oct 2018
What is deep learning (and why you should care) - Talk at SJSU Oct 2018What is deep learning (and why you should care) - Talk at SJSU Oct 2018
What is deep learning (and why you should care) - Talk at SJSU Oct 2018
 
Accelerating Apache MXNet Models on Apple Platforms Using Core ML - MCL311 - ...
Accelerating Apache MXNet Models on Apple Platforms Using Core ML - MCL311 - ...Accelerating Apache MXNet Models on Apple Platforms Using Core ML - MCL311 - ...
Accelerating Apache MXNet Models on Apple Platforms Using Core ML - MCL311 - ...
 
GPSTEC201_Building an Artificial Intelligence Practice for Consulting Partners
GPSTEC201_Building an Artificial Intelligence Practice for Consulting PartnersGPSTEC201_Building an Artificial Intelligence Practice for Consulting Partners
GPSTEC201_Building an Artificial Intelligence Practice for Consulting Partners
 
Reactive Architectures with Microservices
Reactive Architectures with MicroservicesReactive Architectures with Microservices
Reactive Architectures with Microservices
 
CON203_Driving Innovation with Containers
CON203_Driving Innovation with ContainersCON203_Driving Innovation with Containers
CON203_Driving Innovation with Containers
 
Driving Innovation with Containers - CON203 - re:Invent 2017
Driving Innovation with Containers - CON203 - re:Invent 2017Driving Innovation with Containers - CON203 - re:Invent 2017
Driving Innovation with Containers - CON203 - re:Invent 2017
 
CMP314_Bringing Deep Learning to the Cloud with Amazon EC2
CMP314_Bringing Deep Learning to the Cloud with Amazon EC2CMP314_Bringing Deep Learning to the Cloud with Amazon EC2
CMP314_Bringing Deep Learning to the Cloud with Amazon EC2
 

Mais de Adrian Hornsby

Chaos Engineering: Why Breaking Things Should Be Practised.
Chaos Engineering: Why Breaking Things Should Be Practised.Chaos Engineering: Why Breaking Things Should Be Practised.
Chaos Engineering: Why Breaking Things Should Be Practised.Adrian Hornsby
 
AI in Finance: Moving forward!
AI in Finance: Moving forward!AI in Finance: Moving forward!
AI in Finance: Moving forward!Adrian Hornsby
 
Building a Multi-Region, Active-Active Serverless Backends.
Building a Multi-Region, Active-Active Serverless Backends.Building a Multi-Region, Active-Active Serverless Backends.
Building a Multi-Region, Active-Active Serverless Backends.Adrian Hornsby
 
Moving Forward with AI
Moving Forward with AIMoving Forward with AI
Moving Forward with AIAdrian Hornsby
 
AI: State of the Union
AI: State of the UnionAI: State of the Union
AI: State of the UnionAdrian Hornsby
 
re:Invent re:Cap - An overview of Artificial Intelligence and Machine Learnin...
re:Invent re:Cap - An overview of Artificial Intelligence and Machine Learnin...re:Invent re:Cap - An overview of Artificial Intelligence and Machine Learnin...
re:Invent re:Cap - An overview of Artificial Intelligence and Machine Learnin...Adrian Hornsby
 
re:Invent re:Cap - Big Data & IoT at Any Scale
re:Invent re:Cap - Big Data & IoT at Any Scalere:Invent re:Cap - Big Data & IoT at Any Scale
re:Invent re:Cap - Big Data & IoT at Any ScaleAdrian Hornsby
 
Innovations and the Cloud
Innovations and the CloudInnovations and the Cloud
Innovations and the CloudAdrian Hornsby
 
Serverless in Action on AWS
Serverless in Action on AWSServerless in Action on AWS
Serverless in Action on AWSAdrian Hornsby
 
Innovations and The Cloud
Innovations and The CloudInnovations and The Cloud
Innovations and The CloudAdrian Hornsby
 
10 Lessons from 10 Years of AWS
10 Lessons from 10 Years of AWS10 Lessons from 10 Years of AWS
10 Lessons from 10 Years of AWSAdrian Hornsby
 
Developing Sophisticated Serverless Applications with AI
Developing Sophisticated Serverless Applications with AIDeveloping Sophisticated Serverless Applications with AI
Developing Sophisticated Serverless Applications with AIAdrian Hornsby
 
AWS Startup Day Bangalore: Being Well-Architected in the Cloud
AWS Startup Day Bangalore: Being Well-Architected in the CloudAWS Startup Day Bangalore: Being Well-Architected in the Cloud
AWS Startup Day Bangalore: Being Well-Architected in the CloudAdrian Hornsby
 
Journey Towards Scaling Your API to 10 Million Users
Journey Towards Scaling Your API to 10 Million UsersJourney Towards Scaling Your API to 10 Million Users
Journey Towards Scaling Your API to 10 Million UsersAdrian Hornsby
 
AWSome Day - Opening Keynote
AWSome Day - Opening KeynoteAWSome Day - Opening Keynote
AWSome Day - Opening KeynoteAdrian Hornsby
 
Building AI-powered Serverless Applications on AWS
Building AI-powered Serverless Applications on AWSBuilding AI-powered Serverless Applications on AWS
Building AI-powered Serverless Applications on AWSAdrian Hornsby
 
AWS Batch: Simplifying batch computing in the cloud
AWS Batch: Simplifying batch computing in the cloudAWS Batch: Simplifying batch computing in the cloud
AWS Batch: Simplifying batch computing in the cloudAdrian Hornsby
 
Being Well Architected in the Cloud (Updated)
Being Well Architected in the Cloud (Updated)Being Well Architected in the Cloud (Updated)
Being Well Architected in the Cloud (Updated)Adrian Hornsby
 
Deep Dive on Object Storage: Amazon S3 and Amazon Glacier
Deep Dive on Object Storage: Amazon S3 and Amazon GlacierDeep Dive on Object Storage: Amazon S3 and Amazon Glacier
Deep Dive on Object Storage: Amazon S3 and Amazon GlacierAdrian Hornsby
 
Serverless Streaming Data Processing using Amazon Kinesis Analytics
Serverless Streaming Data Processing using Amazon Kinesis AnalyticsServerless Streaming Data Processing using Amazon Kinesis Analytics
Serverless Streaming Data Processing using Amazon Kinesis AnalyticsAdrian Hornsby
 

Mais de Adrian Hornsby (20)

Chaos Engineering: Why Breaking Things Should Be Practised.
Chaos Engineering: Why Breaking Things Should Be Practised.Chaos Engineering: Why Breaking Things Should Be Practised.
Chaos Engineering: Why Breaking Things Should Be Practised.
 
AI in Finance: Moving forward!
AI in Finance: Moving forward!AI in Finance: Moving forward!
AI in Finance: Moving forward!
 
Building a Multi-Region, Active-Active Serverless Backends.
Building a Multi-Region, Active-Active Serverless Backends.Building a Multi-Region, Active-Active Serverless Backends.
Building a Multi-Region, Active-Active Serverless Backends.
 
Moving Forward with AI
Moving Forward with AIMoving Forward with AI
Moving Forward with AI
 
AI: State of the Union
AI: State of the UnionAI: State of the Union
AI: State of the Union
 
re:Invent re:Cap - An overview of Artificial Intelligence and Machine Learnin...
re:Invent re:Cap - An overview of Artificial Intelligence and Machine Learnin...re:Invent re:Cap - An overview of Artificial Intelligence and Machine Learnin...
re:Invent re:Cap - An overview of Artificial Intelligence and Machine Learnin...
 
re:Invent re:Cap - Big Data & IoT at Any Scale
re:Invent re:Cap - Big Data & IoT at Any Scalere:Invent re:Cap - Big Data & IoT at Any Scale
re:Invent re:Cap - Big Data & IoT at Any Scale
 
Innovations and the Cloud
Innovations and the CloudInnovations and the Cloud
Innovations and the Cloud
 
Serverless in Action on AWS
Serverless in Action on AWSServerless in Action on AWS
Serverless in Action on AWS
 
Innovations and The Cloud
Innovations and The CloudInnovations and The Cloud
Innovations and The Cloud
 
10 Lessons from 10 Years of AWS



Model Serving for Deep Learning

  • 1. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Model Serving for Deep Learning ©2018 Amazon Web Services, Inc. or its affiliates, All rights reserved Adrian Hornsby, Technical Evangelist @adhorn
  • 2. What are we talking about? AI Machine Learning Deep Learning
  • 3. What is a Neural Net?
  • 4. Predicting the price of a house with humans Price City ZipCode Life Quality Parking Size # Room Accessibility Family Friendly
  • 5. Predicting the price of a house with neural network Price City ZipCode Life Quality Parking Size # Room Accessibility Family Friendly Input Output Discovered by the neural network
  • 6. Deep Learning – Neural Networks Output Layer Input Layer Hidden Layers Many More…
  • 7. Deep Learning is a Big Deal It’s able to do better than other ML and Humans
  • 8. https://github.com/precedenceguo/mx-rcnn https://github.com/zhreshold/mxnet-yolo CNN: Object Detection
  • 9. https://github.com/tornadomeet/mxnet-face CNN: Face Detection
  • 10. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
  • 11. PredNet: Prediction Networks What comes next https://coxlab.github.io/prednet/
  • 12. CapsNet: Capsule Networks Spatial Memory https://arxiv.org/pdf/1710.09829v1.pdf
  • 13. Long Short Term Memory Networks (LSTM) https://github.com/awslabs/sockeye
  • 14. Generative Adversarial Networks (GAN) The future at work (already) today Generating new “celebrity” faces https://github.com/tkarras/progressive_growing_of_gans
  • 15. Personalization Logistics Voice Autonomous Vehicles Deep Learning at Amazon
  • 16. How do people “build” Neural Nets?
  • 17. Model Zoos & Transfer Learning • Full implementations of many state-of-the-art models reported in the academic literature. • Complete models, with scripts, pre-trained weights and instructions on how to build and fine-tune these models.
  • 18. https://www.youtube.com/watch?v=qGotULKg8e0 • Over 10 million images from 300,000 hotels • Fine-tuned a pre-trained Convolutional Neural Network using 100,000 images • Hotel descriptions now automatically feature the best available images Expedia Ranking hotel images using deep learning https://news.developer.nvidia.com/expedia-ranking-hotel-images-with-deep-learning/
  • 19. So what does a deployed model look like?
  • 20. Model Model Server Mobile Desktop IoT Internet
  • 21. Performance Availability Networking Monitoring Model Decoupling Cross Framework Cross Platform The Undifferentiated Heavy Lifting of Model Serving
  • 22. TensorFlow Serving Model Server for MXNet UC Berkeley Clipper Model Serving Systems for Deep Learning
  • 23. Model Archive REST and OpenAPI Containerized ONNX Support Operational Metrics
  • 24. Trained Network Model Signature Custom Code Auxiliary Assets Model Archive Model Export CLI Model Archive Back
  • 25. REST and OpenAPI REST-like endpoint: <model-name>/predict Endpoint auto-generated from the model’s signature.json JSON encoding by default Binary input via request payload OpenAPI support: client code-gen and tooling Back
  • 26. • Requests • Latencies • Resources Metrics • Model Name • Host Name Dimensions • Log / CSV • AWS CloudWatch Target Operational Metrics Back
  • 27. MMS Dockerfile Build Push Launch Containerization Container Cluster MMS Container MMS Container MMS Container MXNet NGINX MXNet Model Server Lightweight virtualization, isolation, runs anywhere Back
  • 28. O(n^2) Pairs MXNet Caffe2 PyTorch TF CNTK CoreML TensorRT NGraph SNPE Many Frameworks ONNX Support (initiative driven by AWS, Facebook and Microsoft) Many Platforms ONNX: Common IR Supported in MMS v0.2
  • 29. It’s Demo Time!
  • 30. Open source: try it out and file issues github.com/awslabs/mxnet-model-server adhorn@amazon.com

Editor's Notes

  1. Hi everyone! My name is Adrian Hornsby, I’m a technical evangelist at AWS, and one of my focus areas is AI, especially Deep Learning. Today I’m going to talk about model serving. It’s a super interesting domain within Deep Learning, and I hope you will enjoy learning more about it. If you want to chat more, I’ll be here after the talk, so feel free to drop by!
  2. With a show of hands: how many of you know what Deep Learning is? How many have ever implemented a neural network? How many have deployed one to production? OK, so we have fair knowledge of DL. In this talk we will not dive into the details of DNNs, since this is not the topic of this talk, nor do we have the time. But I will briefly discuss it to set the right context. So Deep Learning is a field within Machine Learning, which is itself a field within AI. AI is the set of techniques that enables computers to mimic, and surpass, human intelligence. ML is a subset of AI: the set of mostly statistical techniques that enables computers to improve with experience, hence “learning”. DL is a subset of ML, a technique inspired by the human brain, or neurons to be more exact, that uses interconnected artificial neurons to learn from samples.
  3. So at the base of Deep Learning we have the Neural Network. Let’s briefly see what these networks look like. A neural network in its simplest form is composed of layers, each consisting of a set of neurons, that are interconnected across layers with weighted edges. The term “deep learning” was coined because these networks have many hidden layers, which makes them “deep”. The network takes the input vector, matrix, or more generally tensor, and feeds every element of the input into a unit in the input layer. From there the computation cascades across the units and layers, until we get an output in the output layer. Neural networks are non-linear functions, and can learn non-linear features, because the activation function in each neuron is non-linear. They enable learning features in a hierarchical way, with each layer learning features that build on the features learned in the previous layer. And very importantly: it is a scalable architecture that can be made more complex, with more learning capacity, by enlarging the network and/or modifying the operators in the neurons. It is also typically very heavy computationally. A modern network such as ResNet-152, which has 152 layers, requires about 11 GFLOPs (billions of floating-point operations) for a single forward pass.
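
The cascade described in this note can be sketched in a few lines of plain Python. This is only an illustration: the layer sizes, weights, biases and the choice of ReLU are made-up assumptions, not values from any model mentioned in the talk.

```python
# Minimal forward pass through a tiny fully connected network.
# All weights, biases and layer sizes below are made-up illustrative values.

def relu(x):
    # Non-linear activation: negative values are clipped to zero.
    return max(0.0, x)

def identity(x):
    # Linear "activation" used for the output layer in this sketch.
    return x

def dense(inputs, weights, biases, activation):
    # One layer: each unit takes a weighted sum of all inputs, adds a bias,
    # then applies the activation function.
    return [
        activation(sum(w * x for w, x in zip(unit_weights, inputs)) + b)
        for unit_weights, b in zip(weights, biases)
    ]

def forward(inputs, layers):
    # Feed the input through every layer in order; each layer consumes
    # the output of the previous one.
    activations = inputs
    for weights, biases, activation in layers:
        activations = dense(activations, weights, biases, activation)
    return activations

# Hypothetical network: 2 inputs -> 2 hidden units (ReLU) -> 1 output (linear).
LAYERS = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0], relu),
    ([[1.0, 2.0]], [0.5], identity),
]

output = forward([2.0, 1.0], LAYERS)
```

A real network such as ResNet-152 follows the same pattern with far more layers and units, which is where the computational cost of a single forward pass comes from.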
  4. Beyond the growing usage of DL in applications and devices around us, there is another interesting aspect to deep learning, and that is how well it does compared to the dominant species on this planet: us! One of the first areas where Deep Learning was able to demonstrate state-of-the-art results was Machine Vision. A classical problem in that domain is Object Classification: given an image, identify the most prominent object in that image out of a set of pre-defined classes. A DNN presented in 2012 by Alex Krizhevsky was able to leap-frog the best known algorithm at the time by over 30%. That was really a major leap, and every year since then the best algorithms for Object Classification, and many other Vision tasks, have been based on Deep Learning, with results that keep getting better. A 2017 research paper by Geirhos shows that DNNs already outperform humans in Object Classification, a task we humans have been programmed to specialize in by evolution. The paper also shows that human vision still holds up better than DNNs once noise is introduced. It may make you feel better; it worked for me. AlexNet paper: https://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf Humans vs DNNs paper: https://arxiv.org/pdf/1706.06969.pdf
  5. PredNet is a deep convolutional recurrent neural network inspired by the principles of predictive coding from the neuroscience literature.
  6. A bit about why Deep Learning is a big deal. You can see Deep Learning applied in more and more domains, with a growing impact on our lives. If you look at the breadth of AI applied within Amazon alone, you can see DL on the retail website in personalization and recommendations, you can see it optimizing Amazon’s logistics, you have probably noticed the boom in voice-enabled personal assistants, and you may have heard that Amazon drones also rely on deep learning, just as other autonomous vehicle tech relies on it. And of course the list goes on.
  7. OK, so hopefully by now you are convinced that Deep Learning is awesome, and the next thing you want to do is use it in your production system. So, how do you actually use a deep learning model in your production environment? Let’s start with the outcome we’re trying to achieve. In fact, it is pretty straightforward, and not very different from deploying any other service. We have a trained model that we want to use for inference. We have a bunch of clients: mobile, desktop, IoT, cloud, or any combination of those. And we want a server of sorts, hosting the trained model and exposing an inference API which, when called, runs a feed-forward pass through the network, doing the deep learning “magic” Naveen explained earlier. That’s a very simple schema of a model serving setup.
  8. As we saw in the previous slide, in many ways serving deep learning models is similar to other, more traditional, serving frameworks out there, such as Apache Tomcat. And indeed in many ways, Model Serving is undifferentiated heavy lifting (UHL). That is a term we use and focus on at AWS a lot. It covers all of the work that is necessary to get the job done but that does not differentiate the business or help it win against the competition. Setting up servers, networks, etc. is all UHL. Let’s quickly go over the main concerns a Model Serving system needs to address:
     - Performance: providing a scalable architecture that is able to meet target TPS, makes efficient use of the available compute resources, and strikes the right balance between throughput and latency. This is especially important for Deep Learning, since the computational load of running a single inference is typically significant. As a reference, a model such as ResNet-152 requires billions of FLOPs for a single forward pass.
     - Availability: to keep your application working properly all the time, you want to minimize downtime and avoid going offline when load is high or when you are busy deploying a new model.
     - Networking: making your model consumable means you need to expose a network endpoint that clients can call to get predictions. This endpoint needs to support standard interfaces such as HTTP, error codes, security and more.
     - Monitoring: having any service in production means you need the ability to look into your operational metrics in near-real time: things like resource utilization on the host, inference latencies, requests and errors.
     - Model Decoupling: when you are serving models, you want to be able to use trained models without knowing anything about their inner workings. The model may be identifying cats in images, or doing sentiment analysis. No change should be needed on the server beyond deploying a different model.
     - Cross Framework: there are many different Neural Network frameworks: MXNet, TensorFlow, PyTorch, Caffe, and more. “Same Same, But Different”: all similar, but different in style and implementation details. We want a model server that just works, regardless of the framework used to build and train the model.
     - Cross Platform: similar to how there are many frameworks, there are also many platforms you can run your server on, from the OS (Linux, Windows) to the actual compute processor, which can be a CPU, a GPU or a TPU.
     And beyond all of that, one important meta-concern is Ease of Use: all of the concerns just mentioned need to be addressed in a way that is easy to use, quick to learn, and just works!
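
The Model Decoupling concern from this note can be illustrated with a small sketch. Everything here is hypothetical: `ModelRegistry` and the two toy "models" are invented names for illustration and are not part of any serving system discussed in the talk. The point is only that the server side talks to a generic interface, so swapping a cat detector for a sentiment model requires no server change.

```python
# Sketch of model decoupling: the server only knows a generic interface
# (a callable that maps a request payload to a prediction).
# ModelRegistry and both toy models are hypothetical, for illustration only.

class ModelRegistry:
    def __init__(self):
        self._models = {}

    def deploy(self, name, predict_fn):
        # Deploying a model is just registering a callable under a name;
        # the serving code itself does not change.
        self._models[name] = predict_fn

    def predict(self, name, payload):
        if name not in self._models:
            raise KeyError(f"no model deployed under {name!r}")
        return self._models[name](payload)

def toy_cat_detector(payload):
    # Stand-in for an image classifier; it inspects a text description instead.
    return {"is_cat": "cat" in payload.lower()}

def toy_sentiment(payload):
    # Stand-in for a sentiment model based on a tiny word list.
    positive = {"good", "great", "awesome"}
    words = set(payload.lower().split())
    return {"sentiment": "positive" if words & positive else "neutral"}

registry = ModelRegistry()
registry.deploy("cats", toy_cat_detector)
registry.deploy("sentiment", toy_sentiment)
```

Calling `registry.predict("cats", ...)` and `registry.predict("sentiment", ...)` goes through exactly the same server-side code path; only the deployed callable differs.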
  9. Are there systems that handle that for us? The answer is: yes! Deep Learning serving is pretty nascent, but there are a few systems. Let’s go over a few:
     - TF Serving was open sourced in Feb 2016, and went 1.0 in Aug 2017. It is designed to serve TF models over gRPC, and is used extensively within Google.
     - Clipper is an ongoing project by RISELab at UC Berkeley. Open sourced in 2017, currently at v0.2. It is a machine learning serving system with support for various backend engines, including Caffe, TF and recently also MXNet.
     - MXNet Model Server, or MMS for short, is actively developed by my team and was open sourced in Dec 2017. It is built on top of Apache MXNet, which is AWS’s DL framework of choice. MMS is almost at v0.2, in active development, and in this talk we will dive deeper into how it’s designed and some of the exciting engineering challenges we have in front of us as we keep developing the system.
     Image source: TF - https://commons.wikimedia.org/wiki/File:Tensorflow_logo.svg Clipper - https://github.com/ucbrise/clipper/blob/develop/images/clipper-logo.png (Apache 2.0 license)
  10. Now that you have seen MMS in action on a simple use case, I’d like to dive into some technical details on how MMS is engineered and used. I'll start with the Model Archive. Now let's talk about MMS's network interface. Let's see how MMS uses containers. Metrics. And lastly, I'd like to chat about how we're leveraging ONNX to achieve cross-platform support.
  11. To decouple the actual model from the serving framework, we designed the “Model Archive”. A Model Archive is a file that encapsulates all of the model-specific logic. It is the one-and-only resource MMS needs in order to set up serving for the model. In many ways, it is similar to Java’s JAR file, and indeed we took a similar implementation approach. Let’s take a look at what is needed to generate a model archive: a trained neural network, and a signature file defining input and output types and shapes, which tells MMS what endpoints to set up and how to transform the inputs and outputs. Then there’s the option to include custom code, which allows users to add feature extraction logic, or any other init/pre/post processing logic they may want to build into the model. Additionally, users can package whatever other files their model will need at runtime. Class labels are an example use case for aux files. Users use the MMS export CLI to package up all of these assets into a Model Archive package, which is then used by MMS to initialize and serve requests as we’ve seen earlier. This decoupling enables a clean separation of responsibilities between model creation and model serving. 1. The ML Engineer or Data Scientist builds and trains the model, writes feature extraction code, and then packages it all up into the archive. 2. The Software Engineer or DevOps Engineer sets up MMS on a prod cluster, and configures MMS to point to the archive, either on the local FS or at a remote URL. Let’s quickly jump to the console to see how this looks. (DEMO) Show a pre-prepared folder with model, signature, code and aux files. Open the signature and show. Open the code and show. Show how the export utility is used.
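
The packaging idea from this note can be sketched as follows. This is not the MMS export CLI: it is a stand-in that bundles the four kinds of assets into a single zip file, in the spirit of the JAR comparison. The file names, the signature field names and the shapes are all illustrative assumptions.

```python
import json
import os
import tempfile
import zipfile

# Hypothetical signature: field names and shapes are illustrative assumptions.
SIGNATURE = {
    "inputs": [{"data_name": "data", "data_shape": [1, 3, 224, 224]}],
    "input_type": "image/jpeg",
    "outputs": [{"data_name": "softmax", "data_shape": [1, 1000]}],
    "output_type": "application/json",
}

def build_archive(archive_path, assets):
    # Bundle every asset (file name -> bytes) into one zip file, JAR-style.
    with zipfile.ZipFile(archive_path, "w") as archive:
        for name, content in assets.items():
            archive.writestr(name, content)
    return archive_path

def read_signature(archive_path):
    # A server would read the signature first to learn the input/output contract.
    with zipfile.ZipFile(archive_path) as archive:
        return json.loads(archive.read("signature.json"))

# Placeholder assets standing in for the four ingredients named in the note.
assets = {
    "signature.json": json.dumps(SIGNATURE).encode("utf-8"),  # model signature
    "model-0000.params": b"\x00" * 16,                        # trained network (dummy bytes)
    "custom_service.py": b"# pre/post-processing hooks\n",    # custom code
    "synset.txt": b"cat\ndog\n",                              # auxiliary asset: class labels
}

workdir = tempfile.mkdtemp()
archive_path = build_archive(os.path.join(workdir, "example.model"), assets)
loaded = read_signature(archive_path)
```

Because everything the server needs travels in one file, the person who trains the model and the person who operates the cluster only have to agree on the archive.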
  12. One of our major design decisions when planning MMS was to focus on ease of use, while not introducing any one-way doors that would prevent improvements in the future. With that in mind, we decided to:
     - Expose REST-like endpoints over HTTP, arguably the easiest kind of endpoint to integrate with. This is quite different from TFS's approach, which supports only gRPC for performance reasons. All of these endpoints are automatically generated based on the model archive's signature.json.
     - Make JSON the default encoding format for endpoints, to make it easy for clients to integrate with.
     - Provide out-of-the-box support for binary inputs such as JPEG. With this support, clients can include a JPEG image as part of the request payload, and MMS will automatically translate it into an input tensor and resize it so that it fits the model’s expected input tensor.
     - Support the OpenAPI specification. This enables hooking up tooling to automate tasks, such as auto-generating client code across many popular programming languages.
     Let’s see how this looks. Demo 3: curl the api-description endpoint and go over the response.
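
As a rough sketch of what calling such an endpoint could look like from a client, the snippet below builds (but does not send) an HTTP request against the `<model-name>/predict` path named on the slide. The host, port, model name and the way the image bytes are placed in the body are assumptions made for illustration; a real server may expect a different encoding, such as multipart form data.

```python
import urllib.request

def build_predict_request(base_url, model_name, image_bytes):
    # The talk describes a REST-like endpoint of the form <model-name>/predict.
    url = f"{base_url.rstrip('/')}/{model_name}/predict"
    return urllib.request.Request(
        url,
        data=image_bytes,                        # binary input carried in the request payload
        headers={"Content-Type": "image/jpeg"},  # assumed content type for a JPEG input
        method="POST",
    )

# Assumed values: host, port and model name are placeholders, and the three
# bytes below merely stand in for a real JPEG file.
request = build_predict_request("http://localhost:8080/", "squeezenet", b"\xff\xd8\xff")

# To send it against a running server (not executed here):
# with urllib.request.urlopen(request) as response:
#     body = response.read()  # JSON by default, according to the talk
```

Keeping request construction separate from sending makes the client easy to test without a live server.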
  13. Anyone who has ever owned a service in production knows how critical it is to have a reliable and extensive set of operational metrics. You want them reported at a relatively short interval, say every minute or so, and you want operational data that lets the service owner know the important stuff, like errors, traffic, latencies, etc. We took care to design MMS with built-in Ops Metrics reporting, so MMS supports out of the box: (1) Requests (2) Latencies (3) Resources. We report all metrics across model and hostname dimensions, so users can set up their monitoring and alarming across an entire cluster, or across a specific model, etc. And MMS integrates directly with AWS CloudWatch, so users can use CW’s console and integrations to have full visibility and control over their prod setup.
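
To illustrate reporting metrics across model and hostname dimensions, here is a minimal in-memory sketch. `MetricsBuffer` is a hypothetical class, not MMS code, and the model and host names are placeholders; a real setup would flush each interval's summary to a log, a CSV file or a monitoring service.

```python
from collections import defaultdict

class MetricsBuffer:
    # Collects per-request latencies keyed by (model name, host name).
    def __init__(self):
        self._latencies_ms = defaultdict(list)

    def record(self, model_name, host_name, latency_ms):
        self._latencies_ms[(model_name, host_name)].append(latency_ms)

    def flush(self):
        # Summarise one reporting interval per dimension pair, then reset.
        report = {}
        for key, values in self._latencies_ms.items():
            report[key] = {
                "requests": len(values),
                "avg_latency_ms": sum(values) / len(values),
                "max_latency_ms": max(values),
            }
        self._latencies_ms.clear()
        return report

# Placeholder traffic for two hypothetical models on two hypothetical hosts.
buffer = MetricsBuffer()
buffer.record("resnet", "host-a", 40.0)
buffer.record("resnet", "host-a", 60.0)
buffer.record("squeezenet", "host-b", 10.0)
report = buffer.flush()
```

Keying every data point by both dimensions is what lets the same stream be aggregated per model, per host, or across the whole cluster.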
  14. As I demoed, you can easily run MMS on your Mac. While this will work well for prototyping or testing, it is not a scalable setup for high-load production traffic. For production deployments we recommend using containers: they are lightweight, provide isolation and have wide platform support. The MMS repo includes Docker images that are pre-configured with the required software components and configuration for optimal execution. Users can use these images with their container orchestration tool of choice, and there are plenty of good options out there such as ECS, Docker and Kubernetes. Users can build the pre-configured image MMS provides, push it to a registry, and then orchestrate it with a platform such as ECS. ECS manages the cluster, including scaling, load balancing, networking, instrumenting and more. The MMS image itself includes an NGINX reverse proxy, integrated with MMS. To learn more about MMS container setup, visit the GitHub repo, where we have details and instructions. We’re also planning to publish a blog post about this specific use case soon!
  15. One of the Model Serving concerns we talked about earlier was Cross Framework, and indeed there are many awesome DL frameworks to choose from. In an ideal world, you will build your DL model with whatever framework you fancy, and then just deploy it, and it will just work. Think about how the JVM works: as long as the language compiles to bytecode, it will run regardless of the language you used! A good model server will enable the same flexibility. Another concern we talked about was cross-platform support. Intel, Nvidia, Apple’s CoreML: many different and important runtime platforms. The problem here, as you may have observed, is that to support all of them in a naïve way we need on the order of N^2 translations/conversions, which is pretty hard to build and maintain. This is where ONNX comes into play. ONNX is an initiative driven by AWS, Facebook and Microsoft, with the goal of defining an open neural network and operator definition. You can check it out on onnx.ai. Support includes quite a few frameworks and platforms, and the list is gradually expanding! Model Server will introduce ONNX support in the coming release that is going out next week. With ONNX support, users will be able to package up models built with frameworks that support ONNX, such as PyTorch, Caffe2 and CNTK. In the future, we may also leverage ONNX to help MMS run on more platforms as they add support for ONNX.
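
The N^2 argument in this note comes down to simple counting, sketched below. The split of the slide's names into five frameworks and four platforms is my reading of the flattened slide text and should be treated as an assumption.

```python
# Names taken from the slide; which ones count as frameworks versus platforms
# is an assumption based on the slide's "Many Frameworks / Many Platforms" labels.
FRAMEWORKS = ["MXNet", "Caffe2", "PyTorch", "TF", "CNTK"]
PLATFORMS = ["CoreML", "TensorRT", "NGraph", "SNPE"]

def direct_converters(frameworks, platforms):
    # Naive approach: one dedicated converter per (framework, platform) pair.
    return len(frameworks) * len(platforms)

def converters_via_common_ir(frameworks, platforms):
    # Common intermediate representation: one exporter per framework
    # plus one importer per platform.
    return len(frameworks) + len(platforms)

pairwise = direct_converters(FRAMEWORKS, PLATFORMS)
via_ir = converters_via_common_ir(FRAMEWORKS, PLATFORMS)
```

With these lists the pairwise approach needs 5 x 4 = 20 converters while the common-IR approach needs 5 + 4 = 9, and the gap widens as either list grows.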
  16. OK, without further ado, let’s see how MMS looks in practice. We’ll start with the basic use case: installing MMS, loading a model, serving it, and doing prediction. Ready? Demo: install MMS; show the model zoo and copy a model link; download an image; use cURL to do inference; examine the results.
  17. Thank you for listening, I hope you learned about deep learning systems and serving, and had a good time. MXNet and Model Server are open source - feel free to try it out and file issues. We’re also hiring aggressively, so if you have talented friends that want to be part of the DL revolution - feel free to refer and talk to us! Thank you!