© 2019, Amazon Web Services, Inc. or its Affiliates.
My Nguyen – Solutions Architect – Amazon Web Services Vietnam
Grokking Techtalk #40: AWS’s philosophy on designing MLOps platform
Dec 2020
Agenda
• What is MLOps?
  • DevOps vs MLOps
  • DevOps practices inheritance
  • Machine learning development lifecycle
• Unique driving factors to MLOps
  • Personas
  • Unique challenges faced by ML workloads
• MLOps practices on Amazon SageMaker
  • Complete separation of steps (and their environments)
  • Versioning & tracking
  • Pipeline automation
  • Continuous improvement
• Demo
• QnA
What is MLOps?
Operationalizing machine learning workloads
DevOps vs MLOps
Notes: Technology is just a piece of the overall picture
DevOps practices inheritance
• Communication & collaboration
• Continuous integration
• Continuous delivery/deployment
• Microservices design
• Infrastructure-as-code & configuration-as-code
• Continuous monitoring & logging
Machine learning development lifecycle
Unique driving factors to MLOps
Personas
• Business stakeholder
• Data scientist
• Domain expert
• Data engineer
• Security engineer
• Machine learning/DevOps engineer
• Software engineer
All with different skillsets & priorities
Unique challenges
• Data:
• The need to utilize production data in development activities
• Dependencies on data pipelines
• Longer experiment lifecycles
• Output of model artifacts:
• Independent lifecycles between model and integrated applications/systems
• Monitoring & tracking of experiments and models
• Unique metrics for performance evaluation
MLOps practices on Amazon SageMaker
Complete separation of steps
• Data processing
• Explore & Build
• Train & Validate
• Deploy
• Monitor
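As a rough illustration of what separating the steps can look like in code, the sketch below keeps data processing, training, and validation in independent functions that only exchange files. The toy linear model, the file names, and the `run_all` helper are assumptions made for the example, not part of the talk.

```python
import json
import tempfile
from pathlib import Path


def process_data(raw_rows, work_dir):
    """Data processing: drop incomplete rows and persist the cleaned set."""
    cleaned = [r for r in raw_rows if r["x"] is not None and r["y"] is not None]
    out = Path(work_dir) / "processed.json"
    out.write_text(json.dumps(cleaned))
    return out


def train(processed_path, work_dir):
    """Train: fit y = w * x by least squares and persist the model artifact."""
    rows = json.loads(Path(processed_path).read_text())
    w = sum(r["x"] * r["y"] for r in rows) / sum(r["x"] ** 2 for r in rows)
    out = Path(work_dir) / "model.json"
    out.write_text(json.dumps({"w": w}))
    return out


def validate(model_path, processed_path):
    """Validate: mean absolute error of the stored model on the stored data."""
    w = json.loads(Path(model_path).read_text())["w"]
    rows = json.loads(Path(processed_path).read_text())
    return sum(abs(w * r["x"] - r["y"]) for r in rows) / len(rows)


def run_all(raw_rows):
    """Chain the steps; each one only sees file paths from the previous step."""
    with tempfile.TemporaryDirectory() as work_dir:
        processed = process_data(raw_rows, work_dir)
        model = train(processed, work_dir)
        error = validate(model, processed)
        weights = json.loads(model.read_text())
    return weights, error


if __name__ == "__main__":
    rows = [{"x": 1, "y": 2}, {"x": 2, "y": 4}, {"x": None, "y": 1}]
    print(run_all(rows))
```

Because every step reads and writes an explicit artifact, each one could in principle run in its own environment and be re-run or replaced without touching the others.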
Versioning & tracking of every step
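One possible way to make every step traceable is to fingerprint the inputs of a run (data, code, hyperparameters) and store the fingerprints next to the resulting metrics. The sketch below uses content hashes for this; the `fingerprint` and `record_run` helpers and the in-memory `registry` list are hypothetical names for the example, not a SageMaker API.

```python
import hashlib
import json


def fingerprint(obj):
    """Return a short, stable SHA-256 fingerprint of a JSON-serializable object."""
    canonical = json.dumps(obj, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]


def record_run(registry, data, code, hyperparameters, metrics):
    """Append one tracked run to the registry and return its entry."""
    entry = {
        "data_version": fingerprint(data),
        "code_version": fingerprint(code),
        "params_version": fingerprint(hyperparameters),
        "hyperparameters": hyperparameters,
        "metrics": metrics,
    }
    # The run ID is derived from the three versions, so identical inputs
    # always map to the same ID.
    entry["run_id"] = fingerprint(
        [entry["data_version"], entry["code_version"], entry["params_version"]]
    )
    registry.append(entry)
    return entry


if __name__ == "__main__":
    registry = []
    data = [[1, 2], [2, 4]]
    code = "def train(rows): ..."
    first = record_run(registry, data, code, {"lr": 0.1}, {"mae": 0.20})
    second = record_run(registry, data, code, {"lr": 0.01}, {"mae": 0.15})
    print(first["run_id"], second["run_id"])
```

With this kind of record, two runs that differ only in hyperparameters share the same data and code versions, which makes it easier to see what actually changed between experiments.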
Pipeline automation
• Metaflow
• Apache Airflow
• AWS Step Functions
• Kubeflow
• Flyte
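The orchestration tools named on the pipeline automation slide share one core idea: steps are declared with their dependencies, and the tool runs them in a valid order. The sketch below shows that idea with the standard library only (`graphlib`, Python 3.9+). It does not use the API of any of those tools, and the step names and the `run_pipeline` helper are assumptions for the example.

```python
from graphlib import TopologicalSorter


def run_pipeline(steps, dependencies):
    """Run each step after its dependencies, passing upstream results along.

    steps:        maps a step name to a function taking a dict of upstream results
    dependencies: maps a step name to the names of the steps it depends on
    """
    order = list(TopologicalSorter(dependencies).static_order())
    results = {}
    for name in order:
        upstream = {dep: results[dep] for dep in dependencies.get(name, ())}
        results[name] = steps[name](upstream)
    return order, results


if __name__ == "__main__":
    steps = {
        "process": lambda up: [1, 2, 3],
        "train": lambda up: sum(up["process"]) / len(up["process"]),
        "validate": lambda up: up["train"] > 0,
        "deploy": lambda up: "deployed" if up["validate"] else "skipped",
    }
    dependencies = {
        "process": [],
        "train": ["process"],
        "validate": ["train"],
        "deploy": ["validate"],
    }
    print(run_pipeline(steps, dependencies))
```

A real orchestrator adds what this sketch leaves out, such as retries, scheduling, per-step environments, and persisted state between runs.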
SageMaker workflow
The notebook: An entry-point / studio / IDE
[Diagram: Data Scientists; Notebook: Explore and Interact; SageMaker Container Runtime; Elastic Container Registry (ECR); Simple Storage Service (S3)]
SageMaker workflow
Prepare data and script; find or build container image(s)
[Diagram, new elements: Training Data; Custom Code; Training Image; Framework Code]
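To make the "prepare data and script; find or build container image(s)" step more concrete, the sketch below assembles the pieces (an ECR image URI, an S3 data location, an S3 output location) into a request shaped like the SageMaker `CreateTrainingJob` API. It only builds a dictionary and calls no AWS service. The helper name, account ID, bucket names, role, and instance settings are placeholders assumed for the example, and how the custom script reaches the container is left out.

```python
def build_training_job_request(job_name, image_uri, role_arn,
                               train_s3_uri, output_s3_uri,
                               instance_type="ml.m5.large"):
    """Assemble a CreateTrainingJob-style request from the prepared pieces."""
    return {
        "TrainingJobName": job_name,
        "AlgorithmSpecification": {
            "TrainingImage": image_uri,        # container image stored in ECR
            "TrainingInputMode": "File",
        },
        "RoleArn": role_arn,
        "InputDataConfig": [
            {
                "ChannelName": "training",
                "DataSource": {
                    "S3DataSource": {
                        "S3DataType": "S3Prefix",
                        "S3Uri": train_s3_uri,  # training data stored in S3
                        "S3DataDistributionType": "FullyReplicated",
                    }
                },
            }
        ],
        "OutputDataConfig": {"S3OutputPath": output_s3_uri},
        "ResourceConfig": {
            "InstanceType": instance_type,
            "InstanceCount": 1,
            "VolumeSizeInGB": 10,
        },
        "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
    }


if __name__ == "__main__":
    # All identifiers below are made-up placeholders.
    request = build_training_job_request(
        job_name="demo-training-job",
        image_uri="123456789012.dkr.ecr.ap-southeast-1.amazonaws.com/demo-train:latest",
        role_arn="arn:aws:iam::123456789012:role/DemoSageMakerRole",
        train_s3_uri="s3://demo-bucket/train/",
        output_s3_uri="s3://demo-bucket/output/",
    )
    print(request["AlgorithmSpecification"]["TrainingImage"])
```

With valid credentials and real resources, a dictionary of this shape could be passed to the boto3 SageMaker client as `create_training_job(**request)`; the SageMaker Python SDK wraps the same call behind its estimator classes.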
SageMaker workflow
Run a training job to create a model artifact
[Diagram, new elements: Training Job (Custom, Framework, Data); model.tar.gz]
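In a SageMaker training container, the training code writes its model files to `/opt/ml/model`, and the service archives that directory as `model.tar.gz` in the S3 output location. The sketch below imitates that packaging step locally with a temporary directory, so it runs without AWS. The toy model file and the helper names are assumptions for the example.

```python
import json
import tarfile
import tempfile
from pathlib import Path


def train_and_save(model_dir):
    """Toy 'training' step: write a model file into the model directory."""
    Path(model_dir).mkdir(parents=True, exist_ok=True)
    (Path(model_dir) / "model.json").write_text(json.dumps({"w": 2.0}))


def package_model(model_dir, archive_path):
    """Archive the contents of the model directory as a gzipped tarball."""
    with tarfile.open(str(archive_path), "w:gz") as tar:
        for path in sorted(Path(model_dir).iterdir()):
            # arcname keeps files at the archive root, without parent folders
            tar.add(str(path), arcname=path.name)
    return archive_path


def list_archive(archive_path):
    """Return the member names stored in the archive."""
    with tarfile.open(str(archive_path), "r:gz") as tar:
        return tar.getnames()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        model_dir = Path(tmp) / "model"      # stands in for /opt/ml/model
        archive = Path(tmp) / "model.tar.gz"
        train_and_save(model_dir)
        package_model(model_dir, archive)
        print(list_archive(archive))
```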
SageMaker workflow
Deploy the model to a real-time inference endpoint
[Diagram, new elements: Inference Endpoint (Custom, Framework, Model); Inference Image; Inference Requests]
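Several SageMaker framework serving containers let the custom code plug into the endpoint through handler functions named `model_fn`, `input_fn`, `predict_fn`, and `output_fn`. The sketch below follows that naming with a toy model and runs the request path locally. Exact signatures vary by framework and version, and the JSON payload shape (`instances`, `predictions`) and the `handle_request` helper are assumptions for the example.

```python
import json
import tempfile
from pathlib import Path


def model_fn(model_dir):
    """Load the model from the directory where model.tar.gz was extracted."""
    return json.loads((Path(model_dir) / "model.json").read_text())


def input_fn(request_body, content_type):
    """Deserialize the request payload into model input."""
    if content_type != "application/json":
        raise ValueError(f"Unsupported content type: {content_type}")
    return json.loads(request_body)["instances"]


def predict_fn(instances, model):
    """Apply the toy linear model y = w * x to each instance."""
    return [model["w"] * x for x in instances]


def output_fn(predictions, accept):
    """Serialize predictions into the response payload."""
    if accept != "application/json":
        raise ValueError(f"Unsupported accept type: {accept}")
    return json.dumps({"predictions": predictions})


def handle_request(model, body, content_type="application/json",
                   accept="application/json"):
    """Chain the handlers the way a serving container would per request."""
    return output_fn(predict_fn(input_fn(body, content_type), model), accept)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as model_dir:
        (Path(model_dir) / "model.json").write_text(json.dumps({"w": 2.0}))
        model = model_fn(model_dir)  # loaded once, reused for every request
        print(handle_request(model, json.dumps({"instances": [1, 2.5]})))
```

Loading the model once and reusing it across requests mirrors how a long-running endpoint behaves, in contrast to a training job that starts, finishes, and exits.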
SageMaker workflow
(…Or run a batch transform job)
[Diagram, new elements: Transform Job (Custom, Framework, Model); Input Data; Results]
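A batch transform job differs from an endpoint in that it reads a whole set of input files, writes one result file per input, and then stops. SageMaker Batch Transform names each result by appending `.out` to the input file name. The sketch below imitates that behavior on local directories; the JSON Lines format, the toy predictor, and the `batch_transform` helper are assumptions for the example.

```python
import json
import tempfile
from pathlib import Path


def batch_transform(input_dir, output_dir, predict):
    """Score every *.jsonl file in input_dir and write <name>.out files."""
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for src in sorted(Path(input_dir).glob("*.jsonl")):
        lines = [ln for ln in src.read_text().splitlines() if ln.strip()]
        results = [json.dumps({"prediction": predict(json.loads(ln))})
                   for ln in lines]
        dst = out / (src.name + ".out")
        dst.write_text("\n".join(results) + "\n")
        written.append(dst.name)
    return written


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        input_dir = Path(tmp) / "input"
        input_dir.mkdir()
        (input_dir / "part-0.jsonl").write_text('{"x": 1}\n{"x": 3}\n')
        names = batch_transform(input_dir, Path(tmp) / "output",
                                predict=lambda record: 2 * record["x"])
        print(names)
```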
SageMaker workflow
[Diagram: the full workflow, combining Notebook: Explore and Interact; Training Job; Endpoint / Transformer; Training Data; Custom Code; Framework Code; Training Image; Inference Image; model.tar.gz; Inference Requests]
Continuous improvement
• SageMaker Hosting Services
• SageMaker Batch Transform
• SageMaker Notebooks
• SageMaker Autopilot
• SageMaker Experiments
• SageMaker Ground Truth
• SageMaker Processing
• SageMaker Model Monitor
• Amazon Augmented AI
• SageMaker Training
• SageMaker Debugger
• SageMaker Hyperparameter Tuning
SageMaker Studio, the First Fully Integrated Development Environment For Machine Learning
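Continuous improvement depends on a feedback loop: production inputs are compared with what the model saw during training, and a significant shift is a signal to investigate or retrain. The sketch below shows one very simple check of that kind, a mean shift measured in baseline standard deviations. It is not how SageMaker Model Monitor computes its statistics; the function names and the threshold of 3.0 are assumptions for the example.

```python
from statistics import mean, stdev


def baseline_stats(values):
    """Summarize a training-time feature as mean and sample standard deviation."""
    return {"mean": mean(values), "stdev": stdev(values)}


def drifted(baseline, live_values, threshold=3.0):
    """Return True if the live mean is far from the baseline mean.

    'Far' means more than `threshold` baseline standard deviations.
    """
    live_mean = mean(live_values)
    if baseline["stdev"] == 0:
        return live_mean != baseline["mean"]
    z = abs(live_mean - baseline["mean"]) / baseline["stdev"]
    return z > threshold


if __name__ == "__main__":
    baseline = baseline_stats([10, 11, 9, 10, 10])
    print(drifted(baseline, [10, 10.5, 9.5]))
    print(drifted(baseline, [20, 21, 19]))
```

In a pipeline, a `True` result could raise an alert or trigger the training steps again with fresh data, which closes the loop between the Monitor step and the Train step.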
Demo
Transformation from local notebook to SageMaker workflow
The bigger picture
QnA
References:
https://d1.awsstatic.com/whitepapers/architecture/wellarchitected-Machine-Learning-Lens.pdf
https://github.com/aws-samples/aws-stepfunctions-byoc-mlops-using-data-science-sdk
https://github.com/apac-ml-tfc/sagemaker-workshop-101
Thank you!
My Nguyen - https://www.linkedin.com/in/mynguyen6512/

Mais conteúdo relacionado

Mais procurados

Amazon SageMaker 모델 빌딩 파이프라인 소개::이유동, AI/ML 스페셜리스트 솔루션즈 아키텍트, AWS::AWS AIML 스...
Amazon SageMaker 모델 빌딩 파이프라인 소개::이유동, AI/ML 스페셜리스트 솔루션즈 아키텍트, AWS::AWS AIML 스...Amazon SageMaker 모델 빌딩 파이프라인 소개::이유동, AI/ML 스페셜리스트 솔루션즈 아키텍트, AWS::AWS AIML 스...
Amazon SageMaker 모델 빌딩 파이프라인 소개::이유동, AI/ML 스페셜리스트 솔루션즈 아키텍트, AWS::AWS AIML 스...Amazon Web Services Korea
 
Serving BERT Models in Production with TorchServe
Serving BERT Models in Production with TorchServeServing BERT Models in Production with TorchServe
Serving BERT Models in Production with TorchServeNidhin Pattaniyil
 
MLops workshop AWS
MLops workshop AWSMLops workshop AWS
MLops workshop AWSGili Nachum
 
Seamless MLOps with Seldon and MLflow
Seamless MLOps with Seldon and MLflowSeamless MLOps with Seldon and MLflow
Seamless MLOps with Seldon and MLflowDatabricks
 
Ml ops past_present_future
Ml ops past_present_futureMl ops past_present_future
Ml ops past_present_futureNisha Talagala
 
MLOps by Sasha Rosenbaum
MLOps by Sasha RosenbaumMLOps by Sasha Rosenbaum
MLOps by Sasha RosenbaumSasha Rosenbaum
 
MLOps Using MLflow
MLOps Using MLflowMLOps Using MLflow
MLOps Using MLflowDatabricks
 
Ml ops intro session
Ml ops   intro sessionMl ops   intro session
Ml ops intro sessionAvinash Patil
 
Backing Up Amazon EC2 with Amazon EBS Snapshots (CMP301-R1) - AWS re:Invent 2018
Backing Up Amazon EC2 with Amazon EBS Snapshots (CMP301-R1) - AWS re:Invent 2018Backing Up Amazon EC2 with Amazon EBS Snapshots (CMP301-R1) - AWS re:Invent 2018
Backing Up Amazon EC2 with Amazon EBS Snapshots (CMP301-R1) - AWS re:Invent 2018Amazon Web Services
 
Introduction to the Well-Architected Framework and Tool - SVC208 - Anaheim AW...
Introduction to the Well-Architected Framework and Tool - SVC208 - Anaheim AW...Introduction to the Well-Architected Framework and Tool - SVC208 - Anaheim AW...
Introduction to the Well-Architected Framework and Tool - SVC208 - Anaheim AW...Amazon Web Services
 
MLOps - The Assembly Line of ML
MLOps - The Assembly Line of MLMLOps - The Assembly Line of ML
MLOps - The Assembly Line of MLJordan Birdsell
 
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...DataWorks Summit
 
Microservice Architecture
Microservice ArchitectureMicroservice Architecture
Microservice ArchitectureNguyen Tung
 
From Data Science to MLOps
From Data Science to MLOpsFrom Data Science to MLOps
From Data Science to MLOpsCarl W. Handlin
 
Deep Dive: AWS CloudHSM (Classic)
Deep Dive: AWS CloudHSM (Classic)Deep Dive: AWS CloudHSM (Classic)
Deep Dive: AWS CloudHSM (Classic)Amazon Web Services
 
Vertex AI: Pipelines for your MLOps workflows
Vertex AI: Pipelines for your MLOps workflowsVertex AI: Pipelines for your MLOps workflows
Vertex AI: Pipelines for your MLOps workflowsMárton Kodok
 
Zipline: Airbnb’s Machine Learning Data Management Platform with Nikhil Simha...
Zipline: Airbnb’s Machine Learning Data Management Platform with Nikhil Simha...Zipline: Airbnb’s Machine Learning Data Management Platform with Nikhil Simha...
Zipline: Airbnb’s Machine Learning Data Management Platform with Nikhil Simha...Databricks
 
Training And Serving ML Model Using Kubeflow by Jayesh Sharma
Training And Serving ML Model Using Kubeflow by Jayesh SharmaTraining And Serving ML Model Using Kubeflow by Jayesh Sharma
Training And Serving ML Model Using Kubeflow by Jayesh SharmaCodeOps Technologies LLP
 
MLOps and Reproducible ML on AWS with Kubeflow and SageMaker
MLOps and Reproducible ML on AWS with Kubeflow and SageMakerMLOps and Reproducible ML on AWS with Kubeflow and SageMaker
MLOps and Reproducible ML on AWS with Kubeflow and SageMakerProvectus
 

Mais procurados (20)

Amazon SageMaker 모델 빌딩 파이프라인 소개::이유동, AI/ML 스페셜리스트 솔루션즈 아키텍트, AWS::AWS AIML 스...
Amazon SageMaker 모델 빌딩 파이프라인 소개::이유동, AI/ML 스페셜리스트 솔루션즈 아키텍트, AWS::AWS AIML 스...Amazon SageMaker 모델 빌딩 파이프라인 소개::이유동, AI/ML 스페셜리스트 솔루션즈 아키텍트, AWS::AWS AIML 스...
Amazon SageMaker 모델 빌딩 파이프라인 소개::이유동, AI/ML 스페셜리스트 솔루션즈 아키텍트, AWS::AWS AIML 스...
 
Serving BERT Models in Production with TorchServe
Serving BERT Models in Production with TorchServeServing BERT Models in Production with TorchServe
Serving BERT Models in Production with TorchServe
 
MLops workshop AWS
MLops workshop AWSMLops workshop AWS
MLops workshop AWS
 
MLOps in action
MLOps in actionMLOps in action
MLOps in action
 
Seamless MLOps with Seldon and MLflow
Seamless MLOps with Seldon and MLflowSeamless MLOps with Seldon and MLflow
Seamless MLOps with Seldon and MLflow
 
Ml ops past_present_future
Ml ops past_present_futureMl ops past_present_future
Ml ops past_present_future
 
MLOps by Sasha Rosenbaum
MLOps by Sasha RosenbaumMLOps by Sasha Rosenbaum
MLOps by Sasha Rosenbaum
 
MLOps Using MLflow
MLOps Using MLflowMLOps Using MLflow
MLOps Using MLflow
 
Ml ops intro session
Ml ops   intro sessionMl ops   intro session
Ml ops intro session
 
Backing Up Amazon EC2 with Amazon EBS Snapshots (CMP301-R1) - AWS re:Invent 2018
Backing Up Amazon EC2 with Amazon EBS Snapshots (CMP301-R1) - AWS re:Invent 2018Backing Up Amazon EC2 with Amazon EBS Snapshots (CMP301-R1) - AWS re:Invent 2018
Backing Up Amazon EC2 with Amazon EBS Snapshots (CMP301-R1) - AWS re:Invent 2018
 
Introduction to the Well-Architected Framework and Tool - SVC208 - Anaheim AW...
Introduction to the Well-Architected Framework and Tool - SVC208 - Anaheim AW...Introduction to the Well-Architected Framework and Tool - SVC208 - Anaheim AW...
Introduction to the Well-Architected Framework and Tool - SVC208 - Anaheim AW...
 
MLOps - The Assembly Line of ML
MLOps - The Assembly Line of MLMLOps - The Assembly Line of ML
MLOps - The Assembly Line of ML
 
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
 
Microservice Architecture
Microservice ArchitectureMicroservice Architecture
Microservice Architecture
 
From Data Science to MLOps
From Data Science to MLOpsFrom Data Science to MLOps
From Data Science to MLOps
 
Deep Dive: AWS CloudHSM (Classic)
Deep Dive: AWS CloudHSM (Classic)Deep Dive: AWS CloudHSM (Classic)
Deep Dive: AWS CloudHSM (Classic)
 
Vertex AI: Pipelines for your MLOps workflows
Vertex AI: Pipelines for your MLOps workflowsVertex AI: Pipelines for your MLOps workflows
Vertex AI: Pipelines for your MLOps workflows
 
Zipline: Airbnb’s Machine Learning Data Management Platform with Nikhil Simha...
Zipline: Airbnb’s Machine Learning Data Management Platform with Nikhil Simha...Zipline: Airbnb’s Machine Learning Data Management Platform with Nikhil Simha...
Zipline: Airbnb’s Machine Learning Data Management Platform with Nikhil Simha...
 
Training And Serving ML Model Using Kubeflow by Jayesh Sharma
Training And Serving ML Model Using Kubeflow by Jayesh SharmaTraining And Serving ML Model Using Kubeflow by Jayesh Sharma
Training And Serving ML Model Using Kubeflow by Jayesh Sharma
 
MLOps and Reproducible ML on AWS with Kubeflow and SageMaker
MLOps and Reproducible ML on AWS with Kubeflow and SageMakerMLOps and Reproducible ML on AWS with Kubeflow and SageMaker
MLOps and Reproducible ML on AWS with Kubeflow and SageMaker
 

Semelhante a Grokking Techtalk #40: AWS’s philosophy on designing MLOps platform

MLOPS By Amazon offered and free download
MLOPS By Amazon offered and free downloadMLOPS By Amazon offered and free download
MLOPS By Amazon offered and free downloadpouyan533
 
AWS DevDay Cologne - CI/CD for modern applications
AWS DevDay Cologne - CI/CD for modern applicationsAWS DevDay Cologne - CI/CD for modern applications
AWS DevDay Cologne - CI/CD for modern applicationsCobus Bernard
 
Become a Machine Learning Developer with AWS Services
Become a Machine Learning Developer with AWS ServicesBecome a Machine Learning Developer with AWS Services
Become a Machine Learning Developer with AWS ServicesAmazon Web Services
 
Become a Machine Learning developer with AWS (Avril 2019)
Become a Machine Learning developer with AWS (Avril 2019)Become a Machine Learning developer with AWS (Avril 2019)
Become a Machine Learning developer with AWS (Avril 2019)Julien SIMON
 
Amazon SageMaker workshop
Amazon SageMaker workshopAmazon SageMaker workshop
Amazon SageMaker workshopJulien SIMON
 
WhereML a Serverless ML Powered Location Guessing Twitter Bot
WhereML a Serverless ML Powered Location Guessing Twitter BotWhereML a Serverless ML Powered Location Guessing Twitter Bot
WhereML a Serverless ML Powered Location Guessing Twitter BotRandall Hunt
 
Train & Deploy ML Models with Amazon Sagemaker: Collision 2018
Train & Deploy ML Models with Amazon Sagemaker: Collision 2018Train & Deploy ML Models with Amazon Sagemaker: Collision 2018
Train & Deploy ML Models with Amazon Sagemaker: Collision 2018Amazon Web Services
 
Integrate Machine Learning into Your Spring Application in Less than an Hour
Integrate Machine Learning into Your Spring Application in Less than an HourIntegrate Machine Learning into Your Spring Application in Less than an Hour
Integrate Machine Learning into Your Spring Application in Less than an HourVMware Tanzu
 
Modern Applications Development on AWS
Modern Applications Development on AWSModern Applications Development on AWS
Modern Applications Development on AWSBoaz Ziniman
 
Supercharge your Machine Learning Solutions with Amazon SageMaker
Supercharge your Machine Learning Solutions with Amazon SageMakerSupercharge your Machine Learning Solutions with Amazon SageMaker
Supercharge your Machine Learning Solutions with Amazon SageMakerAmazon Web Services
 
Build Modern Applications that Align with Twelve-Factor Methods (API303) - AW...
Build Modern Applications that Align with Twelve-Factor Methods (API303) - AW...Build Modern Applications that Align with Twelve-Factor Methods (API303) - AW...
Build Modern Applications that Align with Twelve-Factor Methods (API303) - AW...Amazon Web Services
 
Build, train and deploy ML models with SageMaker (October 2019)
Build, train and deploy ML models with SageMaker (October 2019)Build, train and deploy ML models with SageMaker (October 2019)
Build, train and deploy ML models with SageMaker (October 2019)Julien SIMON
 
AWS Toronto Summit 2019 - AIM302 - Build, train, and deploy ML models with Am...
AWS Toronto Summit 2019 - AIM302 - Build, train, and deploy ML models with Am...AWS Toronto Summit 2019 - AIM302 - Build, train, and deploy ML models with Am...
AWS Toronto Summit 2019 - AIM302 - Build, train, and deploy ML models with Am...Jonathan Dion
 
Serverless AI with Scikit-Learn (GPSWS405) - AWS re:Invent 2018
Serverless AI with Scikit-Learn (GPSWS405) - AWS re:Invent 2018Serverless AI with Scikit-Learn (GPSWS405) - AWS re:Invent 2018
Serverless AI with Scikit-Learn (GPSWS405) - AWS re:Invent 2018Amazon Web Services
 
CICDforModernApplications-Oslo.pdf
CICDforModernApplications-Oslo.pdfCICDforModernApplications-Oslo.pdf
CICDforModernApplications-Oslo.pdfAmazon Web Services
 
Mainframe Modernization with AWS: Patterns and Best Practices
Mainframe Modernization with AWS: Patterns and Best PracticesMainframe Modernization with AWS: Patterns and Best Practices
Mainframe Modernization with AWS: Patterns and Best PracticesAmazon Web Services
 
Driving Innovation with Serverless Applications (GPSBUS212) - AWS re:Invent 2018
Driving Innovation with Serverless Applications (GPSBUS212) - AWS re:Invent 2018Driving Innovation with Serverless Applications (GPSBUS212) - AWS re:Invent 2018
Driving Innovation with Serverless Applications (GPSBUS212) - AWS re:Invent 2018Amazon Web Services
 
MLOps with serverless architectures (October 2018)
MLOps with serverless architectures (October 2018)MLOps with serverless architectures (October 2018)
MLOps with serverless architectures (October 2018)Julien SIMON
 
[AWS Innovate 온라인 컨퍼런스] Kubernetes와 SageMaker를 활용하여 Machine Learning 워크로드 관리하...
[AWS Innovate 온라인 컨퍼런스] Kubernetes와 SageMaker를 활용하여 Machine Learning 워크로드 관리하...[AWS Innovate 온라인 컨퍼런스] Kubernetes와 SageMaker를 활용하여 Machine Learning 워크로드 관리하...
[AWS Innovate 온라인 컨퍼런스] Kubernetes와 SageMaker를 활용하여 Machine Learning 워크로드 관리하...Amazon Web Services Korea
 

Semelhante a Grokking Techtalk #40: AWS’s philosophy on designing MLOps platform (20)

MLOPS By Amazon offered and free download
MLOPS By Amazon offered and free downloadMLOPS By Amazon offered and free download
MLOPS By Amazon offered and free download
 
AWS DevDay Cologne - CI/CD for modern applications
AWS DevDay Cologne - CI/CD for modern applicationsAWS DevDay Cologne - CI/CD for modern applications
AWS DevDay Cologne - CI/CD for modern applications
 
Become a Machine Learning Developer with AWS Services
Become a Machine Learning Developer with AWS ServicesBecome a Machine Learning Developer with AWS Services
Become a Machine Learning Developer with AWS Services
 
Become a Machine Learning developer with AWS (Avril 2019)
Become a Machine Learning developer with AWS (Avril 2019)Become a Machine Learning developer with AWS (Avril 2019)
Become a Machine Learning developer with AWS (Avril 2019)
 
Amazon SageMaker workshop
Amazon SageMaker workshopAmazon SageMaker workshop
Amazon SageMaker workshop
 
WhereML a Serverless ML Powered Location Guessing Twitter Bot
WhereML a Serverless ML Powered Location Guessing Twitter BotWhereML a Serverless ML Powered Location Guessing Twitter Bot
WhereML a Serverless ML Powered Location Guessing Twitter Bot
 
Train & Deploy ML Models with Amazon Sagemaker: Collision 2018
Train & Deploy ML Models with Amazon Sagemaker: Collision 2018Train & Deploy ML Models with Amazon Sagemaker: Collision 2018
Train & Deploy ML Models with Amazon Sagemaker: Collision 2018
 
Integrate Machine Learning into Your Spring Application in Less than an Hour
Integrate Machine Learning into Your Spring Application in Less than an HourIntegrate Machine Learning into Your Spring Application in Less than an Hour
Integrate Machine Learning into Your Spring Application in Less than an Hour
 
Modern Applications Development on AWS
Modern Applications Development on AWSModern Applications Development on AWS
Modern Applications Development on AWS
 
Supercharge your Machine Learning Solutions with Amazon SageMaker
Supercharge your Machine Learning Solutions with Amazon SageMakerSupercharge your Machine Learning Solutions with Amazon SageMaker
Supercharge your Machine Learning Solutions with Amazon SageMaker
 
Build Modern Applications that Align with Twelve-Factor Methods (API303) - AW...
Build Modern Applications that Align with Twelve-Factor Methods (API303) - AW...Build Modern Applications that Align with Twelve-Factor Methods (API303) - AW...
Build Modern Applications that Align with Twelve-Factor Methods (API303) - AW...
 
Build, train and deploy ML models with SageMaker (October 2019)
Build, train and deploy ML models with SageMaker (October 2019)Build, train and deploy ML models with SageMaker (October 2019)
Build, train and deploy ML models with SageMaker (October 2019)
 
AWS Toronto Summit 2019 - AIM302 - Build, train, and deploy ML models with Am...
AWS Toronto Summit 2019 - AIM302 - Build, train, and deploy ML models with Am...AWS Toronto Summit 2019 - AIM302 - Build, train, and deploy ML models with Am...
AWS Toronto Summit 2019 - AIM302 - Build, train, and deploy ML models with Am...
 
Serverless AI with Scikit-Learn (GPSWS405) - AWS re:Invent 2018
Serverless AI with Scikit-Learn (GPSWS405) - AWS re:Invent 2018Serverless AI with Scikit-Learn (GPSWS405) - AWS re:Invent 2018
Serverless AI with Scikit-Learn (GPSWS405) - AWS re:Invent 2018
 
CICDforModernApplications-Oslo.pdf
CICDforModernApplications-Oslo.pdfCICDforModernApplications-Oslo.pdf
CICDforModernApplications-Oslo.pdf
 
Mainframe Modernization with AWS: Patterns and Best Practices
Mainframe Modernization with AWS: Patterns and Best PracticesMainframe Modernization with AWS: Patterns and Best Practices
Mainframe Modernization with AWS: Patterns and Best Practices
 
Driving Innovation with Serverless Applications (GPSBUS212) - AWS re:Invent 2018
Driving Innovation with Serverless Applications (GPSBUS212) - AWS re:Invent 2018Driving Innovation with Serverless Applications (GPSBUS212) - AWS re:Invent 2018
Driving Innovation with Serverless Applications (GPSBUS212) - AWS re:Invent 2018
 
CI/CD for Modern Applications
CI/CD for Modern ApplicationsCI/CD for Modern Applications
CI/CD for Modern Applications
 
MLOps with serverless architectures (October 2018)
MLOps with serverless architectures (October 2018)MLOps with serverless architectures (October 2018)
MLOps with serverless architectures (October 2018)
 
[AWS Innovate 온라인 컨퍼런스] Kubernetes와 SageMaker를 활용하여 Machine Learning 워크로드 관리하...
[AWS Innovate 온라인 컨퍼런스] Kubernetes와 SageMaker를 활용하여 Machine Learning 워크로드 관리하...[AWS Innovate 온라인 컨퍼런스] Kubernetes와 SageMaker를 활용하여 Machine Learning 워크로드 관리하...
[AWS Innovate 온라인 컨퍼런스] Kubernetes와 SageMaker를 활용하여 Machine Learning 워크로드 관리하...
 

Mais de Grokking VN

Grokking Techtalk #46: Lessons from years hacking and defending Vietnamese banks
Grokking Techtalk #46: Lessons from years hacking and defending Vietnamese banksGrokking Techtalk #46: Lessons from years hacking and defending Vietnamese banks
Grokking Techtalk #46: Lessons from years hacking and defending Vietnamese banksGrokking VN
 
Grokking Techtalk #45: First Principles Thinking
Grokking Techtalk #45: First Principles ThinkingGrokking Techtalk #45: First Principles Thinking
Grokking Techtalk #45: First Principles ThinkingGrokking VN
 
Grokking Techtalk #43: Payment gateway demystified
Grokking Techtalk #43: Payment gateway demystifiedGrokking Techtalk #43: Payment gateway demystified
Grokking Techtalk #43: Payment gateway demystifiedGrokking VN
 
Grokking Techtalk #40: Consistency and Availability tradeoff in database cluster
Grokking Techtalk #40: Consistency and Availability tradeoff in database clusterGrokking Techtalk #40: Consistency and Availability tradeoff in database cluster
Grokking Techtalk #40: Consistency and Availability tradeoff in database clusterGrokking VN
 
Grokking Techtalk #39: Gossip protocol and applications
Grokking Techtalk #39: Gossip protocol and applicationsGrokking Techtalk #39: Gossip protocol and applications
Grokking Techtalk #39: Gossip protocol and applicationsGrokking VN
 
Grokking Techtalk #39: How to build an event driven architecture with Kafka ...
 Grokking Techtalk #39: How to build an event driven architecture with Kafka ... Grokking Techtalk #39: How to build an event driven architecture with Kafka ...
Grokking Techtalk #39: How to build an event driven architecture with Kafka ...Grokking VN
 
Grokking Techtalk #38: Escape Analysis in Go compiler
 Grokking Techtalk #38: Escape Analysis in Go compiler Grokking Techtalk #38: Escape Analysis in Go compiler
Grokking Techtalk #38: Escape Analysis in Go compilerGrokking VN
 
Grokking Techtalk #37: Data intensive problem
 Grokking Techtalk #37: Data intensive problem Grokking Techtalk #37: Data intensive problem
Grokking Techtalk #37: Data intensive problemGrokking VN
 
Grokking Techtalk #37: Software design and refactoring
 Grokking Techtalk #37: Software design and refactoring Grokking Techtalk #37: Software design and refactoring
Grokking Techtalk #37: Software design and refactoringGrokking VN
 
Grokking TechTalk #35: Efficient spellchecking
Grokking TechTalk #35: Efficient spellcheckingGrokking TechTalk #35: Efficient spellchecking
Grokking TechTalk #35: Efficient spellcheckingGrokking VN
 
Grokking Techtalk #34: K8S On-premise: Incident & Lesson Learned ZaloPay Mer...
 Grokking Techtalk #34: K8S On-premise: Incident & Lesson Learned ZaloPay Mer... Grokking Techtalk #34: K8S On-premise: Incident & Lesson Learned ZaloPay Mer...
Grokking Techtalk #34: K8S On-premise: Incident & Lesson Learned ZaloPay Mer...Grokking VN
 
Grokking TechTalk #33: High Concurrency Architecture at TIKI
Grokking TechTalk #33: High Concurrency Architecture at TIKIGrokking TechTalk #33: High Concurrency Architecture at TIKI
Grokking TechTalk #33: High Concurrency Architecture at TIKIGrokking VN
 
Grokking TechTalk #33: Architecture of AI-First Systems - Engineering for Big...
Grokking TechTalk #33: Architecture of AI-First Systems - Engineering for Big...Grokking TechTalk #33: Architecture of AI-First Systems - Engineering for Big...
Grokking TechTalk #33: Architecture of AI-First Systems - Engineering for Big...Grokking VN
 
SOLID & Design Patterns
SOLID & Design PatternsSOLID & Design Patterns
SOLID & Design PatternsGrokking VN
 
Grokking TechTalk #31: Asynchronous Communications
Grokking TechTalk #31: Asynchronous CommunicationsGrokking TechTalk #31: Asynchronous Communications
Grokking TechTalk #31: Asynchronous CommunicationsGrokking VN
 
Grokking TechTalk #30: From App to Ecosystem: Lessons Learned at Scale
Grokking TechTalk #30: From App to Ecosystem: Lessons Learned at ScaleGrokking TechTalk #30: From App to Ecosystem: Lessons Learned at Scale
Grokking TechTalk #30: From App to Ecosystem: Lessons Learned at ScaleGrokking VN
 
Grokking TechTalk #29: Building Realtime Metrics Platform at LinkedIn
Grokking TechTalk #29: Building Realtime Metrics Platform at LinkedInGrokking TechTalk #29: Building Realtime Metrics Platform at LinkedIn
Grokking TechTalk #29: Building Realtime Metrics Platform at LinkedInGrokking VN
 
Grokking TechTalk #27: Optimal Binary Search Tree
Grokking TechTalk #27: Optimal Binary Search TreeGrokking TechTalk #27: Optimal Binary Search Tree
Grokking TechTalk #27: Optimal Binary Search TreeGrokking VN
 
Grokking TechTalk #26: Kotlin, Understand the Magic
Grokking TechTalk #26: Kotlin, Understand the MagicGrokking TechTalk #26: Kotlin, Understand the Magic
Grokking TechTalk #26: Kotlin, Understand the MagicGrokking VN
 
Grokking TechTalk #26: Compare ios and android platform
Grokking TechTalk #26: Compare ios and android platformGrokking TechTalk #26: Compare ios and android platform
Grokking TechTalk #26: Compare ios and android platformGrokking VN
 

Mais de Grokking VN (20)

Grokking Techtalk #46: Lessons from years hacking and defending Vietnamese banks
Grokking Techtalk #46: Lessons from years hacking and defending Vietnamese banksGrokking Techtalk #46: Lessons from years hacking and defending Vietnamese banks
Grokking Techtalk #46: Lessons from years hacking and defending Vietnamese banks
 
Grokking Techtalk #45: First Principles Thinking
Grokking Techtalk #45: First Principles ThinkingGrokking Techtalk #45: First Principles Thinking
Grokking Techtalk #45: First Principles Thinking
 
Grokking Techtalk #43: Payment gateway demystified
Grokking Techtalk #43: Payment gateway demystifiedGrokking Techtalk #43: Payment gateway demystified
Grokking Techtalk #43: Payment gateway demystified
 
Grokking Techtalk #40: Consistency and Availability tradeoff in database cluster
Grokking Techtalk #40: Consistency and Availability tradeoff in database clusterGrokking Techtalk #40: Consistency and Availability tradeoff in database cluster
Grokking Techtalk #40: Consistency and Availability tradeoff in database cluster
 
Grokking Techtalk #39: Gossip protocol and applications
Grokking Techtalk #39: Gossip protocol and applicationsGrokking Techtalk #39: Gossip protocol and applications
Grokking Techtalk #39: Gossip protocol and applications
 
Grokking Techtalk #39: How to build an event driven architecture with Kafka ...
 Grokking Techtalk #39: How to build an event driven architecture with Kafka ... Grokking Techtalk #39: How to build an event driven architecture with Kafka ...



Grokking Techtalk #40: AWS’s philosophy on designing MLOps platform

  • 1. © 2019, Amazon Web Services, Inc. or its Affiliates. My Nguyen – Solutions Architect – Amazon Web Services Vietnam. AWS’s philosophy on designing MLOps platform. Dec 2020
  • 2. Agenda • What is MLOps? • DevOps vs MLOps • DevOps practices inheritance • Machine learning development lifecycle • Unique driving factors to MLOps • Personas • Unique challenges faced by ML workload • MLOps practices on Amazon SageMaker • Complete separation of steps (and their environments) • Versioning & tracking • Pipeline automation • Continuous improvement • Demo • QnA
  • 3. What is MLOps? Operationalizing machine learning workloads
  • 4. DevOps vs MLOps
  • 5. Notes: Technology is just a piece of the overall picture
  • 6. DevOps practices inheritance • Communication & collaboration • Continuous integration • Continuous delivery/deployment • Microservices design • Infrastructure-as-code & configuration-as-code • Continuous monitoring & logging
  • 7. Machine learning development lifecycle
  • 8. Unique driving factors to MLOps
  • 9. Personas • Business stakeholder • Data scientist • Domain expert • Data engineer • Security engineer • Machine learning/DevOps engineer • Software engineer. All with different skillsets & priorities
  • 10. Unique challenges • Data: • The need to utilize production data in development activities • Dependencies on data pipelines • Longer experiment lifecycles • Output of model artifacts: • Independent lifecycles between model and integrated applications/systems • Monitoring & tracking of experiments and models • Unique metrics for performance evaluation
  • 11. MLOps practices on Amazon SageMaker
  • 12. Complete separation of steps • Data processing • Explore & Build • Train & Validate • Deploy • Monitor
  • 13. Versioning & tracking of every step
  • 14. Pipeline automation • Metaflow • Apache Airflow • AWS Step Functions • Kubeflow • Flyte
  • 15. SageMaker workflow. The notebook: An entry-point / studio / IDE. [Diagram labels: Data Scientists; Notebook: Explore and Interact; SageMaker Container Runtime; Elastic Container Registry (ECR); Simple Storage Service (S3)]
  • 16. SageMaker workflow. Prepare data and script; find or build container image(s). [Diagram labels: Data Scientists; Notebook: Explore and Interact; SageMaker Container Runtime; Elastic Container Registry (ECR); Simple Storage Service (S3); Training Data; Custom Code; Training Image; Framework Code]
  • 17. SageMaker workflow. Run a training job to create a model artifact. [Diagram labels: Data Scientists; Notebook: Explore and Interact; SageMaker Container Runtime; Elastic Container Registry (ECR); Simple Storage Service (S3); Training Job; Custom; Framework; Data; model.tar.gz; Training Data; Custom Code; Training Image; Framework Code]
  • 18. SageMaker workflow. Deploy the model to a real-time inference endpoint. [Diagram labels: Data Scientists; Notebook: Explore and Interact; SageMaker Container Runtime; Elastic Container Registry (ECR); Simple Storage Service (S3); Inference Endpoint; Custom; Framework; Model; Inference Image; model.tar.gz; Training Data; Custom Code; Training Image; Framework Code; Inference Requests]
  • 19. SageMaker workflow. (…Or run a batch transform job). [Diagram labels: Data Scientists; Notebook: Explore and Interact; SageMaker Container Runtime; Elastic Container Registry (ECR); Simple Storage Service (S3); Transform Job; Custom; Framework; Model; Inference Image; model.tar.gz; Custom Code; Training Image; Framework Code; Input Data; Results]
  • 20. SageMaker workflow. [Diagram labels: Data Scientists; Notebook: Explore and Interact; SageMaker Container Runtime; Elastic Container Registry (ECR); Simple Storage Service (S3); Training Job; Endpoint / Transformer; Custom; Framework; Model; Data; Inference Image; model.tar.gz; Training Data; Custom Code; Training Image; Framework Code; Inference Requests]
  • 21. Continuous improvement • SageMaker Hosting Services • SageMaker Batch Transform • SageMaker Notebooks • SageMaker Autopilot • SageMaker Experiments • SageMaker GroundTruth • SageMaker Processing • SageMaker Model Monitor • Amazon Augmented AI • SageMaker Training • SageMaker Debugger • SageMaker Hyperparameter Tuning • SageMaker Studio, the First Fully Integrated Development Environment For Machine Learning
  • 22. Demo. Transformation from local notebook to SageMaker workflow
  • 23. The bigger picture
  • 24. QnA. References: https://d1.awsstatic.com/whitepapers/architecture/wellarchitected-Machine-Learning-Lens.pdf https://github.com/aws-samples/aws-stepfunctions-byoc-mlops-using-data-science-sdk https://github.com/apac-ml-tfc/sagemaker-workshop-101
  • 25. Thank you! My Nguyen - https://www.linkedin.com/in/mynguyen6512/

Editor's Notes

  1. Built on an existing foundation; younger (less mature)
  2. Also pipeline-as-code & policy-as-code
  3. Different skillset & priorities
  4. Also pipeline-as-code & policy-as-code
  5. Code versioning controls; shared environments and IDE (Jupyter Notebook/Lab); infrastructure as code; self-service environment; SaaS
  6. Most importantly: training & processing. Separation of source, environments, etc. Security. Experiment lifecycles. Pricing. Efficiency.
  7. Reproducibility is hard. End-to-end traceability. Dashboard ->
  8. Netflix built Metaflow; Lyft built Flyte; Kubeflow and Apache Airflow are the other options shown. Important factor: skill set & enforcement.
     Metaflow
     • Built by Netflix; Netflix is a huge customer of AWS
     • In production since 2018
     • Made open source by Netflix & AWS in 2019
     • What is it? Basic concepts of Metaflow
     • Deploying to AWS is easy
     Flyte
     • A Kubernetes-native distributed workflow orchestrator used at Lyft for data science, pricing, fraud detection, locations, ETA, and more
     • Enables highly concurrent, scalable workflows for ML and data processing
     • Core concepts of Flyte: task, DAG, workflows, control flow specification
     • The actual task can be in any language; tasks are executed as containers
     • Provisions necessary resources dynamically, executes tasks as Docker containers, and de-provisions resources when tasks are complete to control costs
     • Supports execution across 100s of machines, e.g. production model training
     Kubeflow and Airflow are fairly popular.
     Airflow
     • Amazon SageMaker with Apache Airflow 1.10.1
     • If you use Airflow, you can use SageMaker Workflow in Apache Airflow
     • More details: https://sagemaker.readthedocs.io/en/stable/using_workflow.html
     Kubeflow / Kubernetes
     Many customers want to use the fully managed capabilities of Amazon SageMaker for machine learning, but also want platform and infrastructure teams to continue using Kubernetes for orchestration and managing pipelines. SageMaker addresses this requirement by letting Kubernetes users train and deploy models in SageMaker using SageMaker-Kubeflow operators and pipelines. With operators and pipelines, Kubernetes users can access fully managed SageMaker ML tools and engines natively from Kubeflow. This eliminates the need to manually manage and optimize ML infrastructure in Kubernetes while still preserving control of overall orchestration through Kubernetes. Using SageMaker operators and pipelines for Kubernetes, you can get the benefits of a fully managed service for machine learning in Kubernetes without migrating workloads.
     • If you use Kubernetes, you can use SageMaker Operators for Kubernetes
     • You can install the SageMaker Operator for Kubernetes using the provided Helm chart
     • Once the operator is installed, Kubernetes users can natively invoke SageMaker features like model training, hyperparameter tuning, and batch transform jobs
     • They can also set up model serving using SageMaker Model Hosting Services
     • https://sagemaker.readthedocs.io/en/stable/amazon_sagemaker_operators_for_kubernetes.html#what-is-an-operator
     • https://eksworkshop.com/advanced/420_kubeflow/pipelines/
     AWS Step Functions
     • We see customers build serverless ML workflows using AWS Step Functions
     • Open source: Step Functions Data Science SDK for SageMaker
     • Create workflows to pre-process data and train/deploy models using SageMaker
     • Data pre-processing can be done using AWS Glue
     • SageMaker functionality like model training, HPO, and endpoint creation is accessible
     • Use the SDK to create and visualize the workflows
     • Scale workflows without having to worry about infrastructure
     • https://aws.amazon.com/about-aws/whats-new/2019/11/introducing-aws-step-functions-data-science-sdk-amazon-sagemaker/
     Many good tools exist, and you can run any of the tools we saw earlier on AWS. Remember: tools are meant to make your life easier, so don’t get fixated on them. Work backwards from the problem you are trying to solve. Think about your existing software engineering workflows and tools, then ask yourself which tools will best augment what you already have and which tools your people are most comfortable with. The AWS approach is: use the tools that work for you.
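To make the Step Functions option in note 8 a little more concrete, here is a minimal sketch of a state machine definition with a single training step, written as a plain Python dictionary in Amazon States Language. It deliberately does not use the Step Functions Data Science SDK mentioned in the note; it only shows the kind of JSON definition such a workflow boils down to. The function name `training_state_machine`, the `$.job_name` input field, the channel name `train`, the instance type, and the retry settings are all illustrative assumptions, not values from the talk.

```python
import json


def training_state_machine(image_uri, role_arn, train_s3_uri, output_s3_uri):
    """Return a States Language definition with one SageMaker training task.

    Hypothetical helper for illustration: all argument values are supplied
    by the caller and nothing here talks to AWS.
    """
    return {
        "Comment": "Sketch: run one SageMaker training job and wait for it",
        "StartAt": "Train",
        "States": {
            "Train": {
                "Type": "Task",
                # The ".sync" suffix makes the state wait until the job ends.
                "Resource": "arn:aws:states:::sagemaker:createTrainingJob.sync",
                "Parameters": {
                    # Assumed: the execution input carries a "job_name" field.
                    "TrainingJobName.$": "$.job_name",
                    "RoleArn": role_arn,
                    "AlgorithmSpecification": {
                        "TrainingImage": image_uri,
                        "TrainingInputMode": "File",
                    },
                    "InputDataConfig": [
                        {
                            "ChannelName": "train",
                            "DataSource": {
                                "S3DataSource": {
                                    "S3DataType": "S3Prefix",
                                    "S3Uri": train_s3_uri,
                                    "S3DataDistributionType": "FullyReplicated",
                                }
                            },
                        }
                    ],
                    "OutputDataConfig": {"S3OutputPath": output_s3_uri},
                    "ResourceConfig": {
                        "InstanceCount": 1,
                        "InstanceType": "ml.m5.large",
                        "VolumeSizeInGB": 10,
                    },
                    "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
                },
                # Assumed retry policy: retry a failed task twice with backoff.
                "Retry": [
                    {
                        "ErrorEquals": ["States.TaskFailed"],
                        "IntervalSeconds": 30,
                        "MaxAttempts": 2,
                        "BackoffRate": 2.0,
                    }
                ],
                "End": True,
            }
        },
    }
```

Serializing the result with `json.dumps(...)` gives a definition that could be supplied when creating a state machine; whether to write it by hand like this or generate it with an SDK is exactly the "use the tools that work for you" choice the note describes.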
  9. It is easy to think of SageMaker as just a notebook. The key thing to remember is that the notebook UI we see a lot in the demos is only one part of the SageMaker platform, and an optional part at that! The notebook is the front-end environment in which we’ll experiment with our data and code. Keep that instance a low-cost resource. Value of separation…
     When we’re ready to train or deploy a model, we’ll be spinning up separate, dedicated infrastructure in the SageMaker container runtime, which means we have lots of flexibility to choose resources cost-effectively and only pay for what we need. All managed.
     The orchestration that SageMaker gives us to make this happen is closely integrated with these other two services:
     • The images defining our containers need to be stored in Amazon ECR (there’s not currently an integration for external registries like DockerHub, but if you have a particular technology in mind our service team would appreciate the feedback!).
     • The preferred storage platform for not just our input data but also model artifacts and other outputs generated in the workflow is Amazon S3. Why? <The generic S3 pitch – it’s got everything you need for a data lake> Most integrated service, arguably most mature, tiers, security models, high durability.
     Recapping: 4 things.
     So let’s look at how that end-to-end process works.
  10. To start with I have:
     • The data that I want to train on (prepared and loaded to S3): pre-processed already, in the notebook, but there is also the option for other services like Glue or Processing Jobs to …
     • The training script I’d like to run (e.g. defining the neural network shape and fitting routine), on the notebook instance where I’m working. Minimum code.
     • One of the pre-prepared SageMaker framework container images somewhere in Amazon ECR, maybe TensorFlow, PyTorch, or MXNet. Repeatable, controlled, reproducible.
  11. So what’s happening when we start a training job by calling “estimator.fit()” in those examples from before? We’re going to start seeing a lot of arrows here, so the thing to remember is that all of the arrows are things *SageMaker is doing for you*, not things you need to do yourself!
     • First, assuming you provide a custom code script (or folder of code), the SageMaker SDK is going to zip that up and upload it to a new location in S3. So it no longer matters if you forget to check your working version in to git: you won’t lose track of the version that worked well in the middle of your experiments, because the results are traceable to the code that created them.
     • Next, SageMaker spins up whatever infrastructure you asked for in the fit() request and pulls down the Docker image to run on it.
     • SageMaker also downloads your source data from S3 into the container. No messing about with S3 API calls in your script: your code can read the data from a folder, just as if you were running locally. Env params…
     • As the container fires up, the framework application does a load of helpful prep, and one part is particularly important: it installs any additional inline dependencies specified for your custom code, then starts it up and passes in the parameters of the training job.
     • Your code runs, prints status to the console, and saves the trained model to disk just like you normally would. SageMaker takes care of zipping and uploading that final model to S3, and also of other output mechanisms like sending the logs to CloudWatch and collecting metrics. Pay only for …
     So the benefit we’ve gained here is that our custom code can be quite simple: load a CSV from file, make a random forest, save it to file, etc. We can even specify additional dependencies via a requirements.txt file, and SageMaker plus the framework container will orchestrate these overhead tasks to give us this nice lineage-traceable workflow with all of the features we talked about earlier, with no extra code complexity required on our part.
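As a rough illustration of how small the custom training code in note 11 can be, here is a sketch of a script that reads CSV files from a local folder and writes a model file to a local folder. To stay dependency-free it swaps the random forest from the note for a trivial mean predictor. It assumes the framework container exposes the input folder as `SM_CHANNEL_TRAIN` (that is, the input channel was named `train`) and the output folder as `SM_MODEL_DIR`; the column name `y`, the file name `model.json`, and the function names are hypothetical.

```python
import argparse
import csv
import json
import os


def train(train_dir, model_dir, target="y"):
    """Fit a mean predictor on every CSV in train_dir and save it as JSON."""
    values = []
    for name in sorted(os.listdir(train_dir)):
        if not name.endswith(".csv"):
            continue
        with open(os.path.join(train_dir, name), newline="") as f:
            for row in csv.DictReader(f):
                values.append(float(row[target]))
    if not values:
        raise ValueError(f"no '{target}' values found under {train_dir}")

    # Stand-in for a real model: just remember the mean of the target column.
    model = {"mean": sum(values) / len(values), "n_rows": len(values)}

    os.makedirs(model_dir, exist_ok=True)
    path = os.path.join(model_dir, "model.json")
    with open(path, "w") as f:
        json.dump(model, f)
    return path


def main(argv=None):
    """Parse folders from arguments, falling back to container env variables."""
    parser = argparse.ArgumentParser()
    # Assumed env variables; the local defaults let the script run on a laptop.
    parser.add_argument("--train", default=os.environ.get("SM_CHANNEL_TRAIN", "data"))
    parser.add_argument("--model-dir", default=os.environ.get("SM_MODEL_DIR", "model"))
    parser.add_argument("--target", default="y")
    args, _unknown = parser.parse_known_args(argv)
    path = train(args.train, args.model_dir, args.target)
    print("saved model to", path)
    return path
```

A real entry-point script would end with a call to `main()`. Note that the script contains no S3 calls at all: it only reads and writes local folders, which is what lets the same file run unchanged in a notebook or inside a training container.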
  12. When it’s time to deploy that model to an inference endpoint, we simply reference:
     • Our model artifact tarball from S3
     • An inference container (which might be the same one as for training, or might be a different image because the dependencies could be optimized differently for run-time)
     • And maybe some custom code again: this time just defining some helper functions that we might want to customize from the built-in inference flow, such as how to de/serialize requests and responses, or how the model file(s) need to be loaded from disk into memory if the process is different from standard. How it’s optimized.
     As in training, SageMaker handles the creation of infrastructure and loading of these components for us. If we used the ‘estimator’ pattern from the high-level SageMaker SDK, all we need to call is a single estimator.deploy(…) function to make it happen.
     Again, the intent here is that any custom code needed can be small: just providing a few optional functions for serialization, model loading, etc., rather than writing and having to maintain a model server, integrations with TorchServe or TensorFlow Serving, etc. Custom input format (JSON)…
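The optional helper functions mentioned in note 12 might look roughly like the sketch below. The hook names `model_fn`, `input_fn`, `predict_fn`, and `output_fn` follow the convention used by several SageMaker framework containers, but exact signatures and return types differ between containers, so treat this as an assumption to check against the container you use. The `model.json` artifact, the `{"instances": [...]}` request shape, and the mean-predictor "model" are hypothetical.

```python
import json
import os


def model_fn(model_dir):
    """Load the model artifact that was unpacked into model_dir."""
    with open(os.path.join(model_dir, "model.json")) as f:
        return json.load(f)


def input_fn(request_body, request_content_type):
    """Deserialize a request; only JSON like {"instances": [...]} is accepted."""
    if request_content_type != "application/json":
        raise ValueError(f"unsupported content type: {request_content_type}")
    return json.loads(request_body)["instances"]


def predict_fn(input_data, model):
    """Return the stored mean for every instance (stand-in for a real model)."""
    return [model["mean"] for _ in input_data]


def output_fn(prediction, accept):
    """Serialize predictions back to JSON."""
    if accept != "application/json":
        raise ValueError(f"unsupported accept type: {accept}")
    return json.dumps({"predictions": prediction})
```

Chained locally, `output_fn(predict_fn(input_fn(body, "application/json"), model_fn(folder)), "application/json")` mimics one request passing through the endpoint, which makes the hooks easy to unit-test before any deployment.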
  13. Not today, but… In SageMaker, batch transform jobs function pretty much identically to real-time inference endpoints from a user code point of view: the batch transform engine handles reading your source data from S3, feeding it through your model, storing the results back to S3, and shutting down the resources again as soon as the job is done. Pay only for…
  14. Mechanism: what is easiest for different personas? Skillset dependency and learning curve.
     So that’s our overview picture for framework containers: you write pretty minimal code, just as you usually would for experimenting in your notebook. But instead of running that code locally, which can make things like infrastructure optimization, experiment tracking, and inference deployment tricky, SageMaker provides some nice streamlined, high-level APIs to trigger containerized training and inference jobs (or deploy endpoints) on separate infrastructure.
     At the fundamental level, the system is super flexible because you can make fully custom container images and model artifact tarballs. But the framework container images, together with the SageMaker SDK library (for your notebook), enable this higher-level, container-plus-custom-code workflow.
     Same as this morning, just a different drawing. Solves problems with experimenting, tracking, etc.
  15. Also lessons learnt & best practices
  16. The Repeatable stage is generally focused on applying automation as the number of machine learning workloads running in production increases. In general, at this stage many of the activities in building, training, and deploying machine learning models are automated. The introduction of automation reduces manual hand-offs between teams and reduces the operational overhead of previously manual/ad-hoc tasks. The ability to orchestrate machine learning workflows into automated pipelines also depends on having a data strategy and automated data processing tasks.
     • Queue Management: ability to manage, schedule, and prioritize tasks
     • Resource Management: access to horizontally scalable compute that can scale based on workflow task requirements
     • Workflow Operators: error handling, retry, and conditional logic functions
     • Workflow Logs: centralized logs and configuration parameters for execution and task level logs
     The Reliable stage builds on the automation from the Repeatable stage but aims to ensure automation is balanced with practices that increase quality, enable end-to-end traceability, increase reliability through automatic rollbacks, increase visibility into development and operational health, and ensure repeatability. In general, at this stage the MLOps practices of Infrastructure-as-Code/Configuration-as-Code, Continuous Integration, Continuous Delivery/Deployment, and Continuous Monitoring are introduced.