Training and deploying machine learning models with GCP
Maciej Pieńkosz
Data Science Summit 2020
What we do at Sotrender
Our models
1. Sentiment
2. Hatespeech
3. Topic modelling
4. Keyphrase extractor
5. NER (brands and products)
6. Image Tagger
7. Text Extractor
8. Logo Detector
9. Post Classifier
10. ….
ML models lifecycle
1. Planning and project setup
2. Data collection and labeling
3. Modeling and exploration
4. Model training and refinement
5. Testing and evaluation
6. Model deployment
7. Ongoing model maintenance and monitoring
https://www.jeremyjordan.me/ml-projects-guide/
Modeling with AI Notebooks
1. We use Google Cloud Platform as our cloud provider
2. AI Platform Notebooks is used for initial data exploration and modeling
3. To start, we favor faster, simpler model architectures that can be easily built,
validated, iterated on, and eventually deployed (usually on CPU)
4. Experiment tracking: MLflow
https://databricks.com/blog/2018/06/05/introducing-mlflow-an-open-source-machine-learning-platform.html
https://cloud.google.com/ai-platform-notebooks?hl=id
Structuring training code
• Notebook disadvantages:
– You pay for the whole time the notebook is running
– Code quality is usually lower
– Hard to parametrize, unit test, and review
• After the initial experimentation phase, we give the model training
code more structure:
– Refactor the codebase into Python packages and modules and move
it to a Git repository (GitLab)
– Add tests (more on this later)
– Wrap the code in a Docker container
– Use the dedicated AI Platform Training service to train in the cloud
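As a rough illustration of what "more structure" can mean, the sketch below shows a training module whose hyperparameters come from command-line flags rather than notebook cells, which makes runs easier to parametrize and unit test. It is not the code from the talk: the flag names, the `TrainConfig` fields, and the placeholder `train` function are all assumptions made for the example.

```python
import argparse
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Paths and hyperparameters for one training run (all names hypothetical)."""
    data_path: str
    model_dir: str
    learning_rate: float = 0.01
    epochs: int = 10


def parse_args(argv=None) -> TrainConfig:
    """Turn command-line flags into a typed config object."""
    parser = argparse.ArgumentParser(description="Train a model (sketch).")
    parser.add_argument("--data-path", required=True)
    parser.add_argument("--model-dir", required=True)
    parser.add_argument("--learning-rate", type=float, default=0.01)
    parser.add_argument("--epochs", type=int, default=10)
    ns = parser.parse_args(argv)
    return TrainConfig(ns.data_path, ns.model_dir, ns.learning_rate, ns.epochs)


def train(config: TrainConfig) -> dict:
    """Placeholder for the real training loop; returns a summary of the run."""
    # A real implementation would load data from config.data_path,
    # fit a model, and write artifacts to config.model_dir.
    return {"epochs_run": config.epochs, "learning_rate": config.learning_rate}
```

A container entry point could then call `train(parse_args())`, and a unit test can call `parse_args([...])` with an explicit argument list instead of touching `sys.argv`.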
https://www.jeremyjordan.me/ml-projects-guide/
AI Platform Training with custom containers
• Advantages:
– Develop locally, train in the cloud
– Pay only for the time of training
– Broad configuration options
– Job statuses and logs for historical runs
are available in the dashboard
– Easy integration with hyperparameter
tuning
[Figures: training job Dockerfile; cloud training script]
Google Cloud Storage for Models and Datasets
• We use Google Cloud Storage as the primary store for models
and datasets
• One bucket per model
• We follow a unified bucket and directory structure,
the same for every model
– Raw data
– Combined datasets, with predefined splits
– Model files
• Documentation in Knowledge Base (Confluence)
• Alternatively, one can use dedicated systems such as DVC or Quilt
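A unified layout is easiest to keep consistent when paths are built in one place. The sketch below is only an illustration of that idea: the bucket naming scheme (`ml-<model>`), the directory names (`raw`, `datasets`, `models`), and the split names are hypothetical and are not taken from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelBucket:
    """Path layout for one model's bucket (naming scheme is hypothetical)."""
    model_name: str
    prefix: str = "ml"

    @property
    def root(self) -> str:
        # One bucket per model, e.g. gs://ml-sentiment
        return f"gs://{self.prefix}-{self.model_name}"

    def raw_data(self, filename: str) -> str:
        return f"{self.root}/raw/{filename}"

    def dataset(self, version: str, split: str) -> str:
        # Combined datasets are stored with predefined splits.
        if split not in ("train", "dev", "test"):
            raise ValueError(f"unknown split: {split}")
        return f"{self.root}/datasets/{version}/{split}.csv"

    def model_file(self, version: str, filename: str) -> str:
        return f"{self.root}/models/{version}/{filename}"
```

Because every model goes through the same class, training and serving code can locate data and model files without per-model special cases.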
Additional training tips
• Consider having two validation sets: training-dev and
test-dev, to distinguish between overfitting errors and
distribution shift
• Establish human performance for your task
• Evaluate your model performance on important data
slices
• Do hyperparameter tuning; use open-source packages,
e.g. hyperopt
• Develop a systematic way of analyzing model errors
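The first two tips can be combined into a simple error breakdown. The sketch below assumes you have measured four error rates: human-level error, training error, error on training-dev (held out, same distribution as training), and error on test-dev (the target distribution). The function and key names are illustrative, not from the talk.

```python
def error_breakdown(human_error, train_error, train_dev_error, test_dev_error):
    """Split the test-dev error into three additive gaps."""
    return {
        # Gap between the model on training data and human performance.
        "avoidable_bias": train_error - human_error,
        # Gap between seen and unseen data from the same distribution.
        "overfitting": train_dev_error - train_error,
        # Gap between the training distribution and the target distribution.
        "distribution_shift": test_dev_error - train_dev_error,
    }


def largest_gap(breakdown):
    """Name of the gap that contributes most to the error."""
    return max(breakdown, key=breakdown.get)
```

For example, with errors of 1% (human), 2% (train), 3% (training-dev), and 10% (test-dev), the largest gap is the distribution shift, which suggests working on the data rather than on regularization.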
Recommended resources:
• https://www.coursera.org/learn/machine-learning-projects
• https://www.deeplearning.ai/machine-learning-yearning/
https://towardsdatascience.com/some-strategies-for-machine-learning-projects-5f2f32c34635
Model deployment
• Your options:
– Online
– Batch (offline)
• Our approach is to deploy models as services
– Easy to integrate
– Easy to use by other teams
• We serve them as REST services with Flask (or, most
recently, FastAPI)
• We wrap them in Docker containers so they can be
easily deployed to the cloud and served with Cloud Run
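To make the "model as a service" idea concrete, here is a minimal sketch of a prediction endpoint. It deliberately uses only the standard library's WSGI interface rather than Flask or FastAPI, so it runs without extra dependencies; the `/predict` route, the JSON schema, and the keyword-based `predict_sentiment` stand-in are assumptions made for the example.

```python
import json


def predict_sentiment(text: str) -> str:
    """Toy stand-in for a real model (hypothetical keyword rule)."""
    return "positive" if "good" in text.lower() else "negative"


def _json_response(start_response, status, payload):
    """Send a JSON body with the given HTTP status line."""
    body = json.dumps(payload).encode("utf-8")
    start_response(status, [("Content-Type", "application/json"),
                            ("Content-Length", str(len(body)))])
    return [body]


def app(environ, start_response):
    """WSGI application exposing POST /predict with a JSON body."""
    if (environ.get("REQUEST_METHOD") != "POST"
            or environ.get("PATH_INFO") != "/predict"):
        return _json_response(start_response, "404 Not Found",
                              {"error": "not found"})
    try:
        length = int(environ.get("CONTENT_LENGTH") or 0)
        payload = json.loads(environ["wsgi.input"].read(length))
        text = payload["text"]
        if not isinstance(text, str):
            raise TypeError("text must be a string")
    except (ValueError, KeyError, TypeError):
        return _json_response(start_response, "400 Bad Request",
                              {"error": "expected JSON with a 'text' field"})
    return _json_response(start_response, "200 OK",
                          {"label": predict_sentiment(text)})
```

For a local try-out, `wsgiref.simple_server.make_server("", 8080, app).serve_forever()` serves the app; in a Flask or FastAPI service the same validation and prediction logic would sit inside a route handler.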
https://mlinproduction.com/batch-inference-vs-online-inference/
[Figures: online inference; batch inference]
Cloud Deployment: Cloud Run
• We use Cloud Run to deploy our model services
• We use Cloud Build to delegate the build process to GCP
• GCP has a dedicated service for serving models, AI
Platform Prediction, but we use Cloud Run
– It is more flexible for us: we can set up any
environment and add any dependencies
– AI Platform Prediction has limits on model
size
– We can add additional endpoints to services
(e.g. /explain)
[Figures: service Dockerfile; cloud deployment script]
Cloud Run (cont.)
• Useful features out of the box
– Autoscaling
– Multiple Revisions (versions), easy Rollback
– Traffic management
– Multiple Namespaces (dev, prod)
– Resource Monitoring
Delivery pipeline automation (CI/CD)
• Implemented in GitLab CI/CD
[Figure: delivery pipeline with steps labeled push, download files, build image, run tests, run static analysis, push image to registry, code review, canary rollout, deploy]
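The canary rollout step in the pipeline can be sketched as plain control logic: traffic moves to the new revision in stages, and everything rolls back if a health check fails. This is an illustration only, not the pipeline from the talk; the step percentages and the `is_healthy` callback are hypothetical, and in a real setup the traffic split itself would be applied through the platform's traffic management.

```python
def canary_rollout(steps, is_healthy):
    """Shift traffic to a new revision step by step; roll back on failure.

    steps: increasing traffic percentages for the new revision, e.g. [10, 50, 100].
    is_healthy: callable taking the current percentage and returning True if
        the new revision looks healthy at that traffic level.
    Returns (final_percentage, history).
    """
    history = []
    for percent in steps:
        history.append(percent)
        if not is_healthy(percent):
            # Roll back: all traffic returns to the previous revision.
            history.append(0)
            return 0, history
    return (steps[-1] if steps else 0), history
```

The returned history records every traffic level that was tried, which is useful when reviewing why a rollout stopped.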
Testing and evaluation
• Unit and integration tests for:
– Input pipelines
– Preprocessing functions
• “Regression” tests for:
– Performance on validation data
– Predictions on some important, hand-picked examples
– Performance on data slices
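The three kinds of "regression" tests can share one small checker. The sketch below assumes a classifier exposed as a callable from text to label and examples given as `(text, label)` pairs; the function names and the accuracy threshold are illustrative choices, not values from the talk.

```python
def accuracy(model, examples):
    """Fraction of (text, label) pairs the model predicts correctly."""
    if not examples:
        raise ValueError("examples must not be empty")
    correct = sum(1 for text, label in examples if model(text) == label)
    return correct / len(examples)


def check_model(model, validation, golden, slices, min_accuracy=0.8):
    """Return a list of human-readable failures; empty means the model passes."""
    failures = []
    # 1. Performance on validation data.
    overall = accuracy(model, validation)
    if overall < min_accuracy:
        failures.append(f"validation accuracy {overall:.2f} < {min_accuracy}")
    # 2. Predictions on important, hand-picked examples.
    for text, expected in golden:
        if model(text) != expected:
            failures.append(f"golden example failed: {text!r}")
    # 3. Performance on data slices.
    for name, examples in slices.items():
        score = accuracy(model, examples)
        if score < min_accuracy:
            failures.append(f"slice {name!r} accuracy {score:.2f} < {min_accuracy}")
    return failures
```

A CI job can run `check_model` on each candidate model and fail the build when the returned list is not empty.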
Monitoring
• System-level metrics:
– Resource consumption (RAM, CPU), healthchecks, status codes, latency, etc.
• Data-level metrics:
– Prediction distributions, input data distributions
– System performance against real-time labels (collected automatically or manually)
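One simple way to monitor prediction distributions is to compare the label frequencies from a reference period with those from recent traffic. The slides do not name a metric; the sketch below uses total variation distance as one possible choice, and any alerting threshold built on top of it would be an assumption to tune per model.

```python
from collections import Counter


def label_distribution(labels):
    """Relative frequency of each predicted label."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}


def total_variation_distance(reference, live):
    """Half the L1 distance between two distributions (0 = identical, 1 = disjoint)."""
    labels = set(reference) | set(live)
    return 0.5 * sum(
        abs(reference.get(label, 0.0) - live.get(label, 0.0)) for label in labels
    )
```

For example, a shift from a 50/50 split between two labels to a 75/25 split gives a distance of 0.25.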
https://mlinproduction.com/
Streamlit
• https://www.streamlit.io/
• An easy tool for creating simple web data products directly in Python
• You can use it to create demos, share your work, showcase your model's behaviour, and debug
• Very intuitive, no web development skills required
https://towardsdatascience.com/coding-ml-tools-like-you-code-ml-models-ddba3357eace
Demos with Streamlit
Thanks for attending!

 
Bank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis ProjectBank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis ProjectBoston Institute of Analytics
 

Último (20)

What To Do For World Nature Conservation Day by Slidesgo.pptx
What To Do For World Nature Conservation Day by Slidesgo.pptxWhat To Do For World Nature Conservation Day by Slidesgo.pptx
What To Do For World Nature Conservation Day by Slidesgo.pptx
 
Statistics For Management by Richard I. Levin 8ed.pdf
Statistics For Management by Richard I. Levin 8ed.pdfStatistics For Management by Richard I. Levin 8ed.pdf
Statistics For Management by Richard I. Levin 8ed.pdf
 
modul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptxmodul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptx
 
Insurance Churn Prediction Data Analysis Project
Insurance Churn Prediction Data Analysis ProjectInsurance Churn Prediction Data Analysis Project
Insurance Churn Prediction Data Analysis Project
 
2023 Survey Shows Dip in High School E-Cigarette Use
2023 Survey Shows Dip in High School E-Cigarette Use2023 Survey Shows Dip in High School E-Cigarette Use
2023 Survey Shows Dip in High School E-Cigarette Use
 
Decoding Patterns: Customer Churn Prediction Data Analysis Project
Decoding Patterns: Customer Churn Prediction Data Analysis ProjectDecoding Patterns: Customer Churn Prediction Data Analysis Project
Decoding Patterns: Customer Churn Prediction Data Analysis Project
 
knowledge representation in artificial intelligence
knowledge representation in artificial intelligenceknowledge representation in artificial intelligence
knowledge representation in artificial intelligence
 
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...
 
Data Analysis Project: Stroke Prediction
Data Analysis Project: Stroke PredictionData Analysis Project: Stroke Prediction
Data Analysis Project: Stroke Prediction
 
Introduction to Mongo DB-open-­‐source, high-­‐performance, document-­‐orient...
Introduction to Mongo DB-open-­‐source, high-­‐performance, document-­‐orient...Introduction to Mongo DB-open-­‐source, high-­‐performance, document-­‐orient...
Introduction to Mongo DB-open-­‐source, high-­‐performance, document-­‐orient...
 
Digital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing worksDigital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing works
 
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
 
Rithik Kumar Singh codealpha pythohn.pdf
Rithik Kumar Singh codealpha pythohn.pdfRithik Kumar Singh codealpha pythohn.pdf
Rithik Kumar Singh codealpha pythohn.pdf
 
DATA ANALYSIS using various data sets like shoping data set etc
DATA ANALYSIS using various data sets like shoping data set etcDATA ANALYSIS using various data sets like shoping data set etc
DATA ANALYSIS using various data sets like shoping data set etc
 
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
 
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
 
Digital Indonesia Report 2024 by We Are Social .pdf
Digital Indonesia Report 2024 by We Are Social .pdfDigital Indonesia Report 2024 by We Are Social .pdf
Digital Indonesia Report 2024 by We Are Social .pdf
 
Role of Consumer Insights in business transformation
Role of Consumer Insights in business transformationRole of Consumer Insights in business transformation
Role of Consumer Insights in business transformation
 
World Economic Forum Metaverse Ecosystem By Utpal Chakraborty.pdf
World Economic Forum Metaverse Ecosystem By Utpal Chakraborty.pdfWorld Economic Forum Metaverse Ecosystem By Utpal Chakraborty.pdf
World Economic Forum Metaverse Ecosystem By Utpal Chakraborty.pdf
 
Bank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis ProjectBank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis Project
 

Machine Learning Model Training and Deployment with Google Cloud Platform

  • 1. Training and deploying machine learning models with GCP
    Maciej Pieńkosz
    Data Science Summit 2020
  • 2. What we do at Sotrender
  • 3. Our models
    1. Sentiment
    2. Hate speech
    3. Topic modelling
    4. Keyphrase extractor
    5. NER (brands and products)
    6. Image Tagger
    7. Text Extractor
    8. Logo Detector
    9. Post Classifier
    10. …
  • 4. ML models lifecycle
    1. Planning and project setup
    2. Data collection and labeling
    3. Modeling and exploration
    4. Model training and refinement
    5. Testing and evaluation
    6. Model deployment
    7. Ongoing model maintenance and monitoring
    https://www.jeremyjordan.me/ml-projects-guide/
  • 5. Modeling with AI Notebooks
    1. We use Google Cloud Platform as our cloud provider
    2. AI Platform Notebooks is used for initial data exploration and modeling
    3. To start, we favor faster, simpler model architectures that can be easily built, validated, iterated on and eventually deployed (usually on CPU)
    4. Experiment tracking: MLflow
    https://databricks.com/blog/2018/06/05/introducing-mlflow-an-open-source-machine-learning-platform.html
    https://cloud.google.com/ai-platform-notebooks?hl=id
  • 6. Structuring training code
    • Notebook disadvantages:
      – You pay for the whole time the notebook is running
      – Code quality is usually lower
      – Hard to parametrize, unit test, and review
    • After the initial experimentation phase, we try to give more structure to the model training code:
      – Refactor the codebase into Python packages and modules and move it to a git repository (GitLab)
      – Add tests (more on this later)
      – Wrap the code in a Docker container
      – Use the dedicated AI Platform Training service to train in the cloud
    https://www.jeremyjordan.me/ml-projects-guide/
  • 7. AI Platform Training with custom containers
    • Advantages:
      – Develop locally, train in the cloud
      – Pay only for the time of training
      – Broad configuration options
      – Job statuses and logs for historical runs are available in the dashboard
      – Easy integration with hyperparameter tuning
    [Figures: training job Dockerfile; cloud training script]
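To make "develop locally, train in the cloud" concrete: a containerised training job usually exposes its settings as command-line flags, so the same entry point can run on a laptop or inside the training container, and a tuning service can pass trial values as flags. The slides show the actual Dockerfile and training script only as images, so the sketch below is an assumption about how such an entry point might look. Every flag name (`--data-dir`, `--model-dir`, `--learning-rate`, `--epochs`) and every path is hypothetical.

```python
import argparse


def parse_args(argv=None):
    """Parse training settings. Every flag name here is hypothetical."""
    parser = argparse.ArgumentParser(description="Train a model (sketch).")
    parser.add_argument("--data-dir", required=True,
                        help="Location of the training data (local path or bucket URI).")
    parser.add_argument("--model-dir", required=True,
                        help="Location where the trained model is written.")
    parser.add_argument("--learning-rate", type=float, default=0.001)
    parser.add_argument("--epochs", type=int, default=10)
    return parser.parse_args(argv)


def main(argv=None):
    args = parse_args(argv)
    # Placeholder for the real training loop, which the slides do not show.
    return {
        "data_dir": args.data_dir,
        "model_dir": args.model_dir,
        "learning_rate": args.learning_rate,
        "epochs": args.epochs,
    }


# Local run with explicit arguments; in a container the same flags
# would arrive through the command line.
config = main(["--data-dir", "data/", "--model-dir", "models/", "--epochs", "3"])
print(config)
```

Keeping all configuration in flags means the container image does not need to be rebuilt to try a different learning rate or dataset location.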
  • 8. Google Cloud Storage for Models and Datasets
    • We use Google Cloud Storage as the primary store for models and datasets
    • One bucket per model
    • We follow a unified bucket and directory structure, the same for every model:
      – Raw data
      – Combined datasets, with predefined splits
      – Model files
    • Documentation in the Knowledge Base (Confluence)
    • Dedicated systems such as DVC or Quilt are an alternative
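A unified layout is easiest to keep consistent when paths are built by one small helper rather than typed by hand. The sketch below illustrates that idea only. The bucket naming rule (`gs://ml-<model>`), the directory names (`raw`, `datasets`, `models`) and the file names are all hypothetical, since the talk does not state the actual structure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelBucket:
    """Builds object paths for one model's bucket (assumed example layout)."""
    model_name: str

    @property
    def bucket(self) -> str:
        # Hypothetical naming rule: one bucket per model.
        return f"gs://ml-{self.model_name}"

    def raw_data(self, filename: str) -> str:
        return f"{self.bucket}/raw/{filename}"

    def dataset(self, version: str, split: str) -> str:
        # Predefined splits only, so every model uses the same names.
        if split not in ("train", "dev", "test"):
            raise ValueError(f"unknown split: {split}")
        return f"{self.bucket}/datasets/{version}/{split}.csv"

    def model_file(self, version: str) -> str:
        return f"{self.bucket}/models/{version}/model.bin"


sentiment = ModelBucket("sentiment")
print(sentiment.dataset("v1", "train"))
print(sentiment.model_file("v1"))
```

Because every model goes through the same helper, training and serving code can locate data and model files without per-model special cases.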
  • 9. Additional training tips
    • Consider having two validation sets, training-dev and test-dev, to distinguish between overfitting errors and distribution shift
    • Establish human performance for your task
    • Evaluate your model performance on important data slices
    • Do hyperparameter tuning; utilize open source packages, e.g. hyperopt
    • Develop a systematic way of analyzing model errors
    Recommended resources:
    • https://www.coursera.org/learn/machine-learning-projects
    • https://www.deeplearning.ai/machine-learning-yearning/
    https://towardsdatascience.com/some-strategies-for-machine-learning-projects-5f2f32c34635
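Two of the tips above reduce to simple arithmetic. Assuming training-dev is drawn from the training distribution and test-dev from the target distribution, the gap between training error and training-dev error points to overfitting, and the gap between training-dev and test-dev error points to distribution shift. The sketch below illustrates that reading together with per-slice accuracy. The function names and all numbers are made up for the example.

```python
from collections import defaultdict


def diagnose(train_error, train_dev_error, test_dev_error):
    """Split the error increase into an overfitting part and a shift part."""
    return {
        "overfitting_gap": round(train_dev_error - train_error, 4),
        "distribution_shift_gap": round(test_dev_error - train_dev_error, 4),
    }


def accuracy_by_slice(examples):
    """examples: iterable of (slice_name, true_label, predicted_label)."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for slice_name, y_true, y_pred in examples:
        total[slice_name] += 1
        correct[slice_name] += int(y_true == y_pred)
    return {name: correct[name] / total[name] for name in total}


# Illustrative numbers only.
print(diagnose(train_error=0.02, train_dev_error=0.08, test_dev_error=0.20))
print(accuracy_by_slice([
    ("short_posts", "pos", "pos"),
    ("short_posts", "neg", "pos"),
    ("long_posts", "neg", "neg"),
]))
```

In this made-up case the larger gap is between training-dev and test-dev, which would suggest looking at data mismatch before adding regularization.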
  • 10. Model deployment
    • Your options:
      – Online
      – Batch (offline)
    • Our approach is to deploy models as services
      – Easy to integrate
      – Easy to use by other teams
    • We serve them as REST services with Flask (or, most recently, FastAPI)
    • We wrap them in Docker containers so they can be easily deployed to the cloud and served with Cloud Run
    [Figure: online inference vs. batch inference]
    https://mlinproduction.com/batch-inference-vs-online-inference/
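The service code itself is not shown in the slides. As a dependency-free illustration of "model as a REST service", the sketch below uses the standard library's WSGI interface instead of Flask or FastAPI (Flask applications are themselves WSGI applications). The `/predict` route, the JSON shape and the toy `predict` function are all assumptions made for the example.

```python
import io
import json


def predict(text):
    """Hypothetical stand-in for a real model."""
    return {"label": "positive" if "great" in text.lower() else "neutral"}


def app(environ, start_response):
    """Minimal WSGI application exposing POST /predict."""
    is_predict = (environ.get("REQUEST_METHOD") == "POST"
                  and environ.get("PATH_INFO") == "/predict")
    if not is_predict:
        start_response("404 Not Found", [("Content-Type", "application/json")])
        return [json.dumps({"error": "not found"}).encode("utf-8")]
    length = int(environ.get("CONTENT_LENGTH") or 0)
    payload = json.loads(environ["wsgi.input"].read(length) or b"{}")
    body = json.dumps(predict(payload.get("text", ""))).encode("utf-8")
    start_response("200 OK", [("Content-Type", "application/json"),
                              ("Content-Length", str(len(body)))])
    return [body]


def call(wsgi_app, path, payload):
    """Invoke the WSGI app in-process, without opening a socket."""
    raw = json.dumps(payload).encode("utf-8")
    environ = {"REQUEST_METHOD": "POST", "PATH_INFO": path,
               "CONTENT_LENGTH": str(len(raw)), "wsgi.input": io.BytesIO(raw)}
    status = []
    chunks = wsgi_app(environ, lambda s, headers: status.append(s))
    return status[0], json.loads(b"".join(chunks))


print(call(app, "/predict", {"text": "Great product"}))
```

To serve it over HTTP for a local check, `wsgiref.simple_server.make_server("", 8080, app).serve_forever()` would do. A production container would normally use a proper application server instead.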
  • 11. Cloud Deployment: Cloud Run
    • We use Cloud Run to deploy our model services
    • Cloud Build for delegating the build process to GCP
    • GCP has a dedicated service for serving models, AI Platform Prediction, but we use Cloud Run:
      – It is more flexible for us; we can set up any environment and add any dependencies
      – AI Platform Prediction has limits on model size
      – We can add extra endpoints to services (e.g. /explain)
    [Figures: service Dockerfile; cloud deployment script]
  • 12. Cloud Run (cont.)
    • Useful features out of the box:
      – Autoscaling
      – Multiple revisions (versions), easy rollback
      – Traffic management
      – Multiple namespaces (dev, prod)
      – Resource monitoring
  • 13. Delivery pipeline automation (CI/CD)
    • Implemented in GitLab CI/CD
    [Figure: pipeline diagram with the stages push, download files, build image, run tests, run static analysis, push image to registry, code review, canary rollout, deploy]
  • 14. Testing and evaluation
    • Unit and integration tests for:
      – Input pipelines
      – Preprocessing functions
    • “Regression” tests for:
      – Performance on validation data
      – Predictions on some important, hand-picked examples
      – Performance on data slices
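The two kinds of tests listed above can be sketched with the standard `unittest` module. Everything model-specific here is a hypothetical stand-in: `normalize` plays the role of a preprocessing function, `predict_sentiment` the role of a trained model, and the golden examples, validation set and accuracy threshold are invented for the illustration.

```python
import re
import unittest


def normalize(text):
    """Hypothetical preprocessing step: collapse whitespace and lowercase."""
    return re.sub(r"\s+", " ", text).strip().lower()


def predict_sentiment(text):
    """Hypothetical stand-in for a trained model."""
    return "negative" if "bad" in normalize(text) else "positive"


# Hand-picked examples whose predictions should never regress.
GOLDEN_EXAMPLES = [("This is BAD", "negative"), ("Love it", "positive")]
VALIDATION_SET = [("bad service", "negative"), ("great", "positive"),
                  ("so bad", "negative")]
MIN_VALIDATION_ACCURACY = 0.9  # assumed threshold


class PreprocessingTest(unittest.TestCase):
    def test_normalize_collapses_whitespace(self):
        self.assertEqual(normalize("  Hello \n World "), "hello world")


class RegressionTest(unittest.TestCase):
    def test_golden_examples(self):
        for text, expected in GOLDEN_EXAMPLES:
            with self.subTest(text=text):
                self.assertEqual(predict_sentiment(text), expected)

    def test_validation_accuracy(self):
        hits = sum(predict_sentiment(t) == y for t, y in VALIDATION_SET)
        self.assertGreaterEqual(hits / len(VALIDATION_SET),
                                MIN_VALIDATION_ACCURACY)


def run_tests():
    """Run both test classes and return the unittest result object."""
    loader = unittest.defaultTestLoader
    suite = unittest.TestSuite()
    suite.addTests(loader.loadTestsFromTestCase(PreprocessingTest))
    suite.addTests(loader.loadTestsFromTestCase(RegressionTest))
    return unittest.TextTestRunner(verbosity=0).run(suite)


result = run_tests()
```

In a CI pipeline the regression tests would load the real model and a fixed validation file. A drop below the threshold then fails the build before the image is pushed.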
  • 15. Monitoring
    • System-level metrics:
      – Resource consumption (RAM, CPU), health checks, status codes, latency, etc.
    • Data-level metrics:
      – Prediction distributions, input data distributions
      – System performance against real-time labels (collected automatically or manually)
    https://mlinproduction.com/
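One simple way to monitor prediction distributions is to compare the share of each predicted label in a recent window with a reference window. The sketch below uses total variation distance for that comparison. The choice of metric, the alert threshold and the sample data are assumptions for illustration, not something the talk prescribes.

```python
from collections import Counter


def label_distribution(labels):
    """Return the relative frequency of each label."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no labels to summarise")
    return {label: n / total for label, n in counts.items()}


def total_variation_distance(reference, current):
    """Half the sum of absolute differences between two label distributions."""
    p = label_distribution(reference)
    q = label_distribution(current)
    labels = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in labels)


ALERT_THRESHOLD = 0.2  # hypothetical value; tune per model

# Illustrative data: predictions at release time vs. a recent window.
reference = ["pos"] * 50 + ["neg"] * 50
current = ["pos"] * 80 + ["neg"] * 20

drift = total_variation_distance(reference, current)
if drift > ALERT_THRESHOLD:
    print(f"prediction drift detected: {drift:.2f}")
```

The same function works for any categorical signal, for example the distribution of detected languages in the input data.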
  • 16. Streamlit
    • https://www.streamlit.io/
    • An easy tool for creating simple web data products directly in Python
    • You can use it to create demos, share your work, showcase your models' behaviour, and debug
    • Very intuitive, no web skills required
    https://towardsdatascience.com/coding-ml-tools-like-you-code-ml-models-ddba3357eace