Unleashing the Power of Machine Learning with Azure AutoML


This session shows how to quickly implement a Machine Learning model using Azure Automated ML and the Python SDK. The demo also covers the new toolkits developed by Microsoft that make it easy both to evaluate the performance of the prototyped model and to explain its behavior to executives and stakeholders. (https://datasaturdays.com/events/datasaturday0001.html)


#datasatpn
February 27th, 2021
Data Saturday #1
Unleashing the Power of Machine Learning
Prototyping using Azure AutoML and Python
Luca Zavarella

Who Am I
Luca Zavarella
Working with SQL Server since 2007 (BI)
Microsoft MVP for Artificial Intelligence
Microsoft Certified: Azure Data Scientist Associate
DAMAG Founder, ODSC Ambassador
AI & ML Practice Director @
Email: lzavarella@lucient.com
Twitter: @lucazav
LinkedIn: http://it.linkedin.com/in/lucazavarella
Blog: medium.com/@lucazav

Agenda
• ML Prototyping
• Machine Learning Process
• Azure AutoML
• Overview
• Validation Types
• Algorithms
• Featurization
• Data Guardrails
• Demo
• Conclusion

Steps To Build a Machine Learning Solution
1. Problem Framing
2. Get/Prepare Data
3. Develop Model
   3.1 Analysis/Metric definition
   3.2 Feature Engineering
   3.3 Model Training
   3.4 Parameter Tuning
   3.5 Evaluation
4. Deploy Model
5. Evaluate / Track Performance

What a Machine Learning Project Really Is
A Machine Learning project can be viewed as…
…a research and development activity
…later transformed into a Data Engineering project.
Icons by Vectors Market from the Noun Project
In short, a complete ML project could take months to implement!

Quoting an ML Project
Customer: Could you quote for an ML Proof of Concept?
You: … (Data Exploration, Feature Selection)

The Suggested POC Process
You
Customer
STEP 1 (2-3 days)
1. Define target
2. Understand what data is available
3. Define the schema of the input ML dataset
STEP 2 (X days)
1. Collect all the data
2. Clean the data
3. Provide data with the defined schema
STEP 3 (2-3 days)
1. All the ML magic stuff!
STEP 4 (1-2 days)
1. Documentation
2. Presentation of results
Fast tool for prototyping

Validation Types: Train-Validation Split

Validation Types: Auto
• For datasets larger than 20,000 rows, 10% of the initial training data is held out as the validation set, which is then used to calculate metrics.
• For datasets smaller than 20,000 rows, cross-validation is applied: 10 folds if the dataset has fewer than 1,000 rows; 3 folds otherwise.
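
The rule above can be written down as a small decision function. This is only a sketch of the rule as stated on the slide, not the code Azure AutoML runs; the function name `auto_validation_strategy` and the returned dictionary keys are hypothetical, and the handling of exactly 20,000 rows (which the slide does not specify) is an assumption.

```python
def auto_validation_strategy(n_rows: int) -> dict:
    """Pick a validation strategy from the row count, following the slide's rule."""
    if n_rows <= 0:
        raise ValueError("n_rows must be positive")
    if n_rows > 20_000:
        # Large dataset: hold out 10% of the training data for validation.
        return {"method": "train_validation_split", "validation_size": 0.10}
    # Assumption: a dataset with exactly 20,000 rows falls into the
    # cross-validation case, since the slide leaves this boundary open.
    n_folds = 10 if n_rows < 1_000 else 3
    return {"method": "cross_validation", "n_folds": n_folds}


for n in (500, 5_000, 50_000):
    print(n, auto_validation_strategy(n))
```

Smaller datasets get more folds because each fold's validation set would otherwise contain too few rows to give a stable metric.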

Validation Types: Rolling Origin Cross-Validation
Only for time-series forecasting
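
As a rough illustration of why this scheme suits time series, the sketch below generates rolling-origin splits with an expanding training window: each fold trains on everything before an origin point and validates on the next `horizon` observations, so no fold ever trains on the future. The function name and its parameters are hypothetical, and this is the general technique rather than Azure AutoML's own implementation.

```python
from typing import Iterator, List, Tuple


def rolling_origin_splits(
    n_samples: int, n_folds: int, horizon: int
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, validation_indices) for time-ordered data."""
    first_origin = n_samples - n_folds * horizon
    if first_origin <= 0:
        raise ValueError("not enough samples for the requested folds and horizon")
    for k in range(n_folds):
        origin = first_origin + k * horizon
        train = list(range(origin))                      # everything before the origin
        validation = list(range(origin, origin + horizon))  # the next `horizon` points
        yield train, validation


for train_idx, val_idx in rolling_origin_splits(n_samples=10, n_folds=3, horizon=2):
    print("train:", train_idx, "validation:", val_idx)
```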

Ensembling Models: Voting
Classification
• Hard Voting. Predict the class with the largest sum of votes from models.
• Soft Voting. Predict the class with the largest summed probability from models.
Regression
• The prediction is the average of the predictions of the base regressors.
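
The three voting rules can be sketched in plain Python. The helper names (`hard_vote`, `soft_vote`, `average_regression`) and the toy inputs are hypothetical and only illustrate the definitions; they are not part of any Azure AutoML API.

```python
from collections import Counter, defaultdict
from typing import Dict, Hashable, List


def hard_vote(predicted_labels: List[Hashable]) -> Hashable:
    """Return the class predicted by the largest number of models."""
    return Counter(predicted_labels).most_common(1)[0][0]


def soft_vote(predicted_probabilities: List[Dict[Hashable, float]]) -> Hashable:
    """Return the class with the largest probability summed over models."""
    totals: Dict[Hashable, float] = defaultdict(float)
    for probabilities in predicted_probabilities:
        for label, p in probabilities.items():
            totals[label] += p
    return max(totals, key=totals.get)


def average_regression(predictions: List[float]) -> float:
    """Return the mean of the base regressors' predictions."""
    return sum(predictions) / len(predictions)


# Three toy classifiers: two lean weakly towards "b", one is confident in "a".
probabilities = [
    {"a": 0.9, "b": 0.1},
    {"a": 0.4, "b": 0.6},
    {"a": 0.4, "b": 0.6},
]
labels = [max(p, key=p.get) for p in probabilities]  # each model's own top class

print(hard_vote(labels))                    # majority of labels
print(soft_vote(probabilities))             # largest summed probability
print(average_regression([1.0, 2.0, 3.0]))  # mean of regressors
```

In this toy input the two rules disagree: hard voting follows the two weak votes for "b", while soft voting lets the single confident model tip the result to "a".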

Featurization: Scaling and Normalization

Data Guardrails in the UI
Highly imbalanced: the ratio of the samples in the least populated class to the samples in the most populated class is less than 20%.
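
The imbalance check described above is easy to reproduce on your own labels before submitting a run. The sketch below assumes the 20% threshold from the slide; the function names are hypothetical and this is not the guardrail code Azure AutoML itself executes.

```python
from collections import Counter
from typing import Hashable, Sequence


def imbalance_ratio(labels: Sequence[Hashable]) -> float:
    """Ratio of the least populated class size to the most populated class size."""
    if not labels:
        raise ValueError("labels must not be empty")
    counts = Counter(labels).values()
    return min(counts) / max(counts)


def is_highly_imbalanced(labels: Sequence[Hashable], threshold: float = 0.20) -> bool:
    """Flag the dataset when the ratio falls below the threshold (20% on the slide)."""
    return imbalance_ratio(labels) < threshold


skewed = ["no_claim"] * 90 + ["claim"] * 10
balanced = ["no_claim"] * 60 + ["claim"] * 40

print(is_highly_imbalanced(skewed))    # ratio 10/90
print(is_highly_imbalanced(balanced))  # ratio 40/60
```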

Azure AutoML Strengths
• A Python-based technology
• Easy integration with your custom pipelines
• Data normalizations and basic transformations included
• Complex featurization included
• Also NLP feature engineering
• Ensemble models out of the box
• Automatic model explanations

Future of Azure AutoML
• Only basic imputers implemented
• Stratified cross-validation not implemented out of the box
• Highly imbalanced datasets not automatically fixed
• Explicit feature selection step missing
• Neural Networks still not included in training algorithms, except for ForecastTCN for time-series forecasting

References
• Probabilistic Matrix Factorization for Automated Machine Learning (https://arxiv.org/abs/1705.05355)
• What is automated machine learning (AutoML)? (https://docs.microsoft.com/en-us/azure/machine-learning/concept-automated-ml)
• A Review of Azure Automated Machine Learning (AutoML) (https://medium.com/microsoftazure/a-review-of-azure-automated-machine-learning-automl-5d2f98512406)

Unleashing the Power of Machine Learning with Azure AutoML

1. Unleashing the Power of Machine Learning: Prototyping using Azure AutoML and Python
2. Who Am I
3. Agenda
4. Machine Learning Process
5. Components of a BI Project ONE WAY
6. Steps To Build a Machine Learning Solution
7. What a Machine Learning Project Really Is
8. Quoting a ML Project
9. The Suggested POC Process
10. AutoML Overview
11. Azure AutoML Workflow
12. AutoML Task Types
13. Primary Metrics
14. AutoML Validation Types
15. Validation Types: Train-Validation Split
16. Validation Types: KFold vs MonteCarlo
17. Validation Types: Auto
18. Validation Types: Rolling Origin Cross-Validation
19. AutoML Algorithms
20. Supported Algorithms
21. Ensembling Models: Stacking
22. Ensembling Models: Voting
23. Ensembling Models in AutoML
24. Featurization in AutoML
25. Featurization: Scaling and Normalization
26. Featurization: Feature Engineering
27. Data Guardrails in AutoML
28. Data Guardrails in the UI
29. Data Guardrails Using Python SDK
30. DEMO: Auto Insurance Claims Data
31. Conclusion
32. Azure AutoML Strengths
33. Future of Azure AutoML
34. References
35. Thank you!
