SlideShare uma empresa Scribd logo
1 de 21
Machine Learning
101
Fred Verheul
What we won’t cover…
• Deep learning / Neural Networks
• Specifics of ML-algorithms
• Tools / Libraries / Code
• SAP Products, like HANA / Predictive Analytics / Vora / …
• Ethics, algorithmic transparency & fairness
• Hardware
2
Examples: Recommender systems
3
Examples, continued…
4
SPAM-
filtering
Handwriting
recognition
ML in the news: Deepmind’s AlphaGo
5
6
Machine Learning
"Field of study that gives computers the ability to learn
without being explicitly programmed” (Arthur Samuel, 1959)
7
What is Machine Learning?
8
Computer
Computer
Traditional Programming
Machine Learning
Data
Data
Program
Output
Program
Output
Sweet spot for Machine Learning
• It’s impossible to write down the rules in code:
• Too many rules
• Too many factors influencing the rules
• Too finely tuned
• We just don’t know the rules (image recognition)
• Lots of labeled data (examples) available (e.g. historical data)
9
Basic Machine Learning ‘workflow’
10
Feature
Vectors
Training
data
Labels
Machine
Learning
Algorithm
Feature
Vectors
New data Prediction
Training Phase
Operational Phase
Predictive
Model
Training Phase in more detail
11
Raw data
Data
preparation Feature
Vectors
Training
Data
Test
data
Model Building
(by ML
algorithm)
Model
Evaluation
Predictive
Model
Feedback loop
data cleansing
data transformation
normalization
feature extraction
aka
‘learning’
CRISP-DM: data mining process
12
ML
important
ML
important
Examples of ML tasks
Supervised learning
Regression 
target is numeric
Classification 
target is categorical
13
Unsupervised learning
Clustering
Dimensionality
reduction
Modeling: so many algorithms…
14
ML Algorithms: by Representation
Collection of candidate models/programs, aka hypothesis space
15
Decision trees
Instance-based
Neural networks
Model ensembles
ML Algorithms: by Evaluation
Evaluation: Quality measure for a model
16
Regression
Example metric: Root Mean Squared Error
RMSE =
Binary classification: confusion matrix
Accuracy: 8 + 971 -> 97,9%
Example: medical test
for a disease
Positive Negative
P
True
positives
TP
False
Negatives
FN
N
False
positives
FP
True
Negatives
TN
True
Class
Predicted class
Accuracy: Better evaluation metrics:
• Precision: 8 / (8 + 19)
• Recall: 8 / (8 + 2)
Optimization: how the algorithm ‘learns’, depends on representation and
evaluation
ML Algorithms: by Optimization
17
Greedy Search,
ex. of
combinatorial
optimization
Gradient Descent (or in general: Convex Optimization)
Linear Programming (or in general:
Constrained/Nonlinear Optimization)
Training error vs test error
18
Data Science for Business
• Focuses more on general principles
than specific algorithms
• Not math-heavy, does contain some
math
• O’Reilly link:
http://shop.oreilly.com/product/063692
0028918.do
• Book website: http://data-science-for-
biz.com/DSB/Home.html
19
Take-aways
• Goal of ML: generalize from training data (not optimization!!)
• Part of ‘Data Mining Process’, not a goal in and of itself
• No magic! Just some clever algorithms…
• Increasingly important non-technical aspects:
• Ethics
• Algorithmic transparency
20
Thank You
www.soapeople.com
info@soapeople.com
@SOAPEOPLE
Fred Verheul
Big Data Consultant
+31 6 3919 2986
fred.verheul@soapeople.com
@fredverheul

Mais conteúdo relacionado

Mais procurados

Driver vs Driverless AI - Mark Landry, Competitive Data Scientist and Product...
Driver vs Driverless AI - Mark Landry, Competitive Data Scientist and Product...Driver vs Driverless AI - Mark Landry, Competitive Data Scientist and Product...
Driver vs Driverless AI - Mark Landry, Competitive Data Scientist and Product...
Sri Ambati
 

Mais procurados (16)

The Evolution of AutoML
The Evolution of AutoMLThe Evolution of AutoML
The Evolution of AutoML
 
ETL & Machine Learning
ETL & Machine LearningETL & Machine Learning
ETL & Machine Learning
 
Visualising the world of competitive programming with Python (Codeforces)
Visualising the world of competitive programming with Python (Codeforces)Visualising the world of competitive programming with Python (Codeforces)
Visualising the world of competitive programming with Python (Codeforces)
 
How is research conducted in my field
How is research conducted in my fieldHow is research conducted in my field
How is research conducted in my field
 
OpenML NeurIPS2018
OpenML NeurIPS2018OpenML NeurIPS2018
OpenML NeurIPS2018
 
H2O World - Ensembles with Erin LeDell
H2O World - Ensembles with Erin LeDellH2O World - Ensembles with Erin LeDell
H2O World - Ensembles with Erin LeDell
 
Microsoft Introduction to Automated Machine Learning
Microsoft Introduction to Automated Machine LearningMicrosoft Introduction to Automated Machine Learning
Microsoft Introduction to Automated Machine Learning
 
AzureML – zero to hero
AzureML – zero to heroAzureML – zero to hero
AzureML – zero to hero
 
AutoML - The Future of AI
AutoML - The Future of AIAutoML - The Future of AI
AutoML - The Future of AI
 
Ideas spracklen-final
Ideas spracklen-finalIdeas spracklen-final
Ideas spracklen-final
 
Pipeline oriented data analytics
Pipeline oriented data analyticsPipeline oriented data analytics
Pipeline oriented data analytics
 
Driver vs Driverless AI - Mark Landry, Competitive Data Scientist and Product...
Driver vs Driverless AI - Mark Landry, Competitive Data Scientist and Product...Driver vs Driverless AI - Mark Landry, Competitive Data Scientist and Product...
Driver vs Driverless AI - Mark Landry, Competitive Data Scientist and Product...
 
Part 3 Machine Learnning
Part 3 Machine LearnningPart 3 Machine Learnning
Part 3 Machine Learnning
 
The Quest for an Open Source Data Science Platform
 The Quest for an Open Source Data Science Platform The Quest for an Open Source Data Science Platform
The Quest for an Open Source Data Science Platform
 
Robust and declarative machine learning pipelines for predictive buying at Ba...
Robust and declarative machine learning pipelines for predictive buying at Ba...Robust and declarative machine learning pipelines for predictive buying at Ba...
Robust and declarative machine learning pipelines for predictive buying at Ba...
 
Genetic Algorithm Projects Research Ideas
Genetic Algorithm Projects Research IdeasGenetic Algorithm Projects Research Ideas
Genetic Algorithm Projects Research Ideas
 

Destaque

Animales en peligro de extinción
Animales en peligro de extinciónAnimales en peligro de extinción
Animales en peligro de extinción
cawi_007_0909
 

Destaque (18)

Pasifloras
PasiflorasPasifloras
Pasifloras
 
Qué cambiarías en la educación en
Qué cambiarías en la educación enQué cambiarías en la educación en
Qué cambiarías en la educación en
 
Animales en peligro de extinción
Animales en peligro de extinciónAnimales en peligro de extinción
Animales en peligro de extinción
 
Tbjee Syllabi 2019 - Tripura Jee
Tbjee Syllabi 2019 - Tripura Jee Tbjee Syllabi 2019 - Tripura Jee
Tbjee Syllabi 2019 - Tripura Jee
 
Diseño sistema de sonido hifi con reduccion de sonido
Diseño sistema de sonido hifi con reduccion de sonidoDiseño sistema de sonido hifi con reduccion de sonido
Diseño sistema de sonido hifi con reduccion de sonido
 
El Correo Electronico
El Correo ElectronicoEl Correo Electronico
El Correo Electronico
 
Platanitoren besoa
Platanitoren besoaPlatanitoren besoa
Platanitoren besoa
 
Pga 15 16 ceip san agustín definitiva
Pga 15 16 ceip san agustín definitivaPga 15 16 ceip san agustín definitiva
Pga 15 16 ceip san agustín definitiva
 
SAP HANA SPS10- Predictive Analysis Library and Application Function Modeler
SAP HANA SPS10- Predictive Analysis Library and Application Function ModelerSAP HANA SPS10- Predictive Analysis Library and Application Function Modeler
SAP HANA SPS10- Predictive Analysis Library and Application Function Modeler
 
What's New in SAP HANA SPS 11 Predictive
What's New in SAP HANA SPS 11 PredictiveWhat's New in SAP HANA SPS 11 Predictive
What's New in SAP HANA SPS 11 Predictive
 
Brood en vis
Brood en visBrood en vis
Brood en vis
 
Sap Executive Keynote Dr. Wieland Schreiner, EVP - SAP AG
Sap Executive Keynote   Dr. Wieland Schreiner, EVP - SAP AGSap Executive Keynote   Dr. Wieland Schreiner, EVP - SAP AG
Sap Executive Keynote Dr. Wieland Schreiner, EVP - SAP AG
 
Machine Learning, hype or hit?
Machine Learning, hype or hit?Machine Learning, hype or hit?
Machine Learning, hype or hit?
 
SAP Marketing Runs Hybris Marketing By Andreas Starke
SAP Marketing Runs Hybris Marketing By Andreas StarkeSAP Marketing Runs Hybris Marketing By Andreas Starke
SAP Marketing Runs Hybris Marketing By Andreas Starke
 
Programa anual 2016_pfrh
Programa anual 2016_pfrhPrograma anual 2016_pfrh
Programa anual 2016_pfrh
 
Real-Time Supply Chain Analytics with Machine Learning, Kafka, and Spark
Real-Time Supply Chain Analytics with Machine Learning, Kafka, and SparkReal-Time Supply Chain Analytics with Machine Learning, Kafka, and Spark
Real-Time Supply Chain Analytics with Machine Learning, Kafka, and Spark
 
#asksap Analytics Innovations Community Call - Take Action in 2017 with Innov...
#asksap Analytics Innovations Community Call - Take Action in 2017 with Innov...#asksap Analytics Innovations Community Call - Take Action in 2017 with Innov...
#asksap Analytics Innovations Community Call - Take Action in 2017 with Innov...
 
Big Data Analytics for the Industrial Internet of Things
Big Data Analytics for the Industrial Internet of ThingsBig Data Analytics for the Industrial Internet of Things
Big Data Analytics for the Industrial Internet of Things
 

Semelhante a Machine learning 101 sit hvr

Machine Learning for automated diagnosis of distributed ...AE
Machine Learning for automated diagnosis of distributed ...AEMachine Learning for automated diagnosis of distributed ...AE
Machine Learning for automated diagnosis of distributed ...AE
butest
 
Machine Learning - Lecture2.pptx
Machine Learning - Lecture2.pptxMachine Learning - Lecture2.pptx
Machine Learning - Lecture2.pptx
NsitTech
 
It’s all about me_ From big data models to personalized experience Presentation
It’s all about me_ From big data models to personalized experience PresentationIt’s all about me_ From big data models to personalized experience Presentation
It’s all about me_ From big data models to personalized experience Presentation
Yao H. Morin, Ph.D.
 
MachineLearning Seminar PPT.pptx
MachineLearning Seminar PPT.pptxMachineLearning Seminar PPT.pptx
MachineLearning Seminar PPT.pptx
AmanDixit74
 

Semelhante a Machine learning 101 sit hvr (20)

Data analytcis-first-steps
Data analytcis-first-stepsData analytcis-first-steps
Data analytcis-first-steps
 
Data Analytics, Machine Learning, and HPC in Today’s Changing Application Env...
Data Analytics, Machine Learning, and HPC in Today’s Changing Application Env...Data Analytics, Machine Learning, and HPC in Today’s Changing Application Env...
Data Analytics, Machine Learning, and HPC in Today’s Changing Application Env...
 
Machine learning
Machine learningMachine learning
Machine learning
 
1. Demystifying ML.pdf
1. Demystifying ML.pdf1. Demystifying ML.pdf
1. Demystifying ML.pdf
 
Machine Learning in NutShell
Machine Learning in NutShellMachine Learning in NutShell
Machine Learning in NutShell
 
artificggggggggggggggialintelligence.pdf
artificggggggggggggggialintelligence.pdfartificggggggggggggggialintelligence.pdf
artificggggggggggggggialintelligence.pdf
 
Machine Learning Contents.pptx
Machine Learning Contents.pptxMachine Learning Contents.pptx
Machine Learning Contents.pptx
 
Machine Learning for automated diagnosis of distributed ...AE
Machine Learning for automated diagnosis of distributed ...AEMachine Learning for automated diagnosis of distributed ...AE
Machine Learning for automated diagnosis of distributed ...AE
 
ML_Module_1.pdf
ML_Module_1.pdfML_Module_1.pdf
ML_Module_1.pdf
 
Machine Learning - Lecture2.pptx
Machine Learning - Lecture2.pptxMachine Learning - Lecture2.pptx
Machine Learning - Lecture2.pptx
 
It’s all about me_ From big data models to personalized experience Presentation
It’s all about me_ From big data models to personalized experience PresentationIt’s all about me_ From big data models to personalized experience Presentation
It’s all about me_ From big data models to personalized experience Presentation
 
MachineLearning Seminar PPT.pptx
MachineLearning Seminar PPT.pptxMachineLearning Seminar PPT.pptx
MachineLearning Seminar PPT.pptx
 
Week 2 Sentiment Analysis Using Machine Learning
Week 2 Sentiment Analysis Using Machine Learning Week 2 Sentiment Analysis Using Machine Learning
Week 2 Sentiment Analysis Using Machine Learning
 
Machine_Learning.pptx
Machine_Learning.pptxMachine_Learning.pptx
Machine_Learning.pptx
 
Drifting Away: Testing ML Models in Production
Drifting Away: Testing ML Models in ProductionDrifting Away: Testing ML Models in Production
Drifting Away: Testing ML Models in Production
 
machine learning
machine learningmachine learning
machine learning
 
Barga Data Science lecture 2
Barga Data Science lecture 2Barga Data Science lecture 2
Barga Data Science lecture 2
 
Machine learning
Machine learning Machine learning
Machine learning
 
An introduction to machine learning and statistics
An introduction to machine learning and statisticsAn introduction to machine learning and statistics
An introduction to machine learning and statistics
 
Machine Learning an Research Overview
Machine Learning an Research OverviewMachine Learning an Research Overview
Machine Learning an Research Overview
 

Último

CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...
%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...
%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...
masabamasaba
 
%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...
masabamasaba
 
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
Health
 

Último (20)

CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
AI & Machine Learning Presentation Template
AI & Machine Learning Presentation TemplateAI & Machine Learning Presentation Template
AI & Machine Learning Presentation Template
 
Architecture decision records - How not to get lost in the past
Architecture decision records - How not to get lost in the pastArchitecture decision records - How not to get lost in the past
Architecture decision records - How not to get lost in the past
 
Direct Style Effect Systems - The Print[A] Example - A Comprehension Aid
Direct Style Effect Systems -The Print[A] Example- A Comprehension AidDirect Style Effect Systems -The Print[A] Example- A Comprehension Aid
Direct Style Effect Systems - The Print[A] Example - A Comprehension Aid
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial Goals
 
%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...
%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...
%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...
 
Microsoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdfMicrosoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdf
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview Questions
 
WSO2CON 2024 - Does Open Source Still Matter?
WSO2CON 2024 - Does Open Source Still Matter?WSO2CON 2024 - Does Open Source Still Matter?
WSO2CON 2024 - Does Open Source Still Matter?
 
Payment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdf
Payment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdfPayment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdf
Payment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdf
 
Announcing Codolex 2.0 from GDK Software
Announcing Codolex 2.0 from GDK SoftwareAnnouncing Codolex 2.0 from GDK Software
Announcing Codolex 2.0 from GDK Software
 
8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech students8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech students
 
%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...
 
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
 
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
 
%in Soweto+277-882-255-28 abortion pills for sale in soweto
%in Soweto+277-882-255-28 abortion pills for sale in soweto%in Soweto+277-882-255-28 abortion pills for sale in soweto
%in Soweto+277-882-255-28 abortion pills for sale in soweto
 
MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...
MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...
MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...
 
Introducing Microsoft’s new Enterprise Work Management (EWM) Solution
Introducing Microsoft’s new Enterprise Work Management (EWM) SolutionIntroducing Microsoft’s new Enterprise Work Management (EWM) Solution
Introducing Microsoft’s new Enterprise Work Management (EWM) Solution
 
VTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learnVTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learn
 

Machine learning 101 sit hvr

  • 2. What we won’t cover… • Deep learning / Neural Networks • Specifics of ML-algorithms • Tools / Libraries / Code • SAP Products, like HANA / Predictive Analytics / Vora / … • Ethics, algorithmic transparency & fairness • Hardware 2
  • 5. ML in the news: Deepmind’s AlphaGo 5
  • 6. 6
  • 7. Machine Learning "Field of study that gives computers the ability to learn without being explicitly programmed” (Arthur Samuel, 1959) 7
  • 8. What is Machine Learning? 8 Computer Computer Traditional Programming Machine Learning Data Data Program Output Program Output
  • 9. Sweet spot for Machine Learning • It’s impossible to write down the rules in code: • Too many rules • Too many factors influencing the rules • Too finely tuned • We just don’t know the rules (image recognition) • Lots of labeled data (examples) available (e.g. historical data) 9
  • 10. Basic Machine Learning ‘workflow’ 10 Feature Vectors Training data Labels Machine Learning Algorithm Feature Vectors New data Prediction Training Phase Operational Phase Predictive Model
  • 11. Training Phase in more detail 11 Raw data Data preparation Feature Vectors Training Data Test data Model Building (by ML algorithm) Model Evaluation Predictive Model Feedback loop data cleansing data transformation normalization feature extraction aka ‘learning’
  • 12. CRISP-DM: data mining process 12 ML important ML important
  • 13. Examples of ML tasks Supervised learning Regression  target is numeric Classification  target is categorical 13 Unsupervised learning Clustering Dimensionality reduction
  • 14. Modeling: so many algorithms… 14
  • 15. ML Algorithms: by Representation Collection of candidate models/programs, aka hypothesis space 15 Decision trees Instance-based Neural networks Model ensembles
  • 16. ML Algorithms: by Evaluation Evaluation: Quality measure for a model 16 Regression Example metric: Root Mean Squared Error RMSE = Binary classification: confusion matrix Accuracy: 8 + 971 -> 97,9% Example: medical test for a disease Positive Negative P True positives TP False Negatives FN N False positives FP True Negatives TN True Class Predicted class Accuracy: Better evaluation metrics: • Precision: 8 / (8 + 19) • Recall: 8 / (8 + 2)
  • 17. Optimization: how the algorithm ‘learns’, depends on representation and evaluation ML Algorithms: by Optimization 17 Greedy Search, ex. of combinatorial optimization Gradient Descent (or in general: Convex Optimization) Linear Programming (or in general: Constrained/Nonlinear Optimization)
  • 18. Training error vs test error 18
  • 19. Data Science for Business • Focuses more on general principles than specific algorithms • Not math-heavy, does contain some math • O’Reilly link: http://shop.oreilly.com/product/063692 0028918.do • Book website: http://data-science-for- biz.com/DSB/Home.html 19
  • 20. Take-aways • Goal of ML: generalize from training data (not optimization!!) • Part of ‘Data Mining Process’, not a goal in and of itself • No magic! Just some clever algorithms… • Increasingly important non-technical aspects: • Ethics • Algorithmic transparency 20
  • 21. Thank You www.soapeople.com info@soapeople.com @SOAPEOPLE Fred Verheul Big Data Consultant +31 6 3919 2986 fred.verheul@soapeople.com @fredverheul

Notas do Editor

  1. Source for images: http://www.havlena.net/en/machine-learning/machine-learning-what-is-it-where-to-learn-about-it/
  2. Go (DeepMind’s AlphaGo). How it works: https://www.tastehit.com/blog/google-deepmind-alphago-how-it-works/ Go is very different to Chess (DeepBlue 1996). Chess works with a game tree + sophisticated evaluation function. Go is too complex, and there are no good evaluation functions, because Go positions are harder to evaluate. Enter Monte Carlo Tree Search: simulation. Exploration/exploitation trade-off! No Go-knowledge required!
  3. This diagram is attributed to Pedro Domingos who used it in his Coursera Machine Learning course in 2012.
  4. Source: https://en.wikipedia.org/wiki/Cross_Industry_Standard_Process_for_Data_Mining
  5. Sources: Regression - http://gerardnico.com/wiki/data_mining/linear_regression Classification - ?? Clustering - https://en.wikipedia.org/wiki/Cluster_analysis Dimensionality reduction: http://www.sthda.com/english/wiki/factoextra-r-package-easy-multivariate-data-analyses-and-elegant-visualization
  6. Source: http://machinelearningmastery.com/
  7. Sources: Decision Tree - https://en.wikipedia.org/wiki/Decision_tree_learning Instance-based - https://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm Neural Networks - https://en.wikipedia.org/wiki/Artificial_neural_network Ensembles - https://www.analyticsvidhya.com/blog/2015/09/questions-ensemble-modeling/
  8. Sources: Greedy Search - https://en.wikipedia.org/wiki/Greedy_algorithm Gradient Descent - ?? Linear Programming - http://courses.wccnet.edu/~palay/math181/linearprogramming.htm
  9. Source: https://onlinecourses.science.psu.edu/stat857/node/160