Machine Learning 101
Fred Verheul
Machine Learning
“Field of study that gives computers the ability to learn without being explicitly programmed” (Arthur Samuel, 1959)
What is Machine Learning?
• Traditional Programming: Data + Program -> Computer -> Output
• Machine Learning: Data + Output -> Computer -> Program
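The contrast between the two approaches can be sketched in a few lines of Python. This is only an illustration: the temperature readings, the labels, and the function names are hypothetical and do not come from the slides. In the first function a person writes the rule; in the second, the rule (a threshold) is derived from data plus the expected output.

```python
# Traditional programming: a human writes the program (the rule).
def handwritten_rule(temperature_c: float) -> bool:
    """Hard-coded rule: flag a reading as 'hot' above a fixed threshold."""
    return temperature_c > 30.0


# Machine learning: the computer derives the program from data + expected output.
def learn_threshold(readings: list[float], labels: list[bool]) -> float:
    """Pick the candidate threshold that misclassifies the fewest examples."""
    candidates = sorted(readings)
    best_threshold, best_errors = candidates[0], len(readings) + 1
    for threshold in candidates:
        errors = sum((x > threshold) != y for x, y in zip(readings, labels))
        if errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold


# Hypothetical labeled examples (data + desired output).
readings = [12.0, 18.5, 21.0, 24.0, 27.5, 31.0, 35.0]
labels = [False, False, False, False, True, True, True]

threshold = learn_threshold(readings, labels)


def learned_rule(temperature_c: float) -> bool:
    """The 'program' produced by learning; same shape as the handwritten rule."""
    return temperature_c > threshold
```

Both rules have the same form; the difference is where the threshold comes from.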
Prediction is hard…
Sweet spot for Machine Learning
• It’s impossible to write down the rules in code:
  • Too many rules
  • Too many factors influencing the rules
  • Too finely tuned
  • We just don’t know the rules (image recognition)
• Lots of labeled data (examples) available (e.g. historical data)
Basic Machine Learning ‘workflow’
• Training Phase: Training data -> Feature Vectors + Labels -> Machine Learning Algorithm -> Predictive Model
• Operational Phase: New data -> Feature Vectors -> Predictive Model -> Prediction
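The two phases of the workflow can be sketched as a pair of functions. The example below is a minimal nearest-centroid classifier written for illustration; the algorithm choice, the `train` and `predict` names, and the sample data are assumptions, not something the slides prescribe.

```python
from collections import defaultdict
from math import dist


def train(feature_vectors, labels):
    """Training phase: build a predictive model (one centroid per label)."""
    grouped = defaultdict(list)
    for vector, label in zip(feature_vectors, labels):
        grouped[label].append(vector)
    # Average each feature column to get the centroid of a class.
    return {
        label: tuple(sum(column) / len(vectors) for column in zip(*vectors))
        for label, vectors in grouped.items()
    }


def predict(model, feature_vector):
    """Operational phase: return the label of the closest centroid."""
    return min(model, key=lambda label: dist(model[label], feature_vector))


# Hypothetical training data: (height in cm, weight in kg) with a label each.
training_vectors = [(20, 4), (25, 5), (60, 30), (70, 35)]
training_labels = ["cat", "cat", "dog", "dog"]

model = train(training_vectors, training_labels)  # training phase
prediction = predict(model, (23, 6))              # operational phase, new data
```

The model returned by `train` is the only thing the operational phase needs; the training data itself is no longer required at prediction time.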
Training Phase in more detail
• Raw data -> Data preparation -> Feature Vectors -> Training Data + Test data
• Data preparation: data cleansing, data transformation, normalization, feature extraction
• Training Data -> Model Building (by ML algorithm), aka ‘learning’ -> Model Evaluation -> Predictive Model
• Test data -> Model Evaluation
• Feedback loop
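Two of the steps in this phase, normalization and the split into training and test data, can be sketched as follows. The min-max scaling, the 25% test fraction, the fixed seed, and the sample rows are all assumptions chosen for illustration.

```python
import random


def min_max_normalize(rows):
    """Rescale every feature column to the range [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        tuple(
            (value - low) / (high - low) if high > low else 0.0
            for value, low, high in zip(row, lows, highs)
        )
        for row in rows
    ]


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle the rows reproducibly and hold out a fraction as test data."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical raw data: (height in cm, weight in kg).
raw_rows = [(150, 45), (160, 55), (170, 65), (180, 85),
            (190, 95), (200, 105), (175, 70), (165, 60)]

feature_vectors = min_max_normalize(raw_rows)
training_data, test_data = train_test_split(feature_vectors)
```

This follows the order shown on the slide (prepare, then split). In practice the scaling parameters are often computed on the training data only, so that no information from the test data leaks into the model.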
Examples of ML tasks
• Supervised learning
  • Regression: target is numeric
  • Classification: target is categorical
• Unsupervised learning
  • Clustering
  • Dimensionality reduction
Modeling: so many algorithms…
ML Algorithms: by Representation
Representation: collection of candidate models/programs, aka hypothesis space
• Decision trees
• Instance-based
• Neural networks
• Model ensembles
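ML Algorithms: by Evaluation
Evaluation: quality measure for a model
Regression
Example metric: Root Mean Squared Error
RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2}, where y_i is the actual target, \hat{y}_i the predicted target, and n the number of examples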
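As a small worked example, Root Mean Squared Error can be computed directly from its standard definition (the square root of the mean of the squared prediction errors). The function name and the sample values below are illustrative assumptions.

```python
from math import sqrt


def rmse(actual, predicted):
    """Root Mean Squared Error between actual and predicted targets."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical targets and predictions; the errors are 1, 0, -1 and -2.
actual = [3.0, 5.0, 7.0, 9.0]
predicted = [2.0, 5.0, 8.0, 11.0]

error = rmse(actual, predicted)  # sqrt((1 + 0 + 1 + 4) / 4) = sqrt(1.5)
```

Because the errors are squared before averaging, one large miss (here the last prediction) weighs more than several small ones.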
Binary classification: confusion matrix
Example: medical test for a disease
True class | Predicted Positive | Predicted Negative
P | True positives (TP): 8 | False negatives (FN): 2
N | False positives (FP): 19 | True negatives (TN): 971
Accuracy: (8 + 971) / 1000 -> 97.9%
Better evaluation metrics:
• Precision: 8 / (8 + 19) -> 29.6%
• Recall: 8 / (8 + 2) -> 80%
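The three metrics can be checked with a short calculation. The counts used here are the ones implied by the slide's formulas (8 true positives, 19 false positives, 2 false negatives, 971 true negatives); the function name is an assumption for illustration.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision and recall from confusion-matrix counts."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,   # share of all predictions that are right
        "precision": tp / (tp + fp),     # share of predicted positives that are real
        "recall": tp / (tp + fn),        # share of real positives that are found
    }


metrics = classification_metrics(tp=8, fp=19, fn=2, tn=971)
```

With these counts the accuracy is high mostly because negatives dominate the data; precision shows that most positive test results are false alarms, which is why accuracy alone can mislead on imbalanced classes.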
ML Algorithms: by Optimization
Optimization: how the algorithm ‘learns’, depends on representation and evaluation
• Greedy Search, an example of combinatorial optimization
• Gradient Descent (or in general: Convex Optimization)
• Linear Programming (or in general: Constrained/Nonlinear Optimization)
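Of the optimization methods listed, gradient descent is the easiest to sketch. The example below fits a straight line by repeatedly stepping against the gradient of the mean squared error. The learning rate, step count, and data are assumptions picked so that this small problem converges; they are not recommendations.

```python
def gradient_descent(xs, ys, learning_rate=0.05, steps=2000):
    """Fit y = w * x + b by minimizing mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Prediction error for every example under the current parameters.
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Partial derivatives of the mean squared error w.r.t. w and b.
        grad_w = 2 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2 / n * sum(errors)
        # Step downhill.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Hypothetical noise-free data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

w, b = gradient_descent(xs, ys)
```

Here the representation is a line, the evaluation is mean squared error, and the optimization is gradient descent, which mirrors the three-way split of the preceding slides.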
Training error vs test error
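The transcript keeps only the title of this slide, so the following is an illustrative sketch rather than a reconstruction of its figure. It uses a 1-nearest-neighbour predictor, which memorizes the training data, to show how training error can be zero while test error is not. The data points and function names are hypothetical.

```python
from math import sqrt


def rmse(actual, predicted):
    """Root Mean Squared Error between actual and predicted targets."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def nearest_neighbour_predict(train_points, x):
    """Return the target of the training point whose input is closest to x."""
    _, nearest_y = min(train_points, key=lambda point: abs(point[0] - x))
    return nearest_y


def evaluate(train_points, points):
    """RMSE of the memorizing model on the given (x, y) points."""
    actual = [y for _, y in points]
    predicted = [nearest_neighbour_predict(train_points, x) for x, _ in points]
    return rmse(actual, predicted)


# Hypothetical data: roughly y = 2x, with hand-added noise of +1 or -1.
train_points = [(0, 1), (2, 3), (4, 9), (6, 11), (8, 17)]
test_points = [(1.5, 4), (3.5, 6), (5.5, 12), (7.5, 14)]

train_error = evaluate(train_points, train_points)  # the model has seen these
test_error = evaluate(train_points, test_points)    # the model has not
```

A model that reproduces its training data perfectly has also memorized the noise in it, which is why error is measured on held-out test data.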
Data Science for Business
• Focuses more on general principles than specific algorithms
• Not math-heavy, but does contain some math
• O’Reilly link: http://shop.oreilly.com/product/0636920028918.do
• Book website: http://data-science-for-biz.com/DSB/Home.html
What has NOT been covered (1)
• Deep learning / Neural Networks
  • Covered in other presentations at DKOM
  • Also recommended for further reading (deep dive):
    • http://neuralnetworksanddeeplearning.com/index.html
• Specifics of ML algorithms
  • All over the internet… e.g. at http://machinelearningmastery.com/
What has NOT been covered (2)
• Libraries (examples):
  • TensorFlow, Caffe, Theano, Keras
  • SciPy & scikit-learn
  • Spark MLlib (Scala/Java/Python)
• Programming languages:
What has NOT been covered (3)
• SAP products:
  • SAP HANA, SAP HANA Vora, SAP BO Predictive Analytics(!), HCP Predictive Services
  • New machine learning platform
• Hardware
  • Nvidia talk about GPUs
What has NOT been covered (4)
• Ethics and algorithmic transparency:
What has NOT been covered (5)
• The Data Science & Data Mining Process:
What has NOT been covered (6)
• How to integrate ML into your business application
  • I hope SAP is figuring that out as we speak ;-)
  • Have a look at SAP Predictive Analytics Integrator
    • https://help.sap.com/pai
Take-aways
• Goal of ML: generalize from training data (not optimization!!)
• No magic! Just some clever algorithms…
• Increasingly important non-technical aspects:
  • Ethics
  • Algorithmic transparency
Thank You
www.soapeople.com
info@soapeople.com
@SOAPEOPLE
Fred Verheul
Big Data Consultant
+31 6 3919 2986
fred.verheul@soapeople.com

%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
 

Machine learning 101 dkom 2017

  • 2. Machine Learning: “Field of study that gives computers the ability to learn without being explicitly programmed” (Arthur Samuel, 1959)
  • 3. What is Machine Learning? Traditional Programming: Data + Program → Computer → Output. Machine Learning: Data + Output → Computer → Program.
  • 5. Sweet spot for Machine Learning • It’s impossible to write down the rules in code: • Too many rules • Too many factors influencing the rules • Too finely tuned • We just don’t know the rules (image recognition) • Lots of labeled data (examples) available (e.g. historical data)
  • 6. Basic Machine Learning ‘workflow’. Training Phase: Training data → Feature Vectors + Labels → Machine Learning Algorithm → Predictive Model. Operational Phase: New data → Feature Vectors → Predictive Model → Prediction.
  • 7. Training Phase in more detail: Raw data → Data preparation (data cleansing, data transformation, normalization, feature extraction) → Feature Vectors → Training Data and Test data → Model Building (by ML algorithm, aka ‘learning’) → Model Evaluation → Predictive Model, with a feedback loop.
  • 8. Examples of ML tasks. Supervised learning: Regression (target is numeric), Classification (target is categorical). Unsupervised learning: Clustering, Dimensionality reduction.
  • 9. Modeling: so many algorithms…
  • 10. ML Algorithms: by Representation. Representation: collection of candidate models/programs, aka hypothesis space. Examples: Decision trees, Instance-based, Neural networks, Model ensembles.
  • 11. ML Algorithms: by Evaluation. Evaluation: quality measure for a model. Regression: example metric is Root Mean Squared Error, RMSE = sqrt((1/n) Σ (yᵢ − ŷᵢ)²). Binary classification: confusion matrix (example: medical test for a disease), with the true class in rows and the predicted class in columns. True class P: true positives TP = 8, false negatives FN = 2. True class N: false positives FP = 19, true negatives TN = 971. Accuracy: (8 + 971) / 1000 = 97.9%. Better evaluation metrics: • Precision: 8 / (8 + 19) • Recall: 8 / (8 + 2)
  • 12. ML Algorithms: by Optimization. Optimization: how the algorithm ‘learns’; depends on representation and evaluation. Examples: Greedy Search (an example of combinatorial optimization), Gradient Descent (or in general: Convex Optimization), Linear Programming (or in general: Constrained/Nonlinear Optimization).
  • 13. Training error vs test error
  • 14. Data Science for Business • Focuses more on general principles than specific algorithms • Not math-heavy, but does contain some math • O’Reilly link: http://shop.oreilly.com/product/0636920028918.do • Book website: http://data-science-for-biz.com/DSB/Home.html
  • 15. What has NOT been covered (1) • Deep learning / Neural Networks • Covered in other presentations at DKOM • Also recommended for further reading (deep dive): • http://neuralnetworksanddeeplearning.com/index.html • Specifics of ML algorithms • All over the internet… e.g. at http://machinelearningmastery.com/
  • 16. What has NOT been covered (2) • Libraries (examples): • TensorFlow, Caffe, Theano, Keras • SciPy & scikit-learn • Spark MLlib (Scala/Java/Python) • Programming languages:
  • 17. What has NOT been covered (3) • SAP products: • SAP HANA, SAP HANA Vora, SAP BO Predictive Analytics(!), HCP Predictive Services • New machine learning platform • Hardware • Nvidia talk about GPUs
  • 18. What has NOT been covered (4) • Ethics and algorithmic transparency:
  • 19. What has NOT been covered (5) • The Data Science & Data Mining Process:
  • 20. What has NOT been covered (6) • How to integrate ML into your business application • I hope SAP is figuring that out as we speak ;-) • Have a look at SAP Predictive Analytics Integrator • https://help.sap.com/pai
  • 21. Take-aways • Goal of ML: generalize from training data (not optimization!!) • No magic! Just some clever algorithms… • Increasingly important non-technical aspects: • Ethics • Algorithmic transparency
  • 22. Thank You www.soapeople.com info@soapeople.com @SOAPEOPLE Fred Verheul Big Data Consultant +31 6 3919 2986 fred.verheul@soapeople.com
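
The evaluation metrics on slide 11 can be checked with a few lines of Python. This is only a sketch: the counts TP = 8, FN = 2, FP = 19 and TN = 971 are read off the slide's precision, recall and accuracy expressions, and the function names `classification_metrics` and `rmse` are made up for illustration, not taken from the talk.

```python
import math


def classification_metrics(tp, fn, fp, tn):
    """Return (accuracy, precision, recall) for a binary confusion matrix."""
    total = tp + fn + fp + tn
    accuracy = (tp + tn) / total   # share of all cases predicted correctly
    precision = tp / (tp + fp)     # share of predicted positives that are real
    recall = tp / (tp + fn)        # share of real positives that were found
    return accuracy, precision, recall


def rmse(y_true, y_pred):
    """Root Mean Squared Error between two equally long sequences."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


# Counts assumed from the medical-test example on slide 11.
accuracy, precision, recall = classification_metrics(tp=8, fn=2, fp=19, tn=971)
print(f"accuracy={accuracy:.3f} precision={precision:.3f} recall={recall:.3f}")
```

The high accuracy comes almost entirely from the 971 true negatives, while fewer than a third of the positive predictions are correct, which is why the slide calls precision and recall better metrics for this kind of imbalanced data.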

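Gradient descent, named on slide 12 as one way an algorithm ‘learns’, can be illustrated with a minimal sketch. The setup is assumed for illustration and is not from the talk: the representation is a straight line `w * x + b`, the evaluation is mean squared error, and the data, learning rate, step count and the name `fit_line` are all hypothetical.

```python
def fit_line(xs, ys, learning_rate=0.05, steps=2000):
    """Fit y = w * x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Prediction error for every training example at the current w, b.
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        # Partial derivatives of the mean squared error.
        grad_w = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2.0 / n * sum(errors)
        # Step against the gradient to reduce the error.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Noise-free toy data generated from y = 2x + 1 (illustrative only).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x + 1.0 for x in xs]
w, b = fit_line(xs, ys)
print(f"w={w:.3f} b={b:.3f}")
```

On real data the fit would be judged on held-out test data rather than on the training error being minimized here, in line with the take-away that the goal is generalization, not optimization.
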
Editor's Notes

  1. This diagram is attributed to Pedro Domingos who used it in his Coursera Machine Learning course in 2012.
  2. Source: http://timoelliott.com/blog/2007/11/thanksgiving_predictive_analyt.html
  3. Sources: Regression - http://gerardnico.com/wiki/data_mining/linear_regression Classification - ?? Clustering - https://en.wikipedia.org/wiki/Cluster_analysis Dimensionality reduction: http://www.sthda.com/english/wiki/factoextra-r-package-easy-multivariate-data-analyses-and-elegant-visualization
  4. Source: http://machinelearningmastery.com/
  5. Sources: Decision Tree - https://en.wikipedia.org/wiki/Decision_tree_learning Instance-based - https://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm Neural Networks - https://en.wikipedia.org/wiki/Artificial_neural_network Ensembles - https://www.analyticsvidhya.com/blog/2015/09/questions-ensemble-modeling/
  6. Sources: Greedy Search - https://en.wikipedia.org/wiki/Greedy_algorithm Gradient Descent - ?? Linear Programming - http://courses.wccnet.edu/~palay/math181/linearprogramming.htm
  7. Source: https://onlinecourses.science.psu.edu/stat857/node/160