1st edition
March 7-8, 2019
BigML, Inc
ML: Technical Perspective
What is the Big Deal?
Poul Petersen
CIO, BigML
BigML, Inc #MLSEV: ML a Technical Perspective
Sampling the Audience
Expert: Published papers at KDD, ICML, NIPS, etc., or developed own ML algorithms used at large scale
Aficionado: Understands pros/cons of different techniques and/or can tweak algorithms as needed
Practitioner: Very familiar with ML packages (Weka, Scikit, BigML, etc.)
Newbie: Just taking Coursera ML class or reading an introductory book on ML
Absolute beginner: ML sounds like science fiction
A Present for You
Free 1-Month Boosted Subscription
https://bigml.com/accounts/register/
MLSEV
What is Machine Learning?
Let’s start with what is NOT Machine Learning…
• Sentience
• Killer robots
• Generalized Artificial Intelligence
• Anything to do with the word “singularity”
Oh the Hype!
AlphaGo Zero beats a human at Go… killer robots far off?
• First of all, AlphaGo Zero is impressive!
• But, no need to fear killer robots powered by AlphaGo Zero:
• Learning is not transferable: retrain for chess, etc.
• Works only for rule-based systems / perfect simulator
• Relies on games/systems with clear objectives (win/lose)
• Cost $25 million [1]
“While AlphaGo Zero is a step towards a general-purpose AI, it can only work on problems that can be perfectly simulated in a computer, making tasks such as driving a car out of the question. AIs that match humans at a huge range of tasks are still a long way off” - Demis Hassabis, CEO of DeepMind [2]
1. https://www.inc.com/lisa-calhoun/google-artificial-intelligence-alpha-go-zero-just-pressed-reset-on-how-we-learn.html
2. https://www.theguardian.com/science/2017/oct/18/its-able-to-create-knowledge-itself-google-unveils-ai-learns-all-on-its-own
Three Domains
• Artificial Intelligence: Cool/Scary things… that mostly don’t exist
• Machine Learning: AI Concepts applied to very specific problems
• Deep Learning: Specific techniques of Machine Learning
What is Machine Learning?
Let’s start with what is NOT Machine Learning…
• Sentience
• Killer robots
• Generalized Artificial Intelligence
• Anything to do with the word “singularity”
• Something “new”
• First International Conference on ML held in 1980
• Top-performing algorithms have been around for decades
How do these things relate?
What is Machine Learning?
A practical definition…
Finding patterns in data that can be used to make inferences
Predictive Models
AIRLINE  ORIGIN  DESTINATION  DEPARTURE DELAY  DISTANCE  ARRIVAL DELAY
AS       ANC     SEA          -11              1448      -22
AA       LAX     PBI          -8               2330      -9
US       SFO     CLT          -2               2296      5
AA       LAX     MIA          -5               2342      -9
AS       SEA     ANC          -1               1448      -21
DL       SFO     MSP          -5               1589      8
NK       LAS     MSP          -6               1299      -17
US       LAX     CLT          14               2125      -10
AA       SFO     DFW          -11              1464      -13
DL       LAS     ATL          3                1747      -15
Machine Learning Terminology
• Training / Learning: Data (Instances, Features, Label) → ML algorithm → Predictive model
• Predicting / Scoring: New Instance → Predictive model → Prediction, Confidence
Why Machine Learning?
[Figure: COMPLEXITY OF TASKS (- to +) plotted against TIME, from the 20th century to the 21st century]
Traditional Programming
Lost Baggage Policy
• Explicit rules defined by requirements and experience
• How do we program when the rules are unknown or
very difficult to determine?
Programming with ML
AIRLINE  ORIGIN  DESTINATION  DEPARTURE DELAY  DISTANCE  ARRIVAL DELAY
AS       ANC     SEA          -11              1448      -22
AA       LAX     PBI          -8               2330      -9
US       SFO     CLT          -2               2296      5
AA       LAX     MIA          -5               2342      -9
AS       SEA     ANC          -1               1448      -21
DL       SFO     MSP          -5               1589      8
NK       LAS     MSP          -6               1299      -17
US       LAX     CLT          14               2125      -10
AA       SFO     DFW          -11              1464      -13
DL       LAS     ATL          3                1747      -15
Want: Flight Delay Prediction
Flight Delay Model????
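As a rough illustration of learning a flight delay model from data rather than hand-coding rules, the sketch below fits a least-squares line to the ten instances in the table. The choice of a one-feature linear model, the use of departure delay as the only feature, and the function names are assumptions made for this example, not the method from the talk.

```python
# Instances from the slide: (departure delay, distance, arrival delay).
flights = [
    (-11, 1448, -22), (-8, 2330, -9), (-2, 2296, 5), (-5, 2342, -9),
    (-1, 1448, -21), (-5, 1589, 8), (-6, 1299, -17), (14, 2125, -10),
    (-11, 1464, -13), (3, 1747, -15),
]

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

dep = [f[0] for f in flights]  # feature: departure delay
arr = [f[2] for f in flights]  # label: arrival delay

# "Training": the rule (slope, intercept) comes from the data, not from us.
slope, intercept = fit_line(dep, arr)

def predict_arrival_delay(departure_delay):
    """"Predicting": apply the learned rule to a new instance."""
    return slope * departure_delay + intercept

print(round(predict_arrival_delay(0), 1))
```

With only ten instances the fitted line is a toy; the point is that the rule is derived from the rows instead of being written by hand.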
What else can ML do?
Machine Learning Tasks
SUPERVISED: CLASSIFICATION AND REGRESSION
UNSUPERVISED: CLUSTER ANALYSIS, ANOMALY DETECTION, ASSOCIATION DISCOVERY, TOPIC MODELING
TIME SERIES
Predictive Maintenance
CLASSIFICATION Will this component fail?
REGRESSION How many days until this component fails?
TIME SERIES FORECASTING How many components will fail in a week from now?
CLUSTER ANALYSIS Which machines behave similarly?
ANOMALY DETECTION Is this behavior normal?
ASSOCIATION DISCOVERY What alerts are triggered together before a failure?
Personalized Music
CLASSIFICATION Will this song be a hit?
REGRESSION How many users will play this song next month?
TIME SERIES FORECASTING How many downloads will this song have in 3 months?
CLUSTER ANALYSIS Which songs are similar?
ANOMALY DETECTION Is this song being played more than normal?
ASSOCIATION DISCOVERY What songs do people like to play together?
Airline Revenue Management
CLASSIFICATION Will this flight be booked at 80% 14 days out?
REGRESSION How many passengers will book this flight 7 days out?
TIME SERIES FORECASTING How many tickets will be cancelled this week?
CLUSTER ANALYSIS Which flight booking patterns are similar?
ANOMALY DETECTION Are these flights' booking patterns normal?
ASSOCIATION DISCOVERY What price changes help overbook sooner?
Network Security
CLASSIFICATION Is this email part of a phishing attack?
REGRESSION How many logins after work per week?
TIME SERIES FORECASTING What will be the number of false alarms next week?
CLUSTER ANALYSIS Are these users behaving similarly?
ANOMALY DETECTION Is this user behavior worth inspecting?
ASSOCIATION DISCOVERY What alerts were triggered before this attack?
All ML Models are Wrong
[Figure: a DEEPNET, an ENSEMBLE, a LOGISTIC REGRESSION, and a DECISION TREE each predict TRUE or FALSE]
Some model(s) are wrong… which one?
Same patient… different models… different predictions!
Insight: Need a way to measure model fitness
Evaluating Models
[Figure: PATIENT DATA split into TRAINING and TEST; an ENSEMBLE built from the training data gives a PREDICTION and CONFIDENCE (%) for the test data, which feed an EVALUATION (%)]
Stay Tuned: You will see this in Evaluations
Measuring ML Mistakes
               ACTUAL TRUE      ACTUAL FALSE
MODEL TRUE     TRUE POSITIVE    FALSE POSITIVE
MODEL FALSE    FALSE NEGATIVE   TRUE NEGATIVE
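The four cells of the matrix above can be counted directly from a list of actual labels and model predictions. This is a minimal sketch; the eight labels are invented for illustration and the function name is hypothetical.

```python
def confusion_counts(actual, predicted):
    """Count TP, FP, FN, TN for boolean labels (model prediction vs. actual)."""
    pairs = list(zip(actual, predicted))
    return {
        "TP": sum(1 for a, p in pairs if a and p),
        "FP": sum(1 for a, p in pairs if not a and p),
        "FN": sum(1 for a, p in pairs if a and not p),
        "TN": sum(1 for a, p in pairs if not a and not p),
    }

# Hypothetical labels for eight test instances.
actual    = [True, True,  True, False, False, False, False, True]
predicted = [True, False, True, False, True,  False, False, True]

counts = confusion_counts(actual, predicted)
accuracy = (counts["TP"] + counts["TN"]) / len(actual)
print(counts, accuracy)
```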
We can bend the rules a bit…
Operating Point
[Figure: an Operating Point on a scale running from TRUE 100% / FALSE 0% to TRUE 0% / FALSE 100%; moving it toward one end gives More False Positives, toward the other More False Negatives]
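One way to see the trade-off at the operating point is to treat it as a threshold on the model's score and count the mistakes at a few settings. The scores and labels below are made up, and reading the operating point as a score threshold is an assumption of this sketch.

```python
# Hypothetical model scores (probability of TRUE) paired with actual labels.
scored = [(0.95, True), (0.80, True), (0.70, False), (0.60, True),
          (0.40, False), (0.30, True), (0.20, False), (0.05, False)]

def errors_at(threshold):
    """Return (false positives, false negatives) when predicting TRUE at score >= threshold."""
    fp = sum(1 for score, actual in scored if score >= threshold and not actual)
    fn = sum(1 for score, actual in scored if score < threshold and actual)
    return fp, fn

# A low threshold trades false negatives for false positives, and vice versa.
for threshold in (0.1, 0.5, 0.9):
    print(threshold, errors_at(threshold))
```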
Why would you do this?
Comparing Models
[Figure: % TRUE POSITIVES plotted against % FALSE POSITIVES, marking the IDEAL MODEL, the WORST(?) MODEL, two TRIVIAL MODELs, the RANDOM diagonal, and GOOD and BETTER curves]
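A curve of % true positives against % false positives can be traced by sweeping the threshold across every score, and the area under it gives one number for comparing models. The sketch below uses invented scores, assumes no two instances share a score, and uses hypothetical function names.

```python
def roc_points(scored):
    """(false positive rate, true positive rate) points, one per threshold."""
    positives = sum(1 for _, actual in scored if actual)
    negatives = len(scored) - positives
    points = [(0.0, 0.0)]
    tp = fp = 0
    # Lower the threshold one instance at a time, highest score first.
    for _, actual in sorted(scored, key=lambda pair: pair[0], reverse=True):
        if actual:
            tp += 1
        else:
            fp += 1
        points.append((fp / negatives, tp / positives))
    return points

def area_under_curve(points):
    """Trapezoid-rule area under the curve."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Hypothetical scores (probability of TRUE) paired with actual labels.
scored = [(0.95, True), (0.80, True), (0.70, False), (0.60, True),
          (0.40, False), (0.30, True), (0.20, False), (0.05, False)]

print(area_under_curve(roc_points(scored)))
```

An ideal model reaches an area of 1.0, while the random diagonal corresponds to 0.5.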
Mistakes can be Costly
[Figure: two pictures combined (+ =), captioned FUN! and DANGER!]
Cost Functions
[Figure: % TRUE POSITIVES plotted against % FALSE POSITIVES for the GOOD and BETTER? models]
• What is the cost of predicting cancer incorrectly?
• What is the cost of labeling a fraudulent transaction as valid?
• What is the cost of incorrectly predicting an aircraft part is safe?
• Why can’t I just have a perfect model?
One possibility: assign a FALSE NEGATIVE COST and a FALSE POSITIVE COST
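With a cost attached to each kind of mistake, the operating point can be chosen to minimize total cost rather than to maximize accuracy. In this sketch the two cost values, the scores, and the candidate thresholds are all hypothetical.

```python
FALSE_NEGATIVE_COST = 10.0  # hypothetical, e.g. a missed cancer case
FALSE_POSITIVE_COST = 1.0   # hypothetical, e.g. an unnecessary follow-up test

# Hypothetical model scores (probability of TRUE) paired with actual labels.
scored = [(0.95, True), (0.80, True), (0.70, False), (0.60, True),
          (0.40, False), (0.30, True), (0.20, False), (0.05, False)]

def total_cost(threshold):
    """Cost of all mistakes when predicting TRUE at score >= threshold."""
    fp = sum(1 for score, actual in scored if score >= threshold and not actual)
    fn = sum(1 for score, actual in scored if score < threshold and actual)
    return fp * FALSE_POSITIVE_COST + fn * FALSE_NEGATIVE_COST

candidates = [0.1, 0.25, 0.5, 0.75, 0.9]
best_threshold = min(candidates, key=total_cost)
print(best_threshold, total_cost(best_threshold))
```

Because false negatives are ten times as expensive here, the cheapest threshold sits low and accepts a couple of false positives to avoid missing any positives.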
How it Goes All Wrong
• Over-fitting
• Under-fitting
Hunting Dog Image Classifier
Which images are pictures of dogs that are bred to be hunters?
[Figure: training images labeled TRUE or FALSE]
Over-fitting…
“Hunting dogs are short-haired spotted puppies that lay out on the grass”
A perfect model! How about some new images…
[Figure: new images labeled TRUE or FALSE]
Over-fitting
Model: true, Reality: false
Model: false, Reality: true
• This is an example of poor generalization
• The model “fit” the training data perfectly
• But it does not generalize to new instances well
Under-fitting
Only use ear shape:
“Dogs with drop or pendant ears are hunters”
An imperfect model… now we are making some mistakes on the training data.
[Figure: training images labeled TRUE or FALSE]
Under-fitting
• This is an example of good generalization
• The model “under-fit” the training data
• But it is generalizing to new instances better
Model: true, Reality: true
Model: false, Reality: false
Under-fitting
Model: false, Reality: true
Model: false, Reality: true
Learning Problems / Complexity
Under-fitting
• Low Complexity Model
• Not fitting the data very well
Over-fitting
• High Complexity Model
• Fitting the data too well
One way to mitigate this is with different types of models…
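The effect of model complexity can be seen by scoring three toy models on both the training data and held-out data. Everything here is invented for illustration: the one-feature dataset (with two deliberately noisy training labels), the hand-picked cutoff of 5, and the three model functions.

```python
# Hypothetical 1-D data: the true rule is x >= 5, but two training labels are noisy.
train = [(1, False), (2, False), (3, True), (4, False),
         (6, True), (7, False), (8, True), (9, True)]
test = [(1.2, False), (2.9, False), (4.2, False),
        (5.9, True), (7.1, True), (8.8, True)]

def always_false(x):
    """Low complexity: ignores the feature entirely."""
    return False

def threshold_rule(x):
    """Moderate complexity: a single cutoff, hand-picked for this sketch."""
    return x >= 5

def nearest_neighbor(x):
    """High complexity: memorizes every training instance, noise included."""
    _, label = min(train, key=lambda pair: abs(pair[0] - x))
    return label

def accuracy(model, data):
    return sum(1 for x, label in data if model(x) == label) / len(data)

for model in (always_false, threshold_rule, nearest_neighbor):
    print(model.__name__, accuracy(model, train), accuracy(model, test))
```

The memorizing model is perfect on the training data yet worse than the simple cutoff on new instances, which is the over-fitting pattern; the constant model is poor on both, which is the under-fitting pattern.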
Choosing the ML Algorithm
[Figure: algorithms placed by Increasing Data Size / Complexity against Decreasing Interpretability / Better Representation / Longer Training]
• Stages: Early Stage (Rapid Prototyping), Mid Stage (Proven Application), Late Stage (Critical Performance)
• Algorithms: Single Tree Model, Logistic Regression, Decision Forest, Random Decision Forest, Boosted Trees, Deepnets
Hard?
Automating Machine Learning
Deepnet Structure
Inputs (4 Features): x1 x2 x3 x4
Hidden layer: h1 h2 h3 h4 h5
Hidden layer: h1 h2 h3 h4 h5
Hidden layer…: h1 h2 h3 h4 h9
Outputs (3 Classes): y1 y2 y3
h1 = activation?(wx, x) ?
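To make the structure concrete, here is a minimal forward pass through a network with 4 input features, one hidden layer of 5 nodes, and 3 output classes. The tanh activation, the softmax output, the random untrained weights, and the sample input are all assumptions of this sketch; the slide leaves the activation and weights open.

```python
import math
import random

def dense(inputs, weights, biases):
    """One fully connected layer: a weighted sum of the inputs plus a bias per node."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def softmax(values):
    """Turn raw output scores into class probabilities that sum to 1."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def random_layer(n_in, n_out, rng):
    """Untrained layer: random weights, zero biases."""
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

rng = random.Random(0)
w_hidden, b_hidden = random_layer(4, 5, rng)  # 4 features -> 5 hidden nodes
w_out, b_out = random_layer(5, 3, rng)        # 5 hidden nodes -> 3 classes

def forward(x):
    hidden = [math.tanh(h) for h in dense(x, w_hidden, b_hidden)]
    return softmax(dense(hidden, w_out, b_out))

probs = forward([5.1, 3.5, 1.4, 0.2])  # hypothetical instance
print(probs)
```

Training would adjust the weights; the number of layers, nodes per layer, and activation are exactly the structural choices that have to be made beforehand.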
BigML Deepnet
• The success of a Deepnet is dependent on getting the right
network structure for the dataset
• But, there are too many parameters:
• Nodes, layers, activation function, learning rate, etc…
• And setting them takes significant expert knowledge
• Solution: Metalearning (a good initial guess)
• Solution: Network search (try a bunch)
Automating Machine Learning
http://www.clparker.org/ml_benchmark/
• Each resource has several parameters that impact quality
• Number of trees, missing splits, nodes, weight
• Rather than trial and error, we can use ML to find ideal
parameters
• Why not make the model type, Decision Tree, Boosted Tree,
etc, a parameter as well?
• Similar to Deepnet network search, but finds the optimum
machine learning algorithm and parameters for your data
automatically
Key Insight: We can solve any parameter selection
problem in a similar way.
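Treating the model type as one more parameter can be sketched as a single search space evaluated on held-out data. This example uses plain exhaustive search over two toy model families as a stand-in; it is not a description of how BigML's own optimizer works, and the data, model builders, and parameter values are hypothetical.

```python
# Hypothetical 1-D data, split into training and validation instances.
train = [(1, False), (2, False), (3, True), (4, False),
         (6, True), (7, False), (8, True), (9, True)]
valid = [(1.2, False), (2.9, False), (4.2, False),
         (5.9, True), (7.1, True), (8.8, True)]

def make_threshold(cutoff):
    """Model family 1: predict TRUE when x reaches the cutoff."""
    return lambda x: x >= cutoff

def make_knn(k):
    """Model family 2: majority vote among the k nearest training instances."""
    def model(x):
        nearest = sorted(train, key=lambda pair: abs(pair[0] - x))[:k]
        votes = sum(1 for _, label in nearest if label)
        return votes * 2 > k
    return model

builders = {"threshold": make_threshold, "knn": make_knn}

# The model type sits in the search space next to its parameter.
search_space = ([("threshold", c) for c in (3, 5, 7)] +
                [("knn", k) for k in (1, 3, 5)])

def score(candidate):
    """Validation accuracy of the model described by (type, parameter)."""
    kind, param = candidate
    model = builders[kind](param)
    return sum(1 for x, label in valid if model(x) == label) / len(valid)

best = max(search_space, key=score)
print(best, score(best))
```

A real search would cover far more parameters and use a smarter strategy than trying every combination, but the loop has the same shape: propose a candidate, evaluate it, keep the best.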
BigML OptiML
Fusions
Key Insight: ML algorithms each have unique
strengths and weaknesses
Single Tree: output changes abruptly
with inputs near decision boundary
Tree + Deepnet: output changes smoothly
with inputs near decision boundary
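The difference near the decision boundary can be illustrated with two stand-in models and a fusion that averages their outputs. Averaging is an assumption of this sketch, as are the boundary at 5.0 and the logistic curve used in place of a trained deepnet.

```python
import math

def single_tree(x):
    """A one-split tree: the output jumps at the decision boundary."""
    return 1.0 if x >= 5.0 else 0.0

def smooth_model(x):
    """A logistic curve standing in for a deepnet's smooth output."""
    return 1.0 / (1.0 + math.exp(-(x - 5.0)))

def fusion(x, models=(single_tree, smooth_model)):
    """Average the probabilities predicted by the component models."""
    return sum(model(x) for model in models) / len(models)

for x in (4.0, 4.9, 5.0, 5.1, 6.0):
    print(x, single_tree(x), round(smooth_model(x), 3), round(fusion(x), 3))
```

Under plain averaging the fused output still steps at the boundary, but the step is half as large and the smooth component shapes the output on either side of it.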
Model Skills: Some ML algorithms “generally” do better on some feature types:
• RDF (Random Decision Forest) for sparse text vectors
• LR/Deepnets for numeric features
• Trees for categorical features
[Figure labels: Full, Numeric, Text]
Summary
• Machine Learning is a subset of “Artificial Intelligence”
• Finds patterns in data that can be used to make inferences
• Can be thought of as “programming with data”
• Has been around for a long time (only recently practical)
• Already being used to solve real-world problems
• Caveat Emptor:
• Machine Learning mistakes are expected
• Care must be taken to address the cost of mistakes
• Automating Machine Learning
• Powerful application of ML to parameterizing ML
• Models can be fused to address specific data complexities
MLSEV. Machine Learning: Technical Perspective

Mais conteúdo relacionado

Semelhante a MLSEV. Machine Learning: Technical Perspective

DutchMLSchool. ML Business Perspective
DutchMLSchool. ML Business PerspectiveDutchMLSchool. ML Business Perspective
DutchMLSchool. ML Business PerspectiveBigML, Inc
 
From DevOps to MLOps: practical steps for a smooth transition
From DevOps to MLOps: practical steps for a smooth transitionFrom DevOps to MLOps: practical steps for a smooth transition
From DevOps to MLOps: practical steps for a smooth transitionAnne-Marie Tousch
 
BSSML17 - Introduction, Models, Evaluations
BSSML17 - Introduction, Models, EvaluationsBSSML17 - Introduction, Models, Evaluations
BSSML17 - Introduction, Models, EvaluationsBigML, Inc
 
BSSML16 L1. Introduction, Models, and Evaluations
BSSML16 L1. Introduction, Models, and EvaluationsBSSML16 L1. Introduction, Models, and Evaluations
BSSML16 L1. Introduction, Models, and EvaluationsBigML, Inc
 
DutchMLSchool. Introduction to Machine Learning with the BigML Platform
DutchMLSchool. Introduction to Machine Learning with the BigML PlatformDutchMLSchool. Introduction to Machine Learning with the BigML Platform
DutchMLSchool. Introduction to Machine Learning with the BigML PlatformBigML, Inc
 
Defcon 21-pinto-defending-networks-machine-learning by pseudor00t
Defcon 21-pinto-defending-networks-machine-learning by pseudor00tDefcon 21-pinto-defending-networks-machine-learning by pseudor00t
Defcon 21-pinto-defending-networks-machine-learning by pseudor00tpseudor00t overflow
 
Machine Learning: Opening the Pandora's Box - Dhiana Deva @ QCon São Paulo 2019
Machine Learning: Opening the Pandora's Box - Dhiana Deva @ QCon São Paulo 2019Machine Learning: Opening the Pandora's Box - Dhiana Deva @ QCon São Paulo 2019
Machine Learning: Opening the Pandora's Box - Dhiana Deva @ QCon São Paulo 2019Dhiana Deva
 
Myth vs Reality: Understanding AI/ML for QA Automation - w/ Jonathan Lipps
Myth vs Reality: Understanding AI/ML for QA Automation - w/ Jonathan LippsMyth vs Reality: Understanding AI/ML for QA Automation - w/ Jonathan Lipps
Myth vs Reality: Understanding AI/ML for QA Automation - w/ Jonathan LippsApplitools
 
Big Data Spain 2018: How to build Weighted XGBoost ML model for Imbalance dat...
Big Data Spain 2018: How to build Weighted XGBoost ML model for Imbalance dat...Big Data Spain 2018: How to build Weighted XGBoost ML model for Imbalance dat...
Big Data Spain 2018: How to build Weighted XGBoost ML model for Imbalance dat...Alok Singh
 
A few Challenges to Make Machine Learning Easy
A few Challenges to Make Machine Learning EasyA few Challenges to Make Machine Learning Easy
A few Challenges to Make Machine Learning EasyPemo Theodore
 
DutchMLSchool 2022 - End-to-End ML
DutchMLSchool 2022 - End-to-End MLDutchMLSchool 2022 - End-to-End ML
DutchMLSchool 2022 - End-to-End MLBigML, Inc
 
MLSEV. Automating Decision Making
MLSEV. Automating Decision MakingMLSEV. Automating Decision Making
MLSEV. Automating Decision MakingBigML, Inc
 
Agile Analytics: Delivering on Promises by Atif Abdul Rahman
Agile Analytics: Delivering on Promises by Atif Abdul RahmanAgile Analytics: Delivering on Promises by Atif Abdul Rahman
Agile Analytics: Delivering on Promises by Atif Abdul RahmanAgile ME
 
Introduction to Machine Learning
Introduction to Machine LearningIntroduction to Machine Learning
Introduction to Machine LearningGDSCIIITDHARWAD
 
Explore ML with Crowdsource | ML Extended - Session 4
Explore ML with Crowdsource | ML Extended - Session 4Explore ML with Crowdsource | ML Extended - Session 4
Explore ML with Crowdsource | ML Extended - Session 4SadhanaParameswaran
 
BigMLSchool: Customer Segmentation
BigMLSchool: Customer SegmentationBigMLSchool: Customer Segmentation
BigMLSchool: Customer SegmentationBigML, Inc
 
Operationalizing Machine Learning in the Enterprise
Operationalizing Machine Learning in the EnterpriseOperationalizing Machine Learning in the Enterprise
Operationalizing Machine Learning in the Enterprisemark madsen
 

Semelhante a MLSEV. Machine Learning: Technical Perspective (20)

DutchMLSchool. ML Business Perspective
DutchMLSchool. ML Business PerspectiveDutchMLSchool. ML Business Perspective
DutchMLSchool. ML Business Perspective
 
From DevOps to MLOps: practical steps for a smooth transition
From DevOps to MLOps: practical steps for a smooth transitionFrom DevOps to MLOps: practical steps for a smooth transition
From DevOps to MLOps: practical steps for a smooth transition
 
BSSML17 - Introduction, Models, Evaluations
BSSML17 - Introduction, Models, EvaluationsBSSML17 - Introduction, Models, Evaluations
BSSML17 - Introduction, Models, Evaluations
 
BSSML16 L1. Introduction, Models, and Evaluations
BSSML16 L1. Introduction, Models, and EvaluationsBSSML16 L1. Introduction, Models, and Evaluations
BSSML16 L1. Introduction, Models, and Evaluations
 
DutchMLSchool. Introduction to Machine Learning with the BigML Platform
DutchMLSchool. Introduction to Machine Learning with the BigML PlatformDutchMLSchool. Introduction to Machine Learning with the BigML Platform
DutchMLSchool. Introduction to Machine Learning with the BigML Platform
 
Defcon 21-pinto-defending-networks-machine-learning by pseudor00t
Defcon 21-pinto-defending-networks-machine-learning by pseudor00tDefcon 21-pinto-defending-networks-machine-learning by pseudor00t
Defcon 21-pinto-defending-networks-machine-learning by pseudor00t
 
Machine Learning: Opening the Pandora's Box - Dhiana Deva @ QCon São Paulo 2019
Machine Learning: Opening the Pandora's Box - Dhiana Deva @ QCon São Paulo 2019Machine Learning: Opening the Pandora's Box - Dhiana Deva @ QCon São Paulo 2019
Machine Learning: Opening the Pandora's Box - Dhiana Deva @ QCon São Paulo 2019
 
Myth vs Reality: Understanding AI/ML for QA Automation - w/ Jonathan Lipps
Myth vs Reality: Understanding AI/ML for QA Automation - w/ Jonathan LippsMyth vs Reality: Understanding AI/ML for QA Automation - w/ Jonathan Lipps
Myth vs Reality: Understanding AI/ML for QA Automation - w/ Jonathan Lipps
 
Big Data Spain 2018: How to build Weighted XGBoost ML model for Imbalance dat...
Big Data Spain 2018: How to build Weighted XGBoost ML model for Imbalance dat...Big Data Spain 2018: How to build Weighted XGBoost ML model for Imbalance dat...
Big Data Spain 2018: How to build Weighted XGBoost ML model for Imbalance dat...
 
Ezml Stanford 2015
Ezml Stanford 2015Ezml Stanford 2015
Ezml Stanford 2015
 
A few Challenges to Make Machine Learning Easy
A few Challenges to Make Machine Learning EasyA few Challenges to Make Machine Learning Easy
A few Challenges to Make Machine Learning Easy
 
DutchMLSchool 2022 - End-to-End ML
DutchMLSchool 2022 - End-to-End MLDutchMLSchool 2022 - End-to-End ML
DutchMLSchool 2022 - End-to-End ML
 
MLSEV. Automating Decision Making
MLSEV. Automating Decision MakingMLSEV. Automating Decision Making
MLSEV. Automating Decision Making
 
Agile Analytics: Delivering on Promises by Atif Abdul Rahman
Agile Analytics: Delivering on Promises by Atif Abdul RahmanAgile Analytics: Delivering on Promises by Atif Abdul Rahman
Agile Analytics: Delivering on Promises by Atif Abdul Rahman
 
Introduction to Machine Learning
Introduction to Machine LearningIntroduction to Machine Learning
Introduction to Machine Learning
 
Explore ML with Crowdsource | ML Extended - Session 4
Explore ML with Crowdsource | ML Extended - Session 4Explore ML with Crowdsource | ML Extended - Session 4
Explore ML with Crowdsource | ML Extended - Session 4
 
AI
AIAI
AI
 
BigMLSchool: Customer Segmentation
BigMLSchool: Customer SegmentationBigMLSchool: Customer Segmentation
BigMLSchool: Customer Segmentation
 
Agile Analytics
Agile AnalyticsAgile Analytics
Agile Analytics
 
Operationalizing Machine Learning in the Enterprise
Operationalizing Machine Learning in the EnterpriseOperationalizing Machine Learning in the Enterprise
Operationalizing Machine Learning in the Enterprise
 

Mais de BigML, Inc

Digital Transformation and Process Optimization in Manufacturing
Digital Transformation and Process Optimization in ManufacturingDigital Transformation and Process Optimization in Manufacturing
Digital Transformation and Process Optimization in ManufacturingBigML, Inc
 
DutchMLSchool 2022 - Automation
DutchMLSchool 2022 - AutomationDutchMLSchool 2022 - Automation
DutchMLSchool 2022 - AutomationBigML, Inc
 
DutchMLSchool 2022 - ML for AML Compliance
DutchMLSchool 2022 - ML for AML ComplianceDutchMLSchool 2022 - ML for AML Compliance
DutchMLSchool 2022 - ML for AML ComplianceBigML, Inc
 
DutchMLSchool 2022 - Multi Perspective Anomalies
DutchMLSchool 2022 - Multi Perspective AnomaliesDutchMLSchool 2022 - Multi Perspective Anomalies
DutchMLSchool 2022 - Multi Perspective AnomaliesBigML, Inc
 
DutchMLSchool 2022 - My First Anomaly Detector
DutchMLSchool 2022 - My First Anomaly Detector DutchMLSchool 2022 - My First Anomaly Detector
DutchMLSchool 2022 - My First Anomaly Detector BigML, Inc
 
DutchMLSchool 2022 - Anomaly Detection
DutchMLSchool 2022 - Anomaly DetectionDutchMLSchool 2022 - Anomaly Detection
DutchMLSchool 2022 - Anomaly DetectionBigML, Inc
 
DutchMLSchool 2022 - History and Developments in ML
DutchMLSchool 2022 - History and Developments in MLDutchMLSchool 2022 - History and Developments in ML
DutchMLSchool 2022 - History and Developments in MLBigML, Inc
 
DutchMLSchool 2022 - A Data-Driven Company
DutchMLSchool 2022 - A Data-Driven CompanyDutchMLSchool 2022 - A Data-Driven Company
DutchMLSchool 2022 - A Data-Driven CompanyBigML, Inc
 
DutchMLSchool 2022 - ML in the Legal Sector
DutchMLSchool 2022 - ML in the Legal SectorDutchMLSchool 2022 - ML in the Legal Sector
DutchMLSchool 2022 - ML in the Legal SectorBigML, Inc
 
DutchMLSchool 2022 - Smart Safe Stadiums
DutchMLSchool 2022 - Smart Safe StadiumsDutchMLSchool 2022 - Smart Safe Stadiums
DutchMLSchool 2022 - Smart Safe StadiumsBigML, Inc
 
DutchMLSchool 2022 - Process Optimization in Manufacturing Plants
DutchMLSchool 2022 - Process Optimization in Manufacturing PlantsDutchMLSchool 2022 - Process Optimization in Manufacturing Plants
DutchMLSchool 2022 - Process Optimization in Manufacturing PlantsBigML, Inc
 
DutchMLSchool 2022 - Anomaly Detection at Scale
DutchMLSchool 2022 - Anomaly Detection at ScaleDutchMLSchool 2022 - Anomaly Detection at Scale
DutchMLSchool 2022 - Anomaly Detection at ScaleBigML, Inc
 
DutchMLSchool 2022 - Citizen Development in AI
DutchMLSchool 2022 - Citizen Development in AIDutchMLSchool 2022 - Citizen Development in AI
DutchMLSchool 2022 - Citizen Development in AIBigML, Inc
 
Democratizing Object Detection
Democratizing Object DetectionDemocratizing Object Detection
Democratizing Object DetectionBigML, Inc
 
BigML Release: Image Processing
BigML Release: Image ProcessingBigML Release: Image Processing
BigML Release: Image ProcessingBigML, Inc
 
Machine Learning in Retail: Know Your Customers' Customer. See Your Future
Machine Learning in Retail: Know Your Customers' Customer. See Your FutureMachine Learning in Retail: Know Your Customers' Customer. See Your Future
Machine Learning in Retail: Know Your Customers' Customer. See Your FutureBigML, Inc
 
Machine Learning in Retail: ML in the Retail Sector
Machine Learning in Retail: ML in the Retail SectorMachine Learning in Retail: ML in the Retail Sector
Machine Learning in Retail: ML in the Retail SectorBigML, Inc
 
ML in GRC: Machine Learning in Legal Automation, How to Trust a Lawyerbot
ML in GRC: Machine Learning in Legal Automation, How to Trust a LawyerbotML in GRC: Machine Learning in Legal Automation, How to Trust a Lawyerbot
ML in GRC: Machine Learning in Legal Automation, How to Trust a LawyerbotBigML, Inc
 
ML in GRC: Supporting Human Decision Making for Regulatory Adherence with Mac...
ML in GRC: Supporting Human Decision Making for Regulatory Adherence with Mac...ML in GRC: Supporting Human Decision Making for Regulatory Adherence with Mac...
ML in GRC: Supporting Human Decision Making for Regulatory Adherence with Mac...BigML, Inc
 
ML in GRC: Cybersecurity versus Governance, Risk Management, and Compliance
ML in GRC: Cybersecurity versus Governance, Risk Management, and ComplianceML in GRC: Cybersecurity versus Governance, Risk Management, and Compliance
ML in GRC: Cybersecurity versus Governance, Risk Management, and ComplianceBigML, Inc
 

Mais de BigML, Inc (20)

Digital Transformation and Process Optimization in Manufacturing
Digital Transformation and Process Optimization in ManufacturingDigital Transformation and Process Optimization in Manufacturing
Digital Transformation and Process Optimization in Manufacturing
 
DutchMLSchool 2022 - Automation
DutchMLSchool 2022 - AutomationDutchMLSchool 2022 - Automation
DutchMLSchool 2022 - Automation
 
DutchMLSchool 2022 - ML for AML Compliance
DutchMLSchool 2022 - ML for AML ComplianceDutchMLSchool 2022 - ML for AML Compliance
DutchMLSchool 2022 - ML for AML Compliance
 
DutchMLSchool 2022 - Multi Perspective Anomalies
DutchMLSchool 2022 - Multi Perspective AnomaliesDutchMLSchool 2022 - Multi Perspective Anomalies
DutchMLSchool 2022 - Multi Perspective Anomalies
 
DutchMLSchool 2022 - My First Anomaly Detector
DutchMLSchool 2022 - My First Anomaly Detector DutchMLSchool 2022 - My First Anomaly Detector
DutchMLSchool 2022 - My First Anomaly Detector
 
DutchMLSchool 2022 - Anomaly Detection
DutchMLSchool 2022 - Anomaly DetectionDutchMLSchool 2022 - Anomaly Detection
DutchMLSchool 2022 - Anomaly Detection
 
DutchMLSchool 2022 - History and Developments in ML
DutchMLSchool 2022 - History and Developments in MLDutchMLSchool 2022 - History and Developments in ML
DutchMLSchool 2022 - History and Developments in ML
 
DutchMLSchool 2022 - A Data-Driven Company
DutchMLSchool 2022 - A Data-Driven CompanyDutchMLSchool 2022 - A Data-Driven Company
DutchMLSchool 2022 - A Data-Driven Company
 
DutchMLSchool 2022 - ML in the Legal Sector
DutchMLSchool 2022 - ML in the Legal SectorDutchMLSchool 2022 - ML in the Legal Sector
DutchMLSchool 2022 - ML in the Legal Sector
 
DutchMLSchool 2022 - Smart Safe Stadiums
DutchMLSchool 2022 - Smart Safe StadiumsDutchMLSchool 2022 - Smart Safe Stadiums
DutchMLSchool 2022 - Smart Safe Stadiums
 
DutchMLSchool 2022 - Process Optimization in Manufacturing Plants
DutchMLSchool 2022 - Process Optimization in Manufacturing PlantsDutchMLSchool 2022 - Process Optimization in Manufacturing Plants
DutchMLSchool 2022 - Process Optimization in Manufacturing Plants
 
DutchMLSchool 2022 - Anomaly Detection at Scale
DutchMLSchool 2022 - Anomaly Detection at ScaleDutchMLSchool 2022 - Anomaly Detection at Scale
DutchMLSchool 2022 - Anomaly Detection at Scale
 
DutchMLSchool 2022 - Citizen Development in AI
DutchMLSchool 2022 - Citizen Development in AIDutchMLSchool 2022 - Citizen Development in AI
DutchMLSchool 2022 - Citizen Development in AI
 
Democratizing Object Detection
Democratizing Object DetectionDemocratizing Object Detection
Democratizing Object Detection
 
BigML Release: Image Processing
BigML Release: Image ProcessingBigML Release: Image Processing
BigML Release: Image Processing
 
Machine Learning in Retail: Know Your Customers' Customer. See Your Future
Machine Learning in Retail: Know Your Customers' Customer. See Your FutureMachine Learning in Retail: Know Your Customers' Customer. See Your Future
Machine Learning in Retail: Know Your Customers' Customer. See Your Future
 
Machine Learning in Retail: ML in the Retail Sector
Machine Learning in Retail: ML in the Retail SectorMachine Learning in Retail: ML in the Retail Sector
Machine Learning in Retail: ML in the Retail Sector
 
ML in GRC: Machine Learning in Legal Automation, How to Trust a Lawyerbot
ML in GRC: Machine Learning in Legal Automation, How to Trust a LawyerbotML in GRC: Machine Learning in Legal Automation, How to Trust a Lawyerbot
ML in GRC: Machine Learning in Legal Automation, How to Trust a Lawyerbot
 
ML in GRC: Supporting Human Decision Making for Regulatory Adherence with Mac...
ML in GRC: Supporting Human Decision Making for Regulatory Adherence with Mac...ML in GRC: Supporting Human Decision Making for Regulatory Adherence with Mac...
ML in GRC: Supporting Human Decision Making for Regulatory Adherence with Mac...
 
ML in GRC: Cybersecurity versus Governance, Risk Management, and Compliance
ML in GRC: Cybersecurity versus Governance, Risk Management, and ComplianceML in GRC: Cybersecurity versus Governance, Risk Management, and Compliance
ML in GRC: Cybersecurity versus Governance, Risk Management, and Compliance
 

Último

定制英国白金汉大学毕业证(UCB毕业证书) 成绩单原版一比一
定制英国白金汉大学毕业证(UCB毕业证书)																			成绩单原版一比一定制英国白金汉大学毕业证(UCB毕业证书)																			成绩单原版一比一
定制英国白金汉大学毕业证(UCB毕业证书) 成绩单原版一比一ffjhghh
 
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service AmravatiVIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service AmravatiSuhani Kapoor
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFxolyaivanovalion
 
04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships04242024_CCC TUG_Joins and Relationships
MLSEV. Machine Learning: Technical Perspective

  • 2. BigML, Inc. ML: Technical Perspective. What is the Big Deal? Poul Petersen, CIO, BigML
  • 3. BigML, Inc #MLSEV: ML a Technical Perspective. Sampling the Audience. Expert: Published papers at KDD, ICML, NIPS, etc., or developed own ML algorithms used at large scale. Aficionado: Understands pros/cons of different techniques and/or can tweak algorithms as needed. Practitioner: Very familiar with ML packages (Weka, Scikit, BigML, etc.). Newbie: Just taking a Coursera ML class or reading an introductory book on ML. Absolute beginner: ML sounds like science fiction.
  • 4. A Present for You
  • 5. Free 1-Month Boosted Subscription: https://bigml.com/accounts/register/ MLSEV
  • 6. What is Machine Learning?
  • 7. What is Machine Learning? Let’s start with what is NOT Machine Learning… • Sentience • Killer robots • Generalized Artificial Intelligence • Anything to do with the word “singularity”
  • 8. Oh the Hype! AlphaGo Zero beats a human at Go… killer robots far off? • First of all, AlphaGo Zero is impressive! • But, no need to fear killer robots powered by AlphaGo Zero: • Learning is not transferable: retrain for chess, etc. • Works only for rule-based systems / perfect simulator • Relies on games/systems with clear objectives (win/lose) • Cost $25 million [1] “While AlphaGo Zero is a step towards a general-purpose AI, it can only work on problems that can be perfectly simulated in a computer, making tasks such as driving a car out of the question. AIs that match humans at a huge range of tasks are still a long way off” - Demis Hassabis, CEO of DeepMind [2] [1] https://www.inc.com/lisa-calhoun/google-artificial-intelligence-alpha-go-zero-just-pressed-reset-on-how-we-learn.html [2] https://www.theguardian.com/science/2017/oct/18/its-able-to-create-knowledge-itself-google-unveils-ai-learns-all-on-its-own
  • 9. Three Domains. Artificial Intelligence: Cool/Scary things… that mostly don’t exist. Machine Learning: AI concepts applied to very specific problems. Deep Learning: Specific techniques of Machine Learning.
  • 10. What is Machine Learning? Let’s start with what is NOT Machine Learning… • Sentience • Killer robots • Generalized Artificial Intelligence • Anything to do with the word “singularity” • Something “new”: the First International Conference on ML was held in 1980, and top-performing algorithms have been around for decades. How do these things relate?
  • 11. What is Machine Learning? A practical definition… Finding patterns in data that can be used to make inferences (Predictive Models). Example data:
      AIRLINE  ORIGIN  DESTINATION  DEPARTURE DELAY  DISTANCE  ARRIVAL DELAY
      AS       ANC     SEA          -11              1448      -22
      AA       LAX     PBI          -8               2330      -9
      US       SFO     CLT          -2               2296      5
      AA       LAX     MIA          -5               2342      -9
      AS       SEA     ANC          -1               1448      -21
      DL       SFO     MSP          -5               1589      8
      NK       LAS     MSP          -6               1299      -17
      US       LAX     CLT          14               2125      -10
      AA       SFO     DFW          -11              1464      -13
      DL       LAS     ATL          3                1747      -15
  • 12. Machine Learning Terminology (diagram labels): Data, Instances, Features, Label, ML algorithm, Training / Learning, Predictive model, New Instance, Predicting / Scoring, Prediction, Confidence
  • 13. Why Machine Learning?
  • 14. Why Machine Learning. Chart: complexity of tasks (- to +) against time (20th century to 21st century)
  • 15. Traditional Programming. Lost Baggage Policy • Explicit rules defined by requirements and experience • How do we program when the rules are unknown or very difficult to determine?
  • 16. Programming with ML. The flight data table from slide 11 (AIRLINE, ORIGIN, DESTINATION, DEPARTURE DELAY, DISTANCE, ARRIVAL DELAY) feeds a Flight Delay Model. Want: Flight Delay Prediction. What else can ML do?
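To make the terminology of slides 12 and 16 concrete, here is a minimal sketch of "programming with data" in plain Python. It uses the ten instances from the slide's flight table, treats departure delay as the only feature and arrival delay as the label, and fits a straight line by least squares. The choice of model, the single feature, and the names `train` and `predict` are assumptions made for illustration; the slides do not say which algorithm the flight delay model uses, and ten rows are far too few for a useful model.

```python
# Instances copied from the flight table on the slides:
# (airline, origin, destination, departure_delay, distance, arrival_delay)
FLIGHTS = [
    ("AS", "ANC", "SEA", -11, 1448, -22),
    ("AA", "LAX", "PBI", -8, 2330, -9),
    ("US", "SFO", "CLT", -2, 2296, 5),
    ("AA", "LAX", "MIA", -5, 2342, -9),
    ("AS", "SEA", "ANC", -1, 1448, -21),
    ("DL", "SFO", "MSP", -5, 1589, 8),
    ("NK", "LAS", "MSP", -6, 1299, -17),
    ("US", "LAX", "CLT", 14, 2125, -10),
    ("AA", "SFO", "DFW", -11, 1464, -13),
    ("DL", "LAS", "ATL", 3, 1747, -15),
]


def train(instances):
    """Training / learning: fit arrival_delay = slope * departure_delay + intercept."""
    xs = [row[3] for row in instances]  # feature: departure delay
    ys = [row[5] for row in instances]  # label: arrival delay
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept  # the "predictive model"


def predict(model, departure_delay):
    """Predicting / scoring: apply the model to a new instance."""
    slope, intercept = model
    return slope * departure_delay + intercept


model = train(FLIGHTS)
new_departure_delay = 5  # hypothetical new instance
prediction = predict(model, new_departure_delay)
print(f"model: slope={model[0]:.3f}, intercept={model[1]:.3f}")
print(f"predicted arrival delay for a departure delay of 5: {prediction:.1f}")
```

No rule about delays is written by hand here; the relationship is estimated from the instances, which is the contrast with the explicit rules of slide 15.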
  • 17. Machine Learning Tasks
  • 18. Machine Learning Tasks: CLUSTER ANALYSIS, ANOMALY DETECTION, ASSOCIATION DISCOVERY, TOPIC MODELING, TIME SERIES, UNSUPERVISED, CLASSIFICATION AND REGRESSION, SUPERVISED
  • 19. Predictive Maintenance. CLASSIFICATION: Will this component fail? REGRESSION: How many days until this component fails? TIME SERIES FORECASTING: How many components will fail a week from now? CLUSTER ANALYSIS: Which machines behave similarly? ANOMALY DETECTION: Is this behavior normal? ASSOCIATION DISCOVERY: What alerts are triggered together before a failure?
  • 20. Personalized Music. CLASSIFICATION: Will this song be a hit? REGRESSION: How many users will play this song next month? TIME SERIES FORECASTING: How many downloads will this song have in 3 months? CLUSTER ANALYSIS: Which songs are similar? ANOMALY DETECTION: Is this song being played more than normal? ASSOCIATION DISCOVERY: What songs do people like to play together?
  • 21. Airline Revenue Management. CLASSIFICATION: Will this flight be booked at 80% 14 days out? REGRESSION: How many passengers will book this flight 7 days out? TIME SERIES FORECASTING: How many tickets will be cancelled this week? CLUSTER ANALYSIS: Which flight booking patterns are similar? ANOMALY DETECTION: Are these flights' booking patterns normal? ASSOCIATION DISCOVERY: What price changes help overbook sooner?
  • 22. Network Security. CLASSIFICATION: Is this email part of a phishing attack? REGRESSION: How many logins after work per week? TIME SERIES FORECASTING: What will be the number of false alarms next week? CLUSTER ANALYSIS: Are these users behaving similarly? ANOMALY DETECTION: Is this user behavior worth inspecting? ASSOCIATION DISCOVERY: What alerts were triggered before this attack?
  • 23. All ML Models are Wrong
  • 24. All ML Models are WRONG. Same patient… different models… different predictions! Models shown: DEEPNET, ENSEMBLE, LOGISTIC REGRESSION, DECISION TREE (predictions TRUE or FALSE). Some model(s) is wrong… which one? Insight: Need a way to measure model fitness
  • 25. Evaluating Models (diagram labels): PATIENT DATA, TRAINING, TEST, ENSEMBLE, PREDICTION, CONFIDENCE %, EVALUATION %. Stay Tuned: You will see this in Evaluations
  • 26. Measuring ML Mistakes (MODEL prediction vs ACTUAL): model TRUE / actual TRUE = TRUE POSITIVE; model TRUE / actual FALSE = FALSE POSITIVE; model FALSE / actual TRUE = FALSE NEGATIVE; model FALSE / actual FALSE = TRUE NEGATIVE. We can bend the rules a bit…
  • 27. Operating Point. Diagram: TRUE (100% to 0%) and FALSE (0% to 100%) with a movable Operating Point; one direction gives More False Positives, the other More False Negatives. Why would you do this?
  • 28. Comparing Models. Plot of % TRUE POSITIVES against % FALSE POSITIVES, with labels IDEAL MODEL, BETTER, GOOD, RANDOM, WORST(?) MODEL, and TRIVIAL MODEL (twice)
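Slides 26 to 28 can be sketched in a few lines of plain Python: count the four confusion-matrix cells at a given operating point (threshold), then convert them to the % true positives and % false positives that the comparison plot uses. The scores and actual labels below are made up for illustration, and the helper names are hypothetical.

```python
def confusion_counts(scores, actuals, threshold):
    """Return (TP, FP, FN, TN) when predicting TRUE for scores >= threshold."""
    tp = fp = fn = tn = 0
    for score, actual in zip(scores, actuals):
        predicted = score >= threshold
        if predicted and actual:
            tp += 1
        elif predicted and not actual:
            fp += 1
        elif not predicted and actual:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def rates(tp, fp, fn, tn):
    """True positive rate and false positive rate for one operating point."""
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return tpr, fpr


# Made-up model confidences and the actual outcomes for eight instances.
SCORES = [0.95, 0.85, 0.70, 0.60, 0.55, 0.40, 0.30, 0.20]
ACTUALS = [True, True, False, True, False, True, False, False]

roc_points = []
for threshold in (0.0, 0.25, 0.5, 0.75, 1.01):
    tpr, fpr = rates(*confusion_counts(SCORES, ACTUALS, threshold))
    roc_points.append((threshold, fpr, tpr))
    print(f"threshold={threshold:.2f}  false positives={fpr:.0%}  true positives={tpr:.0%}")
```

Lowering the threshold trades false negatives for false positives. The two extreme thresholds, which predict TRUE for everything or for nothing, land in the corners of the plot; these are presumably what the slide labels as trivial models.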
  • 29. Mistakes can be Costly: + = FUN! DANGER!
  • 30. Cost Functions • What is the cost of predicting cancer incorrectly? • What is the cost of labeling a fraudulent transaction as valid? • What is the cost of incorrectly predicting an aircraft part is safe? • Why can’t I just have a perfect model? One possibility: FALSE NEGATIVE COST, FALSE POSITIVE COST. Plot of % TRUE POSITIVES against % FALSE POSITIVES: GOOD, BETTER?
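One way to read slide 30 is that the operating point should be chosen by cost rather than by raw accuracy. The sketch below, in plain Python, assigns an assumed cost to each kind of mistake and picks the threshold with the lowest total cost. The 10:1 cost ratio, the scores, and the function names are all illustrative assumptions, not values from the talk.

```python
def total_cost(scores, actuals, threshold, fn_cost, fp_cost):
    """Sum the cost of every mistake made when predicting TRUE for scores >= threshold."""
    cost = 0.0
    for score, actual in zip(scores, actuals):
        predicted = score >= threshold
        if predicted and not actual:
            cost += fp_cost  # false positive
        elif actual and not predicted:
            cost += fn_cost  # false negative
    return cost


# Made-up model confidences and actual outcomes.
SCORES = [0.95, 0.85, 0.70, 0.60, 0.55, 0.40, 0.30, 0.20]
ACTUALS = [True, True, False, True, False, True, False, False]

# Assumption: missing a real case is ten times worse than a false alarm.
FALSE_NEGATIVE_COST = 10.0
FALSE_POSITIVE_COST = 1.0

# Every distinct score is a candidate operating point, plus one above all scores.
candidates = sorted(set(SCORES)) + [1.01]
best_threshold = min(
    candidates,
    key=lambda t: total_cost(SCORES, ACTUALS, t, FALSE_NEGATIVE_COST, FALSE_POSITIVE_COST),
)
best_cost = total_cost(SCORES, ACTUALS, best_threshold, FALSE_NEGATIVE_COST, FALSE_POSITIVE_COST)
print(f"cheapest operating point: threshold={best_threshold}, total cost={best_cost}")
```

With expensive false negatives the cheapest threshold is a low one that accepts a couple of false alarms in order to miss nothing, which is one answer to the "why would you do this?" question on slide 27.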
  • 31. How it Goes All Wrong • Over-fitting • Under-fitting
  • 32. Hunting Dog Image Classifier. Which images are pictures of dogs that are bred to be hunters? (Images labeled TRUE / FALSE)
  • 33. Over-fitting… “Hunting dogs are short-haired spotted puppies that lay out on the grass”
  • 34. A perfect model! How about some new images… (Images labeled TRUE / FALSE)
  • 35. Over-fitting. Model: true, Reality: false. Model: false, Reality: true. • This is an example of poor generalization • The model “fit” the training data perfectly • But it does not generalize well to new instances
  • 36. Under-fitting. Only use ear shape: “Dogs with drop or pendant ears are hunters”
  • 37. An imperfect model… now we are making some mistakes on the training data. (Images labeled TRUE / FALSE)
  • 38. Under-fitting • This is an example of good generalization • The model “under-fit” the training data • But it is generalizing to new instances better. Model: true, Reality: true. Model: false, Reality: false.
  • 39. Under-fitting. Model: false, Reality: true. Model: false, Reality: true.
  • 40. Learning Problems / Complexity. Under-fitting: • Low Complexity Model • Not fitting the data very well. Over-fitting: • High Complexity Model • Fitting the data too well. One way to mitigate this is with different types of models…
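The over-fitting and under-fitting slides can be reproduced on toy data. The sketch below is an illustration under stated assumptions, not the dog classifier from the talk: it generates a one-feature dataset with 20% label noise and compares k-nearest-neighbor models of different complexity. With k=1 the model memorizes the training set (high complexity); with k equal to the training-set size it predicts the same class for everything (low complexity).

```python
import random


def make_data(rng, n, noise):
    """Toy data: the label is x > 0.5, flipped with probability `noise`."""
    rows = []
    for _ in range(n):
        x = rng.random()
        label = x > 0.5
        if rng.random() < noise:
            label = not label
        rows.append((x, label))
    return rows


def knn_predict(train, x, k):
    """Majority vote among the k training instances closest to x."""
    neighbors = sorted(train, key=lambda row: abs(row[0] - x))[:k]
    votes = sum(1 for _, label in neighbors if label)
    return votes * 2 > len(neighbors)


def accuracy(train, data, k):
    correct = sum(1 for x, label in data if knn_predict(train, x, k) == label)
    return correct / len(data)


rng = random.Random(0)  # fixed seed so the run is repeatable
train_set = make_data(rng, 60, noise=0.2)
test_set = make_data(rng, 200, noise=0.2)

results = {}
for k in (1, 9, len(train_set)):
    results[k] = (accuracy(train_set, train_set, k), accuracy(train_set, test_set, k))
    print(f"k={k:2d}  training accuracy={results[k][0]:.2f}  test accuracy={results[k][1]:.2f}")
```

The k=1 model scores perfectly on the data it was trained on because it has memorized the noise, which is the "perfect model" of slide 34. The gap between its training and test accuracy is the symptom to look for; an intermediate k typically generalizes better than either extreme.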
  • 41. Choosing the ML Algorithm. Axes: Decreasing Interpretability / Better Representation / Longer Training; Increasing Data Size / Complexity. Stages: Early Stage (Rapid Prototyping), Mid Stage (Proven Application), Late Stage (Critical Performance). Models: Single Tree Model, Logistic Regression, Decision Forest, Random Decision Forest, Boosted Trees, Deepnets. Hard?
  • 42. Automating Machine Learning
  • 43. Deepnet Structure. Inputs: x1 x2 x3 x4 (4 Features). Hidden layers: h1 h2 h3 h4 h5; h1 h2 h3 h4 h5; h1 h2 h3 h4 … h9. Outputs: y1 y2 y3 (3 Classes). h1 = activation?(wx, x) ?
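The structure on slide 43 can be written out as a forward pass. This is a minimal sketch assuming one hidden layer of five nodes, a ReLU activation for the hidden nodes, and a softmax over the three outputs; the slide leaves the activation as an open question, so these choices, the random untrained weights, and all function names are assumptions for illustration only.

```python
import math
import random


def relu(z):
    return max(0.0, z)


def dense(inputs, weights, biases, activation):
    """One layer: each node computes activation(w . x + b)."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def softmax(values):
    """Turn raw output scores into class probabilities that sum to 1."""
    peak = max(values)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def random_layer(rng, n_out, n_in):
    """Untrained weights; a real deepnet would learn these from data."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


rng = random.Random(42)
hidden_w, hidden_b = random_layer(rng, 5, 4)  # 4 features -> 5 hidden nodes
output_w, output_b = random_layer(rng, 3, 5)  # 5 hidden nodes -> 3 classes


def forward(x):
    hidden = dense(x, hidden_w, hidden_b, relu)                 # h1..h5
    scores = dense(hidden, output_w, output_b, lambda z: z)     # raw y1..y3
    return softmax(scores)


probabilities = forward([0.5, -1.2, 3.0, 0.7])
print([round(p, 3) for p in probabilities])
```

Even this tiny network has choices to make (number of layers, nodes per layer, activation function), which is the tuning problem that slide 44 describes.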
  • 44. BigML Deepnet • The success of a Deepnet is dependent on getting the right network structure for the dataset • But, there are too many parameters: nodes, layers, activation function, learning rate, etc… • And setting them takes significant expert knowledge • Solution: Metalearning (a good initial guess) • Solution: Network search (try a bunch)
  • 45. Automating Machine Learning: http://www.clparker.org/ml_benchmark/
  • 46. Automating Machine Learning • Each resource has several parameters that impact quality: number of trees, missing splits, nodes, weight • Rather than trial and error, we can use ML to find ideal parameters • Why not make the model type (Decision Tree, Boosted Tree, etc.) a parameter as well? • Similar to Deepnet network search, but finds the optimum machine learning algorithm and parameters for your data automatically. Key Insight: We can solve any parameter selection problem in a similar way.
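The idea on slide 46 of treating the model type as just another parameter can be sketched as a search over (model type, parameters) pairs scored on held-out data. The example below is not BigML's OptiML algorithm: it swaps in an exhaustive search over a hand-picked candidate list, with two toy model types (k-nearest neighbors and a single-split "stump") on made-up one-feature data. All names here are hypothetical.

```python
import random


def make_data(rng, n, noise=0.2):
    """Toy data: the label is x > 0.5, flipped with probability `noise`."""
    rows = []
    for _ in range(n):
        x = rng.random()
        label = x > 0.5
        if rng.random() < noise:
            label = not label
        rows.append((x, label))
    return rows


def fit_knn(train, k):
    def predict(x):
        neighbors = sorted(train, key=lambda row: abs(row[0] - x))[:k]
        return sum(label for _, label in neighbors) * 2 > len(neighbors)
    return predict


def fit_stump(train):
    """Pick the split on x with the fewest training errors; predict TRUE above it."""
    best_split, best_errors = None, None
    for split, _ in train:
        errors = sum((x > split) != label for x, label in train)
        if best_errors is None or errors < best_errors:
            best_split, best_errors = split, errors
    return lambda x: x > best_split


BUILDERS = {"knn": fit_knn, "stump": fit_stump}

# The model type is one more entry in the search space.
CANDIDATES = [("knn", {"k": 1}), ("knn", {"k": 5}), ("knn", {"k": 15}), ("stump", {})]


def accuracy(predict, data):
    return sum(predict(x) == label for x, label in data) / len(data)


def search(train, validation, candidates):
    """Score every candidate on validation data and return (best, all_scores)."""
    scored = []
    for model_type, params in candidates:
        predict = BUILDERS[model_type](train, **params)
        scored.append((accuracy(predict, validation), model_type, params))
    return max(scored, key=lambda item: item[0]), scored


rng = random.Random(1)
best, scored = search(make_data(rng, 80), make_data(rng, 80), CANDIDATES)
for score, model_type, params in scored:
    print(f"{model_type:5s} {params}  validation accuracy={score:.2f}")
print("selected:", best[1], best[2])
```

A production search would typically explore the space more cleverly than trying every candidate, and would score candidates with cross-validation rather than a single split, but the framing is the same: one loop covers both algorithm choice and parameter choice.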
  • 47. BigML OptiML
  • 48. Fusions. Key Insight: ML algorithms each have unique strengths and weaknesses. Single Tree: output changes abruptly with inputs near decision boundary. Tree + Deepnet: output changes smoothly with inputs near decision boundary.
  • 49. Fusions. Model Skills: Some ML algorithms “generally” do better on some feature types: • RDF for sparse text vectors • LR/Deepnets for numeric features • Trees for categorical features. Labels: Full, Numeric, Text
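The contrast on slide 48 can be illustrated by averaging the outputs of two stand-in models. In this sketch a step function plays the role of the single tree and a logistic curve plays the role of a smoother model such as a deepnet; combining them by a plain average is an assumption made for illustration, and the boundary at 0.5 and the curve's steepness are made up.

```python
import math


def tree_like(x):
    """Stand-in for a single tree: the output flips at one split."""
    return 1.0 if x > 0.5 else 0.0


def smooth_model(x):
    """Stand-in for a smoother model: a logistic curve around the same boundary."""
    return 1.0 / (1.0 + math.exp(-10.0 * (x - 0.5)))


def fusion(models, x):
    """Average the component models' outputs."""
    return sum(model(x) for model in models) / len(models)


MODELS = [tree_like, smooth_model]
for x in (0.40, 0.49, 0.51, 0.60):
    print(
        f"x={x:.2f}  tree={tree_like(x):.2f}  "
        f"smooth={smooth_model(x):.2f}  fusion={fusion(MODELS, x):.2f}"
    )
```

With a plain average the fused output still jumps at the tree's split, but by half as much as the tree alone, and it varies gradually on either side of the boundary instead of staying flat.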
  • 50. Summary • Machine Learning is a subset of “Artificial Intelligence” • Finds patterns in data that can be used to make inferences • Can be thought of as “programming with data” • Has been around for a long time (only recently practical) • Already being used to solve real-world problems • Caveat Emptor: Machine Learning mistakes are expected; care must be taken to address the cost of mistakes • Automating Machine Learning: a powerful application of ML to parameterizing ML; models can be fused to address specific data complexities