SlideShare uma empresa Scribd logo
1 de 41
Baixar para ler offline
classification edition
Machine Learning in 5 Minutes
Brian Lange
hi, i’m a
data scientist
classification
algorithms
popular examples
-spam filters
-the Sorting Hat
things to know
- you need data labeled with the correct answers to
“train” these algorithms before they work
- feature = dimension = attribute of the data
- class = category = Harry Potter house
linear discriminants
“draw a line through it”
linear discriminants
“draw a line through it”
linear discriminants
“draw a line through it”
linear discriminants
“draw a line through it”
🎉
define what “shitty” means
6 wrong
define what “shitty” means
4 wrong
a map of shittiness
to find the least shitty line
shittiness
slope
intercept
probably don’t use these
linear discriminants:
logistic regression
“divide it with a log function”
logistic regression
“divide it with a log function”
🎉🎉🎉🎉🎉🎉🎉🎉🎉🎉🎉
+ gives you probabilities
+ the model is a formula
+ can “threshold” to make model more or less
conservative
💩💩💩💩💩💩💩💩💩💩💩
- only works with linear decision boundaries
SVMs (support vector machines)
“*advanced* draw a line through it”
- better definition of “shitty”
- lines can turn into non-linear
shapes if you transform your
data
💩
💩
“the kernel trick”
🎉
woooooooooooo
🎉🎉
SVMs (support vector machines)
“*advanced* draw a line through it”
SVMs (support vector machines)
“*advanced* draw a line through it”
🎉🎉🎉🎉🎉🎉🎉🎉🎉🎉🎉
works well on a lot of different shapes of data
thanks to the kernel trick
💩💩💩💩💩💩💩💩💩💩💩
not super easy to explain to people
can only kinda do probabilities
KNN (k-nearest neighbors)
“what do similar cases look like?”
KNN (k-nearest neighbors)
“what do similar cases look like?”
k=1
KNN (k-nearest neighbors)
“what do similar cases look like?”
k=2
KNN (k-nearest neighbors)
“what do similar cases look like?”
k=1
KNN (k-nearest neighbors)
“what do similar cases look like?”
k=2
KNN (k-nearest neighbors)
“what do similar cases look like?”
k=3
KNN (k-nearest neighbors)
“what do similar cases look like?”
KNN (k-nearest neighbors)
“what do similar cases look like?”
🎉🎉🎉🎉🎉🎉🎉🎉🎉🎉🎉
+ no training, adding new data is easy
+ you get to define “distance” 

💩💩💩💩💩💩💩💩💩💩💩
- can be outlier-sensitive
- you have to define “distance”
decision tree learners
make a flow chart of it
decision tree learners
make a flow chart of it
x < 3?
yes no
3
decision tree learners
make a flow chart of it
x < 3?
yes no
y < 4?
yes no
3
4
decision tree learners
make a flow chart of it
x < 3?
yes no
y < 4?
yes no
x < 5?
yes no
3 5
4
decision tree learners
make a flow chart of it
🎉🎉🎉🎉🎉🎉🎉🎉🎉🎉🎉
+ fit all kinds of arbitrary shapes
+ output is a clear set of
conditionals

💩💩💩💩💩💩💩💩💩💩💩
- extremely prone to overfitting
- have to rebuild when you get new
data
- no probability estimates
ensemble models
make a bunch of models and combine them
ensemble models
make a bunch of models and combine them
🎉🎉🎉🎉🎉🎉🎉🎉🎉🎉🎉
- don’t overfit as much as their component parts
- Generally don’t require much parameter tweaking
- If data doesn’t change very often, you can make
them semi-online by just adding new trees
- Can provide probabilities
💩💩💩💩💩💩💩💩💩💩💩
- Slower than their component parts (though if
those are fast, it doesn’t matter)

Mais conteúdo relacionado

Mais procurados

machine learning algorithm.pptx
machine learning algorithm.pptxmachine learning algorithm.pptx
machine learning algorithm.pptxSasmitaDash28
 
5.2 mining time series data
5.2 mining time series data5.2 mining time series data
5.2 mining time series dataKrish_ver2
 
Data mining: Classification and prediction
Data mining: Classification and predictionData mining: Classification and prediction
Data mining: Classification and predictionDataminingTools Inc
 
Genetic algorithms in Data Mining
Genetic algorithms in Data MiningGenetic algorithms in Data Mining
Genetic algorithms in Data MiningAtul Khanna
 
Recurrent Neural Network (RNN) | RNN LSTM Tutorial | Deep Learning Course | S...
Recurrent Neural Network (RNN) | RNN LSTM Tutorial | Deep Learning Course | S...Recurrent Neural Network (RNN) | RNN LSTM Tutorial | Deep Learning Course | S...
Recurrent Neural Network (RNN) | RNN LSTM Tutorial | Deep Learning Course | S...Simplilearn
 
Classification Based Machine Learning Algorithms
Classification Based Machine Learning AlgorithmsClassification Based Machine Learning Algorithms
Classification Based Machine Learning AlgorithmsMd. Main Uddin Rony
 
Density Based Clustering
Density Based ClusteringDensity Based Clustering
Density Based ClusteringSSA KPI
 
Data mining: Concepts and Techniques, Chapter12 outlier Analysis
Data mining: Concepts and Techniques, Chapter12 outlier Analysis Data mining: Concepts and Techniques, Chapter12 outlier Analysis
Data mining: Concepts and Techniques, Chapter12 outlier Analysis Salah Amean
 
Machine Learning Ml Overview Algorithms Use Cases And Applications
Machine Learning Ml Overview Algorithms Use Cases And ApplicationsMachine Learning Ml Overview Algorithms Use Cases And Applications
Machine Learning Ml Overview Algorithms Use Cases And ApplicationsSlideTeam
 
support vector machine 1.pptx
support vector machine 1.pptxsupport vector machine 1.pptx
support vector machine 1.pptxsurbhidutta4
 
Data Science - Part III - EDA & Model Selection
Data Science - Part III - EDA & Model SelectionData Science - Part III - EDA & Model Selection
Data Science - Part III - EDA & Model SelectionDerek Kane
 
Ensemble learning
Ensemble learningEnsemble learning
Ensemble learningHaris Jamil
 
Clustering &amp; classification
Clustering &amp; classificationClustering &amp; classification
Clustering &amp; classificationJamshed Khan
 
5.1 mining data streams
5.1 mining data streams5.1 mining data streams
5.1 mining data streamsKrish_ver2
 
Topological Sorting
Topological SortingTopological Sorting
Topological SortingShahDhruv21
 
Design and analysis of algorithms
Design and analysis of algorithmsDesign and analysis of algorithms
Design and analysis of algorithmsDr Geetha Mohan
 
Supervised learning and Unsupervised learning
Supervised learning and Unsupervised learning Supervised learning and Unsupervised learning
Supervised learning and Unsupervised learning Usama Fayyaz
 
2.2 decision tree
2.2 decision tree2.2 decision tree
2.2 decision treeKrish_ver2
 

Mais procurados (20)

machine learning algorithm.pptx
machine learning algorithm.pptxmachine learning algorithm.pptx
machine learning algorithm.pptx
 
5.2 mining time series data
5.2 mining time series data5.2 mining time series data
5.2 mining time series data
 
Data mining: Classification and prediction
Data mining: Classification and predictionData mining: Classification and prediction
Data mining: Classification and prediction
 
Genetic algorithms in Data Mining
Genetic algorithms in Data MiningGenetic algorithms in Data Mining
Genetic algorithms in Data Mining
 
Recurrent Neural Network (RNN) | RNN LSTM Tutorial | Deep Learning Course | S...
Recurrent Neural Network (RNN) | RNN LSTM Tutorial | Deep Learning Course | S...Recurrent Neural Network (RNN) | RNN LSTM Tutorial | Deep Learning Course | S...
Recurrent Neural Network (RNN) | RNN LSTM Tutorial | Deep Learning Course | S...
 
Classification Based Machine Learning Algorithms
Classification Based Machine Learning AlgorithmsClassification Based Machine Learning Algorithms
Classification Based Machine Learning Algorithms
 
Density Based Clustering
Density Based ClusteringDensity Based Clustering
Density Based Clustering
 
Data mining: Concepts and Techniques, Chapter12 outlier Analysis
Data mining: Concepts and Techniques, Chapter12 outlier Analysis Data mining: Concepts and Techniques, Chapter12 outlier Analysis
Data mining: Concepts and Techniques, Chapter12 outlier Analysis
 
Machine Learning Ml Overview Algorithms Use Cases And Applications
Machine Learning Ml Overview Algorithms Use Cases And ApplicationsMachine Learning Ml Overview Algorithms Use Cases And Applications
Machine Learning Ml Overview Algorithms Use Cases And Applications
 
support vector machine 1.pptx
support vector machine 1.pptxsupport vector machine 1.pptx
support vector machine 1.pptx
 
Data Science - Part III - EDA & Model Selection
Data Science - Part III - EDA & Model SelectionData Science - Part III - EDA & Model Selection
Data Science - Part III - EDA & Model Selection
 
Ensemble learning
Ensemble learningEnsemble learning
Ensemble learning
 
Clustering &amp; classification
Clustering &amp; classificationClustering &amp; classification
Clustering &amp; classification
 
5.1 mining data streams
5.1 mining data streams5.1 mining data streams
5.1 mining data streams
 
Lect12 graph mining
Lect12 graph miningLect12 graph mining
Lect12 graph mining
 
Topological Sorting
Topological SortingTopological Sorting
Topological Sorting
 
Metaheuristics
MetaheuristicsMetaheuristics
Metaheuristics
 
Design and analysis of algorithms
Design and analysis of algorithmsDesign and analysis of algorithms
Design and analysis of algorithms
 
Supervised learning and Unsupervised learning
Supervised learning and Unsupervised learning Supervised learning and Unsupervised learning
Supervised learning and Unsupervised learning
 
2.2 decision tree
2.2 decision tree2.2 decision tree
2.2 decision tree
 

Semelhante a Machine Learning in 5 Minutes— Classification

It's Not Magic - Explaining classification algorithms
It's Not Magic - Explaining classification algorithmsIt's Not Magic - Explaining classification algorithms
It's Not Magic - Explaining classification algorithmsBrian Lange
 
A Scala Corrections Library
A Scala Corrections LibraryA Scala Corrections Library
A Scala Corrections LibraryPaul Phillips
 
Barga Data Science lecture 7
Barga Data Science lecture 7Barga Data Science lecture 7
Barga Data Science lecture 7Roger Barga
 
Understanding Basics of Machine Learning
Understanding Basics of Machine LearningUnderstanding Basics of Machine Learning
Understanding Basics of Machine LearningPranav Ainavolu
 
Efficient Simplification: The (im)possibilities
Efficient Simplification: The (im)possibilitiesEfficient Simplification: The (im)possibilities
Efficient Simplification: The (im)possibilitiesNeeldhara Misra
 
Geneticalgorithms 100403002207-phpapp02
Geneticalgorithms 100403002207-phpapp02Geneticalgorithms 100403002207-phpapp02
Geneticalgorithms 100403002207-phpapp02Amna Saeed
 
Declarative Syntax Definition - Pretty Printing
Declarative Syntax Definition - Pretty PrintingDeclarative Syntax Definition - Pretty Printing
Declarative Syntax Definition - Pretty PrintingGuido Wachsmuth
 
07-Classification.pptx
07-Classification.pptx07-Classification.pptx
07-Classification.pptxShree Shree
 
An Introduction To Python - Variables, Math
An Introduction To Python - Variables, MathAn Introduction To Python - Variables, Math
An Introduction To Python - Variables, MathBlue Elephant Consulting
 
DutchMLSchool. Logistic Regression, Deepnets, Time Series
DutchMLSchool. Logistic Regression, Deepnets, Time SeriesDutchMLSchool. Logistic Regression, Deepnets, Time Series
DutchMLSchool. Logistic Regression, Deepnets, Time SeriesBigML, Inc
 
What's "For Free" on Craigslist?
What's "For Free" on Craigslist? What's "For Free" on Craigslist?
What's "For Free" on Craigslist? Josh Mayer
 

Semelhante a Machine Learning in 5 Minutes— Classification (12)

It's Not Magic - Explaining classification algorithms
It's Not Magic - Explaining classification algorithmsIt's Not Magic - Explaining classification algorithms
It's Not Magic - Explaining classification algorithms
 
Mathematical reasoning
Mathematical reasoningMathematical reasoning
Mathematical reasoning
 
A Scala Corrections Library
A Scala Corrections LibraryA Scala Corrections Library
A Scala Corrections Library
 
Barga Data Science lecture 7
Barga Data Science lecture 7Barga Data Science lecture 7
Barga Data Science lecture 7
 
Understanding Basics of Machine Learning
Understanding Basics of Machine LearningUnderstanding Basics of Machine Learning
Understanding Basics of Machine Learning
 
Efficient Simplification: The (im)possibilities
Efficient Simplification: The (im)possibilitiesEfficient Simplification: The (im)possibilities
Efficient Simplification: The (im)possibilities
 
Geneticalgorithms 100403002207-phpapp02
Geneticalgorithms 100403002207-phpapp02Geneticalgorithms 100403002207-phpapp02
Geneticalgorithms 100403002207-phpapp02
 
Declarative Syntax Definition - Pretty Printing
Declarative Syntax Definition - Pretty PrintingDeclarative Syntax Definition - Pretty Printing
Declarative Syntax Definition - Pretty Printing
 
07-Classification.pptx
07-Classification.pptx07-Classification.pptx
07-Classification.pptx
 
An Introduction To Python - Variables, Math
An Introduction To Python - Variables, MathAn Introduction To Python - Variables, Math
An Introduction To Python - Variables, Math
 
DutchMLSchool. Logistic Regression, Deepnets, Time Series
DutchMLSchool. Logistic Regression, Deepnets, Time SeriesDutchMLSchool. Logistic Regression, Deepnets, Time Series
DutchMLSchool. Logistic Regression, Deepnets, Time Series
 
What's "For Free" on Craigslist?
What's "For Free" on Craigslist? What's "For Free" on Craigslist?
What's "For Free" on Craigslist?
 

Último

TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...DianaGray10
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilV3cube
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdflior mazor
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 

Último (20)

TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of Brazil
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 

Machine Learning in 5 Minutes— Classification