Introduction to
Machine Learning
(5 ECTS)
Giovanni Di Liberto
Asst. Prof. in Intelligent Systems, SCSS
Room G.15, O’Reilly Institute ©Trinity College Dublin
Trinity College Dublin, The University of Dublin
Overview previous lecture
• Inspecting the data
• Data visualisation
• Descriptive and inferential statistics
• Central limit theorem
• Correlation
Overview lecture
• More about correlation
• Classification – a bit of theory
• Binary classification
• Baseline
• Multiclass
• This is not in the test!
Why is the Normal distribution so important?
Central Limit Theorem
https://towardsdatascience.com/central-limit-theorem-a-real-life-application-f638657686e1
A core theorem for statistics and
statistical inference
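[Figure: subsamples drawn from a population]
Let’s take the mean value of each subsample. The distribution of these sample means is the sampling distribution. The CLT tells us that, when the subsamples are large enough, this distribution is approximately normal, even if the population itself is not normally distributed
Sampling in practice: e.g., elections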
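A small simulation can make the CLT statement concrete. The sketch below is only an illustration: the exponential population, the subsample size of 50 and the 2,000 subsamples are assumptions chosen for the example, not values from the lecture.

```python
import random
import statistics

def sample_means(population, sample_size, n_samples, rng):
    """Draw n_samples subsamples (with replacement) and return each one's mean."""
    return [
        statistics.fmean(rng.choices(population, k=sample_size))
        for _ in range(n_samples)
    ]

rng = random.Random(0)  # fixed seed so the run is reproducible

# Hypothetical, clearly non-normal population: exponential values.
population = [rng.expovariate(1.0) for _ in range(20_000)]
pop_mean = statistics.fmean(population)
pop_std = statistics.pstdev(population)

means = sample_means(population, sample_size=50, n_samples=2_000, rng=rng)

mean_of_means = statistics.fmean(means)
std_of_means = statistics.stdev(means)
expected_std = pop_std / 50 ** 0.5  # standard error: population std / sqrt(sample size)

# Share of sample means within one standard error of the population mean;
# for a normal distribution this share is about 0.68.
within_one_se = sum(abs(m - pop_mean) <= expected_std for m in means) / len(means)

print(f"population mean {pop_mean:.3f}, mean of sample means {mean_of_means:.3f}")
print(f"std of sample means {std_of_means:.3f}, predicted {expected_std:.3f}")
print(f"share within one standard error: {within_one_se:.2f}")
```

The sample means cluster around the population mean with a spread close to the predicted standard error, although the population itself is strongly skewed.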
Correlation coefficient
If y tends to be above its mean when x is above its mean (and below its mean when x is below), the correlation is positive. If y tends to be below its mean when x is above its mean, the correlation is negative.
Example: the higher you go on a mountain, the colder it usually gets (negative correlation between altitude and temperature)
Correlation coefficient
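Pearson’s linear correlation vs. Spearman’s rank correlation
Original data:
x   2  5  4  1
y   3  1  6  3
Let’s sort x:
x   1  2  4  5
y   3  3  6  1
Ranks of the original data (r = rank of x, s = rank of y; the two tied y values share a rank):
r   2  4  3  1
s   2  1  3  2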
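The two coefficients can be computed by hand for the small x and y example above. This is a minimal sketch with one labelled assumption: tied values receive the average of their ranks (a common convention), whereas the slide gives both tied y values rank 2. For this particular data set both conventions lead to the same Spearman coefficient.

```python
from math import sqrt

def pearson(a, b):
    """Pearson's r: co-variation of a and b divided by the product of their spreads."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    da = [v - mean_a for v in a]
    db = [v - mean_b for v in b]
    cov = sum(p * q for p, q in zip(da, db))
    return cov / sqrt(sum(p * p for p in da) * sum(q * q for q in db))

def average_ranks(values):
    """Rank from 1 (smallest); tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of the 1-based positions i..j
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def spearman(a, b):
    """Spearman's rho: Pearson's r computed on the ranks instead of the raw values."""
    return pearson(average_ranks(a), average_ranks(b))

x = [2, 5, 4, 1]
y = [3, 1, 6, 3]

r_pearson = pearson(x, y)
rho_spearman = spearman(x, y)
print(f"ranks of x: {average_ranks(x)}, ranks of y: {average_ranks(y)}")
print(f"Pearson r = {r_pearson:.3f}, Spearman rho = {rho_spearman:.3f}")
```

Both coefficients are negative here, but Spearman's is larger in magnitude because it only looks at the ordering of the values, not at their distances.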
Correlation is NOT causation!
NEGATIVE CORRELATION (the x-axis was flipped here)
Correlation script (see Blackboard week 4) –
let’s play with it
In brief
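Correlation coefficient (r-value): how strong is the (linear) relationship, and in which direction?
Correlation statistical significance (p-value): how likely is it that a correlation at least this strong would appear by chance if there were no real relationship? The smaller the p-value, the more confident we are that the correlation is not there by chance
(both the correlation strength and the number of data-points affect the p-value)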
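One way to see why the number of data-points matters is a permutation test: shuffle y many times and count how often a shuffled data set shows a correlation at least as strong as the observed one. The sketch below is an illustration under assumptions: the data are made up, and the permutation test is a stand-in for whatever test the course script uses. The same pattern (r = 0.8) is tested with 5 points and with 30 points.

```python
import random
from math import sqrt

def pearson(a, b):
    """Pearson's linear correlation coefficient."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    da = [v - mean_a for v in a]
    db = [v - mean_b for v in b]
    cov = sum(p * q for p, q in zip(da, db))
    return cov / sqrt(sum(p * p for p in da) * sum(q * q for q in db))

def permutation_p_value(x, y, n_perm=2000, seed=0):
    """Share of shuffles whose |r| is at least as large as the observed |r|."""
    rng = random.Random(seed)
    observed = abs(pearson(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any real link between x and y
        if abs(pearson(x, shuffled)) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)  # +1 so the estimate is never exactly 0

# Hypothetical data: the same pattern, once with 5 points and once repeated 6 times.
x_small, y_small = [1, 2, 3, 4, 5], [2, 1, 4, 3, 5]
x_large, y_large = x_small * 6, y_small * 6

r_small, r_large = pearson(x_small, y_small), pearson(x_large, y_large)
p_small = permutation_p_value(x_small, y_small)
p_large = permutation_p_value(x_large, y_large)

print(f"n = 5:  r = {r_small:.2f}, p = {p_small:.4f}")
print(f"n = 30: r = {r_large:.2f}, p = {p_large:.4f}")
```

The r-value is identical in both cases, yet the p-value is much smaller with 30 points, because a correlation this strong is far harder to obtain by chance with more data.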
Classification
• Supervised learning
• Spam filter example
• Binary classification (2 classes: “spam” vs. “not spam”)
“Hands-On Machine Learning with Scikit-Learn,
Keras, and TensorFlow”, Aurélien Géron, 2019
Classification
• What could our features be?
• Particular keywords (e.g., “Dear Respected Dr., greetings”)
• Particular senders
• Email sent simultaneously to thousands of people
• Each of those features is insufficient on its own (e.g., an email sent to thousands of people is not necessarily spam)
• Building an ML model based on all those features simultaneously would be much better
• Combine them somehow. Features that are more useful for the classification get a higher classification weight
Classification
For example, using only one feature: the number of possible spam keywords, n
[Figure: the n axis split at a threshold nk; emails with n > nk are classified as spam, emails with n < nk as not spam]

Email ID    Number of spam keywords
1           3
2           2
3           25
4           3
…           …
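The one-feature rule above can be written in a few lines. This is a minimal sketch: the keyword counts come from the table, while the threshold value of 10 and the function name are assumptions made for illustration.

```python
def classify_by_keyword_count(n, n_k):
    """Label an email 'spam' when its keyword count n exceeds the threshold n_k.

    The case n == n_k is not defined on the slide; here it counts as 'not spam'.
    """
    return "spam" if n > n_k else "not spam"

# Keyword counts from the table, keyed by email ID.
keyword_counts = {1: 3, 2: 2, 3: 25, 4: 3}
N_K = 10  # hypothetical threshold; in practice it would be chosen using labelled emails

labels = {email_id: classify_by_keyword_count(n, N_K)
          for email_id, n in keyword_counts.items()}
print(labels)
```

With this threshold only email 3 is flagged. Moving nk up or down trades missed spam against legitimate emails flagged by mistake.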
Classification
“Hands-On Machine Learning with Scikit-Learn,
Keras, and TensorFlow”, Aurélien Géron, 2019
For example, two features can be used simultaneously to identify a better classification boundary than with either feature individually
But how do we do that? ML!
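A toy example can show why two features together may beat either one alone. Everything in this sketch is hypothetical: the six data points, the labels and the combined rule "f1 + f2 > 1" were picked by hand for illustration, whereas an ML method would learn such a boundary from data.

```python
def accuracy(predictions, labels):
    """Share of predictions that match the true labels."""
    return sum(p == t for p, t in zip(predictions, labels)) / len(labels)

def best_single_feature_accuracy(values, labels):
    """Best accuracy of any rule of the form 'predict 1 when value > threshold'."""
    cuts = sorted(set(values))
    thresholds = ([cuts[0] - 1]
                  + [(a + b) / 2 for a, b in zip(cuts, cuts[1:])]
                  + [cuts[-1] + 1])
    return max(accuracy([int(v > t) for v in values], labels) for t in thresholds)

# Hypothetical toy data: (feature 1, feature 2) per email, label 1 = spam.
emails = [(0.9, 0.3), (0.3, 0.9), (0.6, 0.2), (0.2, 0.6), (0.8, 0.8), (0.1, 0.1)]
labels = [1, 1, 0, 0, 1, 0]

f1 = [e[0] for e in emails]
f2 = [e[1] for e in emails]

acc_f1 = best_single_feature_accuracy(f1, labels)
acc_f2 = best_single_feature_accuracy(f2, labels)
# A boundary that uses both features at once: spam when f1 + f2 > 1.
acc_both = accuracy([int(a + b > 1.0) for a, b in emails], labels)

print(f"feature 1 alone: {acc_f1:.2f}, feature 2 alone: {acc_f2:.2f}, both: {acc_both:.2f}")
```

No threshold on a single feature classifies all six emails correctly, while the diagonal boundary that combines both features does.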
Classification
“Hands-On Machine Learning with Scikit-Learn,
Keras, and TensorFlow”, Aurélien Géron, 2019
[Figure: the same dataset of handwritten digit images, shown with two different sets of class labels]
Binary classification task: is this a number five or not?
Class: 1 0 0 0 0
Multiclass classification task: what digit is this?
Class: 5 0 4 1 9
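The binary labels can be derived directly from the multiclass ones. A minimal sketch, using the five class values shown on the slide (the variable names are illustrative):

```python
digit_labels = [5, 0, 4, 1, 9]                 # multiclass target: what digit is this?
is_five = [int(d == 5) for d in digit_labels]  # binary target: is this a five or not?

print(is_five)
print(f"{len(set(digit_labels))} classes in the multiclass task, "
      f"{len(set(is_five))} in the binary task")
```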
Classification
“Hands-On Machine Learning with Scikit-Learn,
Keras, and TensorFlow”, Aurélien Géron, 2019
Features
- Each pixel is a feature.
- For example, 64x64 pixels would mean 4096 features.
- Each feature is a number from 0 to 1 (greyscale),
where 1 is white, 0 is black, and in-between there are
various shades of grey.
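Turning an image into features amounts to flattening it into one long list of pixel values. The sketch below assumes a made-up 64x64 image (black background with a white square) stored as a list of rows. It is not the course dataset.

```python
HEIGHT, WIDTH = 64, 64

# Hypothetical greyscale image: black background (0.0) with a white square (1.0).
image = [[0.0] * WIDTH for _ in range(HEIGHT)]
for row in range(24, 40):
    for col in range(24, 40):
        image[row][col] = 1.0

# Flatten row by row: one feature per pixel.
features = [pixel for row in image for pixel in row]

print(len(features))  # number of features for a 64x64 image
```

The flattened list has 64 * 64 = 4096 entries, each between 0 (black) and 1 (white).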
Classification
“Hands-On Machine Learning with Scikit-Learn,
Keras, and TensorFlow”, Aurélien Géron, 2019
- A LINEAR classifier determines whether the class should be 1 or 0 by computing a linear combination of all features and comparing it with a threshold
- A linear combination is a weighted sum of all features
- E.g., weight1*pixel1 + weight2*pixel2 + …
- The weights are chosen to maximise the classification accuracy on the training data
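The weighted sum can be sketched as follows. The four-pixel "images", the weights and the bias are made-up values for illustration. A trained classifier would learn the weights from labelled data, and using a bias term as the threshold is an assumption of this sketch.

```python
def linear_classify(features, weights, bias=0.0):
    """Return 1 when the weighted sum of the features plus the bias is positive, else 0."""
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return int(score > 0)

# Hypothetical weights for a tiny 4-pixel image (a real model would have one per pixel).
weights = [2.0, -1.0, 1.0, -3.0]
bias = -2.0  # shifts the decision threshold

image_a = [1.0, 0.0, 0.5, 0.0]  # score = 2.0 + 0.5 - 2.0 = 0.5
image_b = [0.0, 1.0, 0.0, 1.0]  # score = -1.0 - 3.0 - 2.0 = -6.0

print(linear_classify(image_a, weights, bias), linear_classify(image_b, weights, bias))
```

Pixels with large positive weights push the decision towards class 1, pixels with negative weights push it towards class 0.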
How do we even plot 4096 features??
Classification
Hypothesis-driven feature extraction? E.g., the average pixel value in selected areas of the image
[Figure: two selected image areas, labelled Feature 1 and Feature 2]
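Averaging over selected areas reduces thousands of pixels to a handful of features that can be plotted. In this sketch the 8x8 image and the two areas (upper half and middle band) are assumptions, since the slide does not say which areas it uses.

```python
def region_mean(image, top, bottom, left, right):
    """Average pixel value over rows top..bottom-1 and columns left..right-1."""
    pixels = [image[r][c] for r in range(top, bottom) for c in range(left, right)]
    return sum(pixels) / len(pixels)

# Hypothetical 8x8 greyscale image: top half white (1.0), bottom half black (0.0).
image = [[1.0] * 8 for _ in range(4)] + [[0.0] * 8 for _ in range(4)]

# Two hand-picked areas, chosen only for illustration.
feature_1 = region_mean(image, 0, 4, 0, 8)  # upper half of the image
feature_2 = region_mean(image, 2, 6, 0, 8)  # middle band of the image

print(feature_1, feature_2)
```

Each image is now described by two numbers instead of one number per pixel, so it can be drawn as a point in a 2D scatter plot.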
Classification
[Figure: a linear classification boundary separating the "Five" examples from the "Not a five" examples]
Classification
“Hands-On Machine Learning with Scikit-Learn,
Keras, and TensorFlow”, Aurélien Géron, 2019
X is the data matrix (features)
y is the class (‘five’ or ‘not a five’)
Confusion matrix
“Hands-On Machine Learning with Scikit-Learn,
Keras, and TensorFlow”, Aurélien Géron, 2019
Prediction Accuracy = (3+5)/(3+5+1+2) = 8/11 ~ 0.73
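The accuracy above can be reproduced by counting the four cells of a confusion matrix. The slide text does not say which cell holds which count, so the assignment below (3 true positives, 5 true negatives, 1 false positive, 2 false negatives) is an assumption. The accuracy is the same for any assignment that keeps 3 and 5 on the diagonal.

```python
def confusion_counts(y_true, y_pred):
    """Count true positives, true negatives, false positives and false negatives."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    return tp, tn, fp, fn

# Hypothetical labels arranged to give 3 TP, 5 TN, 1 FP and 2 FN.
y_true = [1] * 3 + [0] * 5 + [0] * 1 + [1] * 2
y_pred = [1] * 3 + [0] * 5 + [1] * 1 + [0] * 2

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / (tp + tn + fp + fn)
print(f"TP={tp} TN={tn} FP={fp} FN={fn} accuracy={accuracy:.2f}")
```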
Confusion matrix
“Hands-On Machine Learning with Scikit-Learn,
Keras, and TensorFlow”, Aurélien Géron, 2019
S = Sick
H = Healthy
Not very good for diagnosis: ideally, recall should be almost 100%. We don’t want to miss diagnosing a person who is sick.
[Figure: healthy (H) and sick (S) people with the model's predictions; annotations: "4 out of 7" and "4 out of 6"]
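Recall and precision can be computed from the confusion-matrix counts. The sketch below reads "4 out of 7" as recall (4 of 7 sick people detected) and "4 out of 6" as precision (4 of 6 flagged people are actually sick). That reading is an assumption about the figure, not something the slide text states.

```python
def recall(tp, fn):
    """Share of the truly sick people that the model detects."""
    return tp / (tp + fn)

def precision(tp, fp):
    """Share of the people flagged as sick who really are sick."""
    return tp / (tp + fp)

# Assumed reading of the figure: 7 sick people, 4 detected; 6 flagged, 4 of them sick.
tp, fn, fp = 4, 3, 2

model_recall = recall(tp, fn)
model_precision = precision(tp, fp)
print(f"recall = {model_recall:.2f}, precision = {model_precision:.2f}")

# A model that detects 6 of the 7 sick people has a much higher recall.
better_recall = recall(6, 1)
print(f"better model recall = {better_recall:.2f}")
```

For diagnosis, a missed sick person (false negative) is usually the costlier error, which is why recall is the number to watch here.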
Confusion matrix
“Hands-On Machine Learning with Scikit-Learn,
Keras, and TensorFlow”, Aurélien Géron, 2019
A better model:
[Figure: the same healthy (H) and sick (S) people with the predictions of a better model; annotations: "6 out of 7" and "4 out of 6"]
What can go wrong? What’s a bad prediction?
Predictive models make errors! We want to minimise the error rate, but we also have to decide what kind of errors are acceptable. Do we prefer false positives (e.g., a red flag for a healthy person) or false negatives (e.g., not detecting a person who is sick)?
The same question applies beyond classification. For example, an overbooking system (e.g., Ryanair vs. American Airlines).
Goal? To maximise profit!
Let’s simplify this as much as we can:
Profit ≈ Total revenue - costs
Profit ≈ Σ(ticketCost + onboardPurchases) - Σ(fixed costs + sum(costPerPassenger) + ?)
Profit ≈ Σ(ticketCost + onboardPurchases) - Σ(fixed costs + sum(costPerPassenger) + costVouchers)
Trade-off / optimise profit: Probability of error vs. probability of having empty seats
We are going to make mistakes and pay vouchers. The question is how many we want to make.
In that case, the goal was to maximise the profit.
We could have other goals, e.g., minimising pollution or minimising the time cars spend on the road.
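The overbooking trade-off can be sketched numerically. All numbers below (100 seats, a 90% show-up rate, the prices and costs) are hypothetical, passengers are assumed to show up independently of each other, and onboard purchases are left out to keep the example short.

```python
from math import comb

def expected_profit(tickets_sold, capacity, p_show, ticket_price,
                    cost_per_passenger, voucher_cost, fixed_costs):
    """Expected profit when each ticket holder shows up independently with p_show."""
    total = 0.0
    for shows in range(tickets_sold + 1):
        # Binomial probability that exactly `shows` ticket holders turn up.
        prob = (comb(tickets_sold, shows)
                * p_show ** shows * (1 - p_show) ** (tickets_sold - shows))
        boarded = min(shows, capacity)
        bumped = shows - boarded  # passengers without a seat receive a voucher
        profit = (ticket_price * tickets_sold - fixed_costs
                  - cost_per_passenger * boarded - voucher_cost * bumped)
        total += prob * profit
    return total

# Hypothetical flight parameters.
params = dict(capacity=100, p_show=0.9, ticket_price=100.0,
              cost_per_passenger=10.0, voucher_cost=400.0, fixed_costs=5000.0)

profits = {n: expected_profit(n, **params) for n in range(100, 131)}
best_n = max(profits, key=profits.get)

print(f"no overbooking (100 tickets): expected profit {profits[100]:.0f}")
print(f"best number of tickets: {best_n}, expected profit {profits[best_n]:.0f}")
```

Selling a few tickets more than there are seats raises the expected profit, because empty seats are more likely than bumped passengers. Selling too many makes the vouchers dominate. Changing the goal (e.g., penalising bumped passengers more heavily) only changes the function being optimised.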
