Lecture 2.
Bayesian Decision Theory
Bayes Decision Rule
Loss function
Decision surface
Multivariate normal and Discriminant Function
Bayes Decision
It is decision making when all the underlying probability
distributions are known.
It is optimal, given that the distributions are known.
For two classes ω1 and ω2 ,
Prior probabilities for an unknown new observation:
P(ω1) : the new observation belongs to class 1
P(ω2) : the new observation belongs to class 2
P(ω1 ) + P(ω2 ) = 1
It reflects our prior knowledge. It is our decision rule
when no features of the new object are available:
Classify as class 1 if P(ω1 ) > P(ω2 )
Bayes Decision
We observe features on each object.
P(x | ω1) and P(x | ω2) : the class-specific densities
The Bayes rule:
P(ωj | x) = P(x | ωj) P(ωj) / P(x), where P(x) = Σ_j P(x | ωj) P(ωj)
Bayes Decision
[Figure: the class-specific densities — the likelihood of observing x given the class label.]
Bayes Decision
[Figure: the posterior probabilities P(ωi | x).]
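As an illustration (not from the original slides), here is a minimal Python sketch of the Bayes rule and decision for one scalar observation; the priors and the Gaussian class-specific densities are made-up values.

```python
import numpy as np

def gaussian_pdf(x, mu, sigma):
    """Univariate normal density N(mu, sigma^2) evaluated at x."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

# Assumed priors and class-specific densities (illustrative values only).
priors = np.array([0.6, 0.4])                 # P(w1), P(w2)
means, sigmas = np.array([0.0, 2.0]), np.array([1.0, 1.0])

x = 1.3                                       # new observation
likelihoods = gaussian_pdf(x, means, sigmas)  # P(x | w_i)
evidence = np.sum(likelihoods * priors)       # P(x)
posteriors = likelihoods * priors / evidence  # P(w_i | x)

# Bayes decision rule: pick the class with the largest posterior.
decision = np.argmax(posteriors) + 1
print(posteriors, "-> decide class", decision)
```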
Loss function
Loss function:
probability statement --> decision.
Some classification mistakes can be more costly than
others.
The set of c classes: {ω1, ω2, …, ωc}
The set of possible actions: {α1, α2, …, αa}
αi : deciding that an observation belongs to ωi
Loss when taking action i given that the observation belongs to
hidden class j: λ(αi | ωj)
Loss function
The expected loss:
Given an observation with feature vector x, the conditional
risk of taking action αi is:
R(αi | x) = Σ_{j=1}^{c} λ(αi | ωj) P(ωj | x)
Our final goal is to minimize the total risk over all x.
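A minimal sketch of minimizing the conditional risk, assuming a hypothetical 2×2 loss matrix and hypothetical posteriors; the action with the smallest R(αi | x) is chosen.

```python
import numpy as np

# Hypothetical loss matrix: lam[i, j] = loss of taking action i when the true class is j.
lam = np.array([[0.0, 2.0],
                [1.0, 0.0]])

posteriors = np.array([0.7, 0.3])   # P(w_j | x) for the observed x

# Conditional risk of each action: R(a_i | x) = sum_j lam[i, j] * P(w_j | x)
risks = lam @ posteriors
best_action = np.argmin(risks)       # minimize the conditional risk
print(risks, "-> take action", best_action + 1)
```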
Loss function
The zero-one loss:
λ(αi, ωj) = 0 if i = j, 1 if i ≠ j,   i, j = 1, …, c
All errors are equally costly.
The conditional risk is:
R(αi | x) = Σ_{j=1}^{c} λ(αi | ωj) P(ωj | x) = Σ_{j≠i} P(ωj | x) = 1 − P(ωi | x)
“The risk corresponding to this loss function is the average
probability of error.”
Loss function
Let λij = λ(αi | ωj) denote the loss for deciding class i
when the true class is j.
In minimizing the risk, we decide class one if
λ11 P(ω1 | x) + λ12 P(ω2 | x) < λ21 P(ω1 | x) + λ22 P(ω2 | x).
Rearranging, we decide class one if
(λ21 − λ11) P(x | ω1) P(ω1) > (λ12 − λ22) P(x | ω2) P(ω2).
Loss function
Let θλ = (λ12 − λ22) / (λ21 − λ11) · P(ω2) / P(ω1).
Then decide ω1 if:  P(x | ω1) / P(x | ω2) > θλ
Example:
λ = [ 0 1 ; 1 0 ]  ⇒  θλ = P(ω2) / P(ω1) = θa
λ = [ 0 2 ; 1 0 ]  ⇒  θλ = 2 P(ω2) / P(ω1) = θb
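The threshold rule can be checked numerically; the sketch below assumes Gaussian class densities, uses the second loss matrix from the example, computes θλ, and compares it with the likelihood ratio.

```python
import numpy as np

def gaussian_pdf(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

# Assumed ingredients (illustrative values).
P1, P2 = 0.5, 0.5
lam = np.array([[0.0, 2.0],    # lam[i-1, j-1] = lambda_ij
                [1.0, 0.0]])

# theta_lambda = (lam12 - lam22) / (lam21 - lam11) * P(w2) / P(w1)
theta = (lam[0, 1] - lam[1, 1]) / (lam[1, 0] - lam[0, 0]) * P2 / P1

x = 0.8
ratio = gaussian_pdf(x, 0.0, 1.0) / gaussian_pdf(x, 2.0, 1.0)   # P(x|w1)/P(x|w2)
print("decide w1" if ratio > theta else "decide w2", ratio, theta)
```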
Loss function
[Figure: the likelihood ratio P(x | ω1) / P(x | ω2) plotted against x, with the
decision threshold θa for the zero-one loss and the larger threshold θb when
misclassifying ω2 is penalized more.]
Discriminant function & decision surface
Features -> discriminant functions gi(x), i=1,…,c
Assign class i if gi(x) > gj(x) ∀j ≠ i
Decision
surface
defined by
gi(x) = gj(x)
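A classifier built from discriminant functions reduces to an argmax over the gi. The sketch below uses two hypothetical one-dimensional discriminants of the form ln P(x | ωi) + ln P(ωi), up to class-independent constants.

```python
import numpy as np

def classify(x, discriminants):
    """Assign the class whose discriminant g_i(x) is largest."""
    scores = np.array([g(x) for g in discriminants])
    return int(np.argmax(scores)) + 1      # class labels 1..c

# Hypothetical one-dimensional example: g_i(x) = ln P(x|w_i) + ln P(w_i) + const.
gs = [lambda x: -0.5 * (x - 0.0) ** 2 + np.log(0.6),
      lambda x: -0.5 * (x - 2.0) ** 2 + np.log(0.4)]
print(classify(1.3, gs))
```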
Decision surface
The discriminant functions help partition the feature space
into c decision regions (not necessarily contiguous). Our
interest is to estimate the boundaries between the regions.
Minimax
Minimizing the
maximum possible
loss.
What happens when
the priors change?
Normal density
Reminder: the covariance matrix is symmetric and
positive semidefinite.
Entropy: a measure of uncertainty.
The normal distribution has the maximum entropy among all
distributions with a given mean and variance.
Reminder of some results for random vectors
Let Σ be a k×k symmetric matrix; then it has k pairs of
eigenvalues and eigenvectors, and Σ can be decomposed as:
Σ = λ1 e1 e1′ + λ2 e2 e2′ + … + λk ek ek′ = P Λ P′
Positive-definite matrix:
x′ Σ x > 0, ∀x ≠ 0  ⟺  λ1 ≥ λ2 ≥ … ≥ λk > 0
Note: x′ Σ x = λ1 (x′ e1)² + … + λk (x′ ek)²
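A quick numerical check of the spectral decomposition with NumPy's symmetric eigensolver, on an assumed 2×2 covariance matrix.

```python
import numpy as np

Sigma = np.array([[2.0, 0.8],
                  [0.8, 1.0]])            # assumed symmetric positive-definite matrix

eigvals, P = np.linalg.eigh(Sigma)        # columns of P are the eigenvectors e_i
Lam = np.diag(eigvals)

# Sigma = P Lam P' = sum_i lambda_i e_i e_i'
print(np.allclose(Sigma, P @ Lam @ P.T))  # True: decomposition reproduces Sigma
print(np.all(eigvals > 0))                # True: Sigma is positive definite
```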
Normal density
Whitening transform:
P : the eigenvector matrix
Λ : the diagonal eigenvalue matrix
Aw = P Λ^(−1/2)
Aw^t Σ Aw = Λ^(−1/2) P^t Σ P Λ^(−1/2) = Λ^(−1/2) P^t (P Λ P^t) P Λ^(−1/2) = I
(using Σ = λ1 e1 e1′ + λ2 e2 e2′ + … + λk ek ek′ = P Λ P′)
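A sketch of the whitening transform on the same assumed Σ; after the transform the covariance becomes the identity.

```python
import numpy as np

Sigma = np.array([[2.0, 0.8],
                  [0.8, 1.0]])

eigvals, P = np.linalg.eigh(Sigma)
A_w = P @ np.diag(eigvals ** -0.5)        # whitening transform A_w = P Lam^(-1/2)

print(np.allclose(A_w.T @ Sigma @ A_w, np.eye(2)))   # A_w^t Sigma A_w = I
```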
Normal density
To make a minimum-error-rate classification (zero-one loss),
we use the discriminant functions:
gi(x) = ln P(x | ωi) + ln P(ωi)
This is the log of the numerator in the Bayes formula. The
log posterior probability equals it up to an additive constant
that does not depend on the class. The log is used because
we are only comparing the gi's, and log is monotone.
When a normal density is assumed:
P(x | ωi) = N(µi, Σi)
We have:
gi(x) = −½ (x − µi)^t Σi^(−1) (x − µi) − (d/2) ln 2π − ½ ln|Σi| + ln P(ωi)
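A sketch of this general Gaussian discriminant, with assumed means, covariances, and priors for two classes.

```python
import numpy as np

def g_normal(x, mu, Sigma, prior):
    """g_i(x) = ln N(x; mu, Sigma) + ln P(w_i) for a d-dimensional normal class."""
    d = len(mu)
    diff = x - mu
    inv = np.linalg.inv(Sigma)
    log_density = (-0.5 * diff @ inv @ diff
                   - 0.5 * d * np.log(2 * np.pi)
                   - 0.5 * np.log(np.linalg.det(Sigma)))
    return log_density + np.log(prior)

# Assumed two-class example in 2-D.
x = np.array([1.0, 0.5])
g1 = g_normal(x, np.array([0.0, 0.0]), np.eye(2), 0.5)
g2 = g_normal(x, np.array([2.0, 2.0]), np.eye(2), 0.5)
print("class 1" if g1 > g2 else "class 2")
```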
Discriminant function for normal density
(1) Σi = σ²I
Linear discriminant function:
gi(x) = wi^t x + wi0,  with wi = µi / σ²  and  wi0 = −µi^t µi / (2σ²) + ln P(ωi)
Note: the blue boxes on the slide mark terms that are constant
across classes and therefore irrelevant to the comparison.
Discriminant function for normal density
The decision surface is where gi(x) = gj(x).
With equal priors, x0 is the midpoint between the two
means.
The decision surface is a hyperplane, perpendicular to the
line between the means.
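A sketch of the case (1) boundary, using the standard textbook expressions (restated here, not shown on the slides): w = µi − µj and x0 = ½(µi + µj) − σ² / ‖µi − µj‖² · ln[P(ωi)/P(ωj)] · (µi − µj). With unequal priors, x0 moves toward the less likely mean, as noted a few slides below.

```python
import numpy as np

# Assumed parameters for two classes with Sigma_i = sigma^2 * I.
mu1, mu2 = np.array([0.0, 0.0]), np.array([2.0, 0.0])
sigma2 = 1.0
P1, P2 = 0.7, 0.3

w = mu1 - mu2
x0 = 0.5 * (mu1 + mu2) - sigma2 / np.dot(w, w) * np.log(P1 / P2) * (mu1 - mu2)

# Decision surface: w^t (x - x0) = 0; classify by the sign of w^t (x - x0).
x = np.array([0.9, 0.2])
print("class 1" if w @ (x - x0) > 0 else "class 2", "boundary point x0 =", x0)
```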
Discriminant function for normal density
“Linear machine”: the decision surfaces are hyperplanes.
Discriminant function for normal density
With unequal prior
probabilities, the
decision boundary
shifts to the less likely
mean.
Discriminant function for normal density
(2) Σi = Σ
Discriminant function for normal density
Set:
gi(x) = wi^t x + wi0,  with wi = Σ^(−1) µi  and  wi0 = −½ µi^t Σ^(−1) µi + ln P(ωi)
The decision boundary between ωi and ωj is the hyperplane:
w^t (x − x0) = 0,  with w = Σ^(−1) (µi − µj)
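A sketch of the shared-covariance (case 2) discriminant with assumed parameters; only the linear weights differ between classes.

```python
import numpy as np

def linear_discriminant(mu, Sigma, prior):
    """Return (w_i, w_i0) for g_i(x) = w_i^t x + w_i0 when all classes share Sigma."""
    inv = np.linalg.inv(Sigma)
    w = inv @ mu
    w0 = -0.5 * mu @ inv @ mu + np.log(prior)
    return w, w0

Sigma = np.array([[1.0, 0.3], [0.3, 2.0]])          # assumed shared covariance
w1, w10 = linear_discriminant(np.array([0.0, 0.0]), Sigma, 0.5)
w2, w20 = linear_discriminant(np.array([2.0, 1.0]), Sigma, 0.5)

x = np.array([1.0, 0.4])
print("class 1" if (w1 @ x + w10) > (w2 @ x + w20) else "class 2")
```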
Discriminant function for normal density
The hyperplane is
generally not
perpendicular to the
line between the
means.
Discriminant function for normal density
(3) Σi is arbitrary
The decision boundaries are hyperquadrics (hyperplanes, pairs of
hyperplanes, hyperspheres, hyperellipsoids, hyperparaboloids, hyperhyperboloids):
gi(x) = x^t Wi x + wi^t x + wi0
Wi = −½ Σi^(−1)
wi = Σi^(−1) µi
wi0 = −½ µi^t Σi^(−1) µi − ½ ln|Σi| + ln P(ωi)
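The quadratic discriminant above transcribes directly into code; the per-class parameters below are assumed for illustration.

```python
import numpy as np

def quadratic_discriminant(x, mu, Sigma, prior):
    """g_i(x) = x^t W_i x + w_i^t x + w_i0 for an arbitrary per-class covariance Sigma_i."""
    inv = np.linalg.inv(Sigma)
    W = -0.5 * inv
    w = inv @ mu
    w0 = -0.5 * mu @ inv @ mu - 0.5 * np.log(np.linalg.det(Sigma)) + np.log(prior)
    return x @ W @ x + w @ x + w0

# Assumed two-class example with different covariances.
x = np.array([0.5, 1.5])
g1 = quadratic_discriminant(x, np.array([0.0, 0.0]), np.array([[1.0, 0.0], [0.0, 1.0]]), 0.5)
g2 = quadratic_discriminant(x, np.array([2.0, 2.0]), np.array([[2.0, 0.5], [0.5, 1.0]]), 0.5)
print("class 1" if g1 > g2 else "class 2")
```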
Discriminant function for normal density
[Figure-only slide.]
Discriminant function for normal density
[Figure-only slide.]
Discriminant function for normal density
Extension to multi-class.
Discriminant function for discrete features
Discrete features: x = [x1, x2, …, xd]^t, xi ∈ {0, 1}
pi = P(xi = 1 | ω1)
qi = P(xi = 1 | ω2)
Assuming conditional independence of the features, the likelihoods are:
P(x | ω1) = Π_{i=1}^{d} pi^{xi} (1 − pi)^{1−xi}
P(x | ω2) = Π_{i=1}^{d} qi^{xi} (1 − qi)^{1−xi}
Discriminant function for discrete features
The discriminant function: g(x) = ln [P(x | ω1) P(ω1)] − ln [P(x | ω2) P(ω2)]; decide ω1 if g(x) > 0.
The likelihood ratio: P(x | ω1) / P(x | ω2) = Π_{i=1}^{d} (pi / qi)^{xi} ((1 − pi) / (1 − qi))^{1−xi}
g(x) = Σ_{i=1}^{d} wi xi + w0
wi = ln [ pi (1 − qi) / ( qi (1 − pi) ) ],  i = 1, …, d
w0 = Σ_{i=1}^{d} ln [ (1 − pi) / (1 − qi) ] + ln [ P(ω1) / P(ω2) ]
Discriminant function for discrete features
So the decision surface is again a hyperplane.
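The binary-feature discriminant also transcribes directly into code; pi, qi, and the priors below are assumed values.

```python
import numpy as np

# Assumed Bernoulli parameters: p_i = P(x_i = 1 | w1), q_i = P(x_i = 1 | w2).
p = np.array([0.8, 0.6, 0.3])
q = np.array([0.4, 0.5, 0.7])
P1, P2 = 0.5, 0.5

# g(x) = sum_i w_i x_i + w_0  (decide w1 if g(x) > 0)
w = np.log(p * (1 - q) / (q * (1 - p)))
w0 = np.sum(np.log((1 - p) / (1 - q))) + np.log(P1 / P2)

x = np.array([1, 0, 0])        # an observed binary feature vector
g = w @ x + w0
print("class 1" if g > 0 else "class 2", g)
```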
Optimality
Consider a two-class case.
Two ways to make a mistake in the classification:
misclassifying an observation from class 2 as class 1;
misclassifying an observation from class 1 as class 2.
The feature space is partitioned into two regions by any
classifier: R1 and R2
Optimality
P(error) = P(x ∈ R2, ω1) + P(x ∈ R1, ω2)
= ∫_{R2} P(x | ω1) P(ω1) dx + ∫_{R1} P(x | ω2) P(ω2) dx
Optimality
In the multi-class case, there are many ways to make
mistakes, so it is easier to work with the probability of correct
classification.
The Bayes classifier maximizes P(correct); any other partitioning
yields a probability of error that is at least as high.
This result does not depend on the form of the underlying
distributions.
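As a numerical illustration of this optimality claim (not part of the slides), the sketch below estimates P(error) by simulation for an assumed one-dimensional two-Gaussian problem, comparing the Bayes threshold with a shifted one; up to sampling noise, the Bayes rule should never do worse.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed two-class 1-D problem with equal priors.
mu1, mu2, sigma, P1 = 0.0, 2.0, 1.0, 0.5
n = 200_000
labels = rng.random(n) < P1                     # True -> the sample comes from class 1
x = np.where(labels, rng.normal(mu1, sigma, n), rng.normal(mu2, sigma, n))

def error_rate(threshold):
    """Classify as class 1 when x < threshold and compare with the true labels."""
    decide_w1 = x < threshold
    return np.mean(decide_w1 != labels)

bayes_threshold = 0.5 * (mu1 + mu2)             # Bayes boundary for equal priors
print(error_rate(bayes_threshold), error_rate(bayes_threshold + 0.5))
```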