WELCOME TO MY
PRESENTATION
ON
STATISTICAL DISTANCE
Md. Menhazul Abedin
M.Sc. Student
Dept. of Statistics
Rajshahi University
Mob: 01751385142
Email: menhaz70@gmail.com
Objectives
• To understand the meaning of statistical distance and its relation to, and difference from, the ordinary (Euclidean) distance
Content
Definition of Euclidean distance
Concept & intuition of statistical distance
Definition of Statistical distance
Necessity of statistical distance
Concept of Mahalanobis distance (population & sample)
Distribution of Mahalanobis distance
Mahalanobis distance in R
Acknowledgement
Euclidean Distance from origin
(0,0)
(X,Y)
X
Y
Euclidean Distance
P(X,Y)
Y
O (0,0) X
By Pythagoras,
d(O, P) = √(X² + Y²)
Euclidean Distance
In each picture we see two specific points. Our problem is to determine the distance between the two points. But how?
Assume that the pictures are placed in a two-dimensional space and the points are joined by a straight line.
Let the 1st point be (x₁, y₁) and the 2nd point be (x₂, y₂); then the distance is
D = √((x₁ − x₂)² + (y₁ − y₂)²)
What happens when the dimension is three?
Distance in ℝ³
Distance is given by
• For points (x₁, x₂, x₃) and (y₁, y₂, y₃):
d = √((x₁ − y₁)² + (x₂ − y₂)² + (x₃ − y₃)²)
For n dimensions it can be written as the following expression, named the Euclidean distance:
P = (x₁, x₂, …, xₚ), Q = (y₁, y₂, …, yₚ)
d(P, Q) = √((x₁ − y₁)² + (x₂ − y₂)² + … + (xₚ − yₚ)²)
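As a quick numeric illustration (a minimal R sketch added here, not part of the original deck; the points are arbitrary):
P <- c(1, 2, 3)
Q <- c(4, 6, 3)
sqrt(sum((P - Q)^2)) ## Euclidean distance: 5
dist(rbind(P, Q)) ## the same result via R's built-in dist()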
Properties of Euclidean Distance and
Mathematical Distance
• The usual human concept of distance is Euclidean distance.
• Each coordinate contributes equally to the distance:
P = (x₁, x₂, …, xₚ), Q = (y₁, y₂, …, yₚ)
d(P, Q) = √((x₁ − y₁)² + (x₂ − y₂)² + … + (xₚ − yₚ)²)
Mathematicians, generalizing its three properties, define distance on any set:
1) d(P, Q) = d(Q, P)
2) d(P, Q) = 0 if and only if P = Q
3) d(P, Q) ≤ d(P, R) + d(R, Q) for all R (the triangle inequality)
[Figure: triangle with vertices P(X₁, Y₁), Q(X₂, Y₂), R(Z₁, Z₂)]
Taxicab Distance: Notion
[Figure: grid paths between two corners; red, blue, and yellow Manhattan routes and the green straight-line diagonal]
• The Manhattan distance is the simple sum of the horizontal and vertical components, whereas the diagonal distance can be computed by applying the Pythagorean theorem.
• Red: Manhattan distance.
• Green: diagonal, straight-line distance.
• Blue, yellow: equivalent Manhattan distances.
• Manhattan distance: 12 units
• Diagonal or straight-line distance (Euclidean distance): √(6² + 6²) = 6√2 ≈ 8.49 units
We observe that the Euclidean distance is less than the Manhattan distance.
Taxicab/Manhattan Distance: Definition
[Figure: points (p₁, p₂) and (q₁, q₂) with horizontal leg │p₁ − q₁│ and vertical leg │p₂ − q₂│]
Manhattan Distance
• The taxicab distance between
(p1,p2) and (q1,q2)
is │p1-q1│+│p2-q2│
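A one-line R check of this definition (an added sketch, reusing the 6-by-6-block example from the earlier slide):
p <- c(0, 0); q <- c(6, 6)
sum(abs(p - q)) ## Manhattan distance: 12
sqrt(sum((p - q)^2)) ## Euclidean distance: 6*sqrt(2), about 8.49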
Relationship between Manhattan &
Euclidean distance.
[Figure: street grid; the route from A to C is 7 blocks, the route from A to B is 6 blocks]
Relationship between Manhattan &
Euclidean distance.
• It now seems that the distance from A to C is 7 blocks,
while the distance from A to B is 6 blocks.
• Unless we choose to go off-road, B is now closer to A
than C.
• Taxicab distance is sometimes equal to Euclidean
distance, but otherwise it is greater than Euclidean
distance.
Euclidean distance ≤ Taxicab distance
Is it always true? Does it hold for n dimensions?
Proof……..
Absolute values guarantee non-negative value
Addition property of inequality
Continued………..
For high dimensions
• It holds in the high-dimensional case:
Σ(xᵢ − yᵢ)² ≤ Σ│xᵢ − yᵢ│² + 2Σ(i<j)│xᵢ − yᵢ││xⱼ − yⱼ│ = (Σ│xᵢ − yᵢ│)²
which implies
√(Σ(xᵢ − yᵢ)²) ≤ Σ│xᵢ − yᵢ│, i.e. d_E ≤ d_T
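A quick numerical sanity check in R (an added sketch; the points are randomly generated):
set.seed(1)
x <- rnorm(10); y <- rnorm(10)
dE <- sqrt(sum((x - y)^2)) ## Euclidean distance
dT <- sum(abs(x - y)) ## taxicab/Manhattan distance
dE <= dT ## TRUE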
Statistical Distance
• Weight coordinates subject to a great deal of
variability less heavily than those that are not
highly variable
[Figure: two points at the same Euclidean distance from the origin. Who is nearer to the data set, if it were a point?]
• Here the variability along the x₁ axis is greater than the variability along the x₂ axis.
 So is the same distance from the origin meaningful?
Ans: No.
But how do we take the different variability into account?
Ans: Give different weights to the axes.
Statistical Distance for Uncorrelated Data
P = (x₁, x₂), O = (0, 0)
Standardize: x₁* = x₁/√s₁₁, x₂* = x₂/√s₂₂ (the weights)
d(O, P) = √(x₁*² + x₂*²) = √(x₁²/s₁₁ + x₂²/s₂₂)
All points that have coordinates (x₁, x₂) and are a constant squared distance c² from the origin must satisfy
x₁²/s₁₁ + x₂²/s₂₂ = c²
But how do we choose c? That is a problem. A common choice: pick c so that 95% of the observations fall inside this area.
Note that s₁₁ > s₂₂ implies 1/s₁₁ < 1/s₂₂, so the less variable coordinate is weighted more heavily.
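A minimal R sketch of this weighted distance for uncorrelated data (the point and the variances below are illustrative values, not from the slides):
x <- c(2, 1) ## a point P = (x1, x2)
s <- c(9, 1) ## sample variances s11, s22 on the two axes
sqrt(sum(x^2 / s)) ## statistical distance from the origin
sqrt(sum(x^2)) ## compare: unweighted Euclidean distance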
Ellipse of Constant Statistical Distance for
Uncorrelated Data
[Figure: ellipse centered at 0 with half-axis c√s₁₁ along x₁ and half-axis c√s₂₂ along x₂]
• This expression can be generalized as the statistical distance from an arbitrary point P = (x₁, x₂) to any fixed point Q = (y₁, y₂):
d(P, Q) = √((x₁ − y₁)²/s₁₁ + (x₂ − y₂)²/s₂₂)
For p dimensions:
d(P, Q) = √((x₁ − y₁)²/s₁₁ + (x₂ − y₂)²/s₂₂ + … + (xₚ − yₚ)²/sₚₚ)
Remark:
1) The distance of P to the origin O is obtained by setting all yᵢ = 0.
2) If all sᵢᵢ are equal, the Euclidean distance formula is appropriate.
Scatter Plot for Correlated Measurements
• How do you measure the statistical distance of the above data set?
• Ans: First make it uncorrelated.
• But why, and how?
• Ans: Rotate the axes keeping the origin fixed.
Scatter Plot for Correlated Measurements
Rotation of axes keeping the origin fixed
[Figure: original axes x₁, x₂ and rotated axes x̃₁, x̃₂ at angle θ, with a point P(x₁, x₂) and construction points O, M, N, Q, R]
x₁ = OM = OR − MR = x̃₁ cos θ − x̃₂ sin θ ……. (i)
x₂ = MP = QR + NP = x̃₁ sin θ + x̃₂ cos θ ……. (ii)
• Solving the above equations for the rotated coordinates gives
x̃₁ = x₁ cos θ + x₂ sin θ
x̃₂ = −x₁ sin θ + x₂ cos θ
Choice of θ
Which θ will you choose? How will you do it?
 Data matrix → centered data matrix → covariance matrix of the data → eigenvectors
θ = angle between the 1st eigenvector and [1,0],
or the angle between the 2nd eigenvector and [0,1]
Why is θ the angle between the 1st eigenvector and [1,0], or the angle between the 2nd eigenvector and [0,1]?
Ans: Let B be a (p × p) positive definite matrix with eigenvalues λ₁ ≥ λ₂ ≥ λ₃ ≥ … ≥ λₚ > 0 and associated normalized eigenvectors e₁, e₂, …, eₚ. Then
max over x ≠ 0 of x′Bx / x′x = λ₁, attained when x = e₁
min over x ≠ 0 of x′Bx / x′x = λₚ, attained when x = eₚ
max over x ⊥ e₁, e₂, …, eₖ of x′Bx / x′x = λₖ₊₁, attained when x = eₖ₊₁, k = 1, 2, …, p − 1.
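A small R check of the first statement (an added illustration; B below is an arbitrary positive definite matrix):
B <- matrix(c(4, 1, 1, 3), 2, 2) ## symmetric positive definite
ev <- eigen(B)
e1 <- ev$vectors[, 1] ## first (normalized) eigenvector, so x'x = 1
as.numeric(t(e1) %*% B %*% e1) ## Rayleigh quotient at e1
ev$values[1] ## equals the largest eigenvalue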
Choice of 𝜃
#### Excercise 16.page(309).Heights in inches (x) &
Weights in pounds(y). An Introduction to Statistics
and Probability M.Nurul Islam #######
x=c(60,60,60,60,62,62,62,64,64,64,66,66,66,66,68,
68,68,70,70,70);x
y=c(115,120,130,125,130,140,120,135,130,145,135
,170,140,155,150,160,175,180,160,175);y
############
plot(x,y)
## (the eigenvector matrix V and the projection of cdata onto it are computed
## below, once the mean-centered data cdata has been constructed)
data=data.frame(x,y);data
as.matrix(data)
colMeans(data)
xmv=c(rep(64.8,20));xmv ### x mean vector
ymv=c(rep(144.5,20));ymv ### y mean vector
meanmatrix=cbind(xmv,ymv);meanmatrix
cdata=data-meanmatrix;cdata
### mean centred data
plot(cdata)
abline(h=0,v=0)
cor(cdata)
• ##################
cov(cdata)
eigen(cov( cdata))
xx1=c(1,0);xx1
xx2=c(0,1);xx2
vv1=eigen(cov(cdata))$vectors[,1];vv1
vv2=eigen(cov(cdata))$vectors[,2];vv2
################
theta=acos(sum(xx1*vv1)/(sqrt(sum(xx1*xx1))*sqrt(sum(vv1*vv1))));theta
theta=acos(sum(xx2*vv2)/(sqrt(sum(xx2*xx2))*sqrt(sum(vv2*vv2))));theta
###############
xx=cdata[,1]*cos(1.41784)+cdata[,2]*sin(1.41784);xx ## 1.41784 = theta found above
yy=-cdata[,1]*sin(1.41784)+cdata[,2]*cos(1.41784);yy
plot(xx,yy)
abline(h=0,v=0)
V=eigen(cov(cdata))$vectors;V
tdata=as.matrix(cdata)%*%V;tdata
### transformed data
cov(tdata)
round(cov(tdata),14)
cor(tdata)
plot(tdata)
abline(h=0,v=0)
round(cor(tdata),16)
• ################ comparison of both methods ############
comparison=tdata-as.matrix(cbind(xx,yy));comparison
round(comparison,4)
########### using package. md from original data #####
md=mahalanobis(data,colMeans(data),cov(data),inverted =F);md
## md =mahalanobis distance
######## mahalanobis distance from transformed data ########
tmd=mahalanobis(tdata,colMeans(tdata),cov(tdata),inverted =F);tmd
###### comparison ############
md-tmd
Mahalanobis distance : Manually
mu=colMeans(tdata);mu
incov=solve(cov(tdata));incov
md1=t(tdata[1,]-mu)%*%incov%*%(tdata[1,]-mu);md1
md2=t(tdata[2,]-mu)%*%incov%*%(tdata[2,]-mu);md2
md3=t(tdata[3,]-mu)%*%incov%*%(tdata[3,]-mu);md3
............. ……………. …………..
md20=t(tdata[20,]-mu)%*%incov%*%(tdata[20,]-mu);md20
The md values from the package and from the manual computation are equal.
tdata
s1=sd(tdata[,1]);s1
s2=sd(tdata[,2]);s2
xstar=c(tdata[,1])/s1;xstar
ystar=c(tdata[,2])/s2;ystar
md1=sqrt((-1.46787309)^2 + (0.1484462)^2);md1
md2=sqrt((-1.22516896 )^2 + ( 0.6020111 )^2);md2
………. ………… ……………..
These are not equal to the distances above. Why?
Because the mean must be taken into account (and note that mahalanobis() returns the squared distance D², not D).
Statistical Distance under Rotated
Coordinate System
O = (0, 0), P = (x̃₁, x̃₂)
d(O, P) = √(x̃₁²/s̃₁₁ + x̃₂²/s̃₂₂)
where the rotated coordinates are
x̃₁ = x₁ cos θ + x₂ sin θ
x̃₂ = −x₁ sin θ + x₂ cos θ
and s̃₁₁, s̃₂₂ are the sample variances along the rotated axes. Expanding,
d(O, P) = √(a₁₁x₁² + 2a₁₂x₁x₂ + a₂₂x₂²)
• After some manipulation this can be written in terms of the original variables, where the weights a₁₁, a₁₂, a₂₂ are determined by θ and the variances s₁₁, s₂₂, s₁₂.
Proof:
• s̃₁₁ = (1/(n−1)) Σ(x̃₁ − mean(x̃₁))²
= (1/(n−1)) Σ((x₁ cos θ + x₂ sin θ) − (x̄₁ cos θ + x̄₂ sin θ))²
= cos²(θ)s₁₁ + 2 sin θ cos θ s₁₂ + sin²(θ)s₂₂
• s̃₂₂ = (1/(n−1)) Σ(x̃₂ − mean(x̃₂))²
= (1/(n−1)) Σ((−x₁ sin θ + x₂ cos θ) − (−x̄₁ sin θ + x̄₂ cos θ))²
= cos²(θ)s₂₂ − 2 sin θ cos θ s₁₂ + sin²(θ)s₁₁
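These identities can be verified numerically in R (an added sketch reusing cdata and the theta value from the earlier slides):
th <- 1.41784 ## theta found from the eigenvectors above
S <- cov(cdata)
cos(th)^2*S[1,1] + 2*sin(th)*cos(th)*S[1,2] + sin(th)^2*S[2,2] ## s~11 via the identity
var(cdata[,1]*cos(th) + cdata[,2]*sin(th)) ## s~11 computed directly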
Continued………….
d(O, P) = √((x₁ cos θ + x₂ sin θ)²/s̃₁₁ + (−x₁ sin θ + x₂ cos θ)²/s̃₂₂)
General Statistical Distance
P = (x₁, x₂, …, xₚ), O = (0, 0, …, 0), Q = (y₁, y₂, …, yₚ)
d(O, P) = √(a₁₁x₁² + a₂₂x₂² + … + aₚₚxₚ² + 2a₁₂x₁x₂ + 2a₁₃x₁x₃ + … + 2aₚ₋₁,ₚ xₚ₋₁xₚ)
d(P, Q) = √(a₁₁(x₁ − y₁)² + a₂₂(x₂ − y₂)² + … + aₚₚ(xₚ − yₚ)² + 2a₁₂(x₁ − y₁)(x₂ − y₂) + 2a₁₃(x₁ − y₁)(x₃ − y₃) + … + 2aₚ₋₁,ₚ(xₚ₋₁ − yₚ₋₁)(xₚ − yₚ))
• The above distances are completely determined by the coefficients (weights) aᵢₖ; i, k = 1, 2, 3, …, p. These can be arranged in a rectangular array (matrix), and this matrix must be symmetric positive definite.
Why positive definite?
Let A be a positive definite matrix. Then A = C′C for some matrix C, so
x′Ax = x′C′Cx = (Cx)′(Cx) = y′y,
which obeys all the distance properties. Hence x′Ax is a (squared) distance, and different choices of A give different distances.
• Why a positive definite matrix?
• Ans: Spectral decomposition. The spectral decomposition of a k×k symmetric matrix A is given by
A = λ₁e₁e₁′ + λ₂e₂e₂′ + … + λₖeₖeₖ′
• where (λᵢ, eᵢ); i = 1, 2, …, k are the pairs of eigenvalues and eigenvectors, and λ₁ ≥ λ₂ ≥ λ₃ ≥ … If A is positive definite, every λᵢ > 0 and A is invertible.
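A short R illustration of the spectral decomposition and the inverse built from it (an added sketch; A is an arbitrary symmetric positive definite matrix):
A <- matrix(c(4, 1, 1, 3), 2, 2)
ev <- eigen(A)
L <- ev$values; E <- ev$vectors
E %*% diag(L) %*% t(E) ## reconstructs A
E %*% diag(1/L) %*% t(E) ## equals solve(A): the inverse swaps each lambda for 1/lambda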
[Figure: scatter of data with principal axes e₁ (eigenvalue λ₁) and e₂ (eigenvalue λ₂)]
• Suppose p = 2. The set of points at constant distance c from the origin satisfies c² = x′Ax. By the spectral decomposition this is an ellipse with half-axis c/√λ₁ along e₁ and half-axis c/√λ₂ along e₂.
Another property is that A⁻¹ = (1/λ₁)e₁e₁′ + (1/λ₂)e₂e₂′, with the same eigenvectors and reciprocal eigenvalues.
Thus the same ellipses can be described through A⁻¹. We use this property in the Mahalanobis distance.
Necessity of Statistical Distance
[Figure: a cluster of points with its center of gravity, a point P in the cluster, the origin O, and another point Q]
• Consider the Euclidean distances from the point Q to the point P and to the origin O.
• Obviously d(Q, P) > d(Q, O).
 But P appears to be more like the points in the cluster than the origin does.
 If we take into account the variability of the points in the cluster and measure distance by statistical distance, then Q will be closer to P than to O.
Mahalanobis distance
• The Mahalanobis distance is a descriptive
statistic that provides a relative measure of a
data point's distance from a common point. It
is a unitless measure introduced by P. C.
Mahalanobis in 1936
Intuition of Mahalanobis Distance
• Recall the equation
d(O, P) = √(x′Ax) ⟹ d²(O, P) = x′Ax
where x = (x₁, x₂)′ and A is the 2×2 weight matrix with entries a₁₁, a₁₂, a₂₁, a₂₂.
Intuition of Mahalanobis Distance
d(O, P) = √(x′Ax), d²(O, P) = x′Ax
where x′ = (x₁, x₂, x₃, ⋯, xₚ) and A = (aᵢₖ) is the p×p weight matrix.
Intuition of Mahalanobis Distance
d²(P, Q) = (x − y)′A(x − y)
where x′ = (x₁, x₂, …, xₚ), y′ = (y₁, y₂, …, yₚ), and A = (aᵢₖ).
Mahalanobis Distance
• Mahalanobis used the inverse of the covariance matrix, Σ⁻¹, instead of A.
• Thus d²(O, P) = x′Σ⁻¹x ……………..(1)
• And he used μ (the center of gravity) instead of y:
d²(P, Q) = (x − μ)′Σ⁻¹(x − μ) ………..(2)
This is the Mahalanobis distance.
Mahalanobis Distance
• The above equations are nothing but
Mahalanobis Distance ……
• For example, suppose we took a single observation from a bivariate population with variables X and Y, whose characteristics (means, variances, and covariance) were given.
• Single observation: X = 410 and Y = 400.
The Mahalanobis distance for that single observation works out to
d = 1.825
• Therefore, our single observation would have
a distance of 1.825 standardized units from
the mean (mean is at X = 500, Y = 500).
• If we took many such observations, graphed them, and colored them according to their Mahalanobis values, we would see the elliptical Mahalanobis regions come out.
• The points are actually distributed along two
primary axes:
If we calculate Mahalanobis distances for each
of these points and shade them according to
their distance value, we see clear elliptical
patterns emerge:
• We can also draw actual ellipses at regions of
constant Mahalanobis values:
[Figure: concentric ellipses of constant Mahalanobis distance containing about 68%, 95%, and 99.7% of the observations]
• Which ellipse do you choose?
Ans: Use the 68-95-99.7 rule (if the data are normal):
1) about two-thirds (68%) of the points should be within 1 unit of the origin (along each axis);
2) about 95% should be within 2 units;
3) about 99.7% should be within 3 units.
Sample Mahalanobis Distance
• The sample Mahalanobis distance is obtained by replacing Σ with S and μ with X̄,
i.e. (X − X̄)′S⁻¹(X − X̄)
For a sample,
(X − X̄)′S⁻¹(X − X̄) ≤ χ²ₚ(α)
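A brief R check of this chi-square rule (an added sketch with simulated bivariate normal data):
set.seed(42)
X <- matrix(rnorm(2000), ncol = 2) ## n = 1000 bivariate observations
d2 <- mahalanobis(X, colMeans(X), cov(X)) ## squared sample Mahalanobis distances
mean(d2 <= qchisq(0.95, df = 2)) ## roughly 0.95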
Distribution of Mahalanobis Distance
Let X₁, X₂, X₃, …, Xₙ be independent observations from any population with mean μ and finite (nonsingular) covariance Σ. Then
 √n(X̄ − μ) is approximately Nₚ(0, Σ), and
 n(X̄ − μ)′S⁻¹(X̄ − μ) is approximately χ²ₚ for n − p large.
This is nothing but the central limit theorem.
Mahalanobis distance in R
• ########### Mahalanobis Distance ##########
• x=rnorm(100);x
• dm=matrix(x,nrow=20,ncol=5,byrow=F);dm ## dm = data matrix
• cm=colMeans(dm);cm ## cm = column means
• cov=cov(dm);cov ## cov = covariance matrix
• incov=solve(cov);incov ## incov = inverse of covariance matrix
Mahalanobis distance in R
• ####### MAHALANOBIS DISTANCE : MANUALLY ######
• @@@ Mahalanobis distance of first observation @@@@@@
• ob1=dm[1,];ob1 ## first observation
• mv1=ob1-cm;mv1 ## deviation of first observation from center of gravity
• md1=t(mv1)%*%incov%*%mv1;md1 ## mahalanobis distance of first observation from center of gravity
Mahalanobis distance in R
• @@@@@@ Mahalanobis distance of second observation @@@@@
• ob2=dm[2,];ob2 ## second observation
• mv2=ob2-cm;mv2 ## deviation of second observation from center of gravity
• md2=t(mv2)%*%incov%*%mv2;md2 ## mahalanobis distance of second observation from center of gravity
................ ……………… …..……………
Mahalanobis distance in R
………....... ……………… ……………
@@@@@ Mahalanobis distance of 20th observation @@@@@
• ob20=dm[20,];ob20 ## 20th observation (the 20th row of dm)
• mv20=ob20-cm;mv20 ## deviation of 20th observation from center of gravity
• md20=t(mv20)%*%incov%*%mv20;md20 ## mahalanobis distance of 20th observation from center of gravity
Mahalanobis distance in R
####### MAHALANOBIS DISTANCE : PACKAGE ########
• md=mahalanobis(dm,cm,cov,inverted=F);md ## md = mahalanobis distance
• md=mahalanobis(dm,cm,cov);md
Another example
• x <- matrix(rnorm(100*3), ncol = 3)
• Sx <- cov(x)
• D2 <- mahalanobis(x, colMeans(x), Sx)
• plot(density(D2, bw = 0.5), main="Squared Mahalanobis distances, n=100, p=3")
• qqplot(qchisq(ppoints(100), df = 3), D2, main = expression("Q-Q plot of Mahalanobis" * ~D^2 * " vs. quantiles of" * ~chi[3]^2))
• abline(0, 1, col = 'gray')
• ?? mahalanobis
Acknowledgement
Prof. Mohammad Nasser
Richard A. Johnson & Dean W. Wichern
& others
THANK YOU
ALL
Necessity of Statistical Distance
• In home: mother
• In mess: female maid
• Student in mess