1. Measures of Dispersion
&
Normal Distribution
By
Dr. Dinesh Kumar Meena, Pharm.D
Ph.D Research Scholar
Department of Pharmacology,
Jawaharlal Institute of Postgraduate Medical Education & Research (JIPMER)
3. Central Tendency
Summarises data with a single value that represents the entire data set
Mean, Median & Mode
This value does not reveal the variability present in the data
Measures of dispersion are used to quantify the variability of the data
4. • The average income of three families (Ram, Rahim
& Maria) is the same
• But there are considerable differences in individual
incomes
(Differences in income are comparatively low in Ram’s
family, higher in Rahim’s family and highest in
Maria’s family)
• An average tells only one aspect of a distribution, i.e.
the representative size of the values
• Dispersion is the extent to which values in a
distribution differ from the average of the
distribution
6. Range
• Range (R) is the difference between the largest value (L) and smallest value
(S) in a distribution. R = L – S
Range of income in
Ram’s family : 18000-12000 = 6000
Rahim’s family : 22000-7000 = 15000
Maria’s family : 50000 - 0 = 50000
• A high value of the range represents high dispersion
Maria’s family > Rahim’s family > Ram’s family
7. • Range is not based on all the values in the data set
• As long as the minimum and maximum values remain unaltered, any change in
other values does not affect the range.
• Range cannot be calculated for an open-ended frequency distribution,
i.e. one in which the lower limit of the lowest
class, the upper limit of the highest class, or both are not
specified.
8. Quartile Deviation
• Entire data is divided into four equal parts, each containing 25% of the values.
• The upper and lower quartiles (Q3 & Q1 respectively) are used to calculate the
interquartile range (Q3 - Q1).
• Half of the interquartile range is called the Quartile Deviation
• Quartile Deviation = (Q3 - Q1) / 2
9. Calculate the quartile deviation of the following observations:
20, 25, 29, 30, 35, 39, 41, 48, 51, 60 and 70.
Q.D. = (Q3 - Q1) / 2
Q1 = (n + 1) / 4 th value, Q3 = 3(n + 1) / 4 th value, where n = number of observations
Q1 = (11 + 1) / 4 = 3rd value, i.e. 29
Q3 = 3(11 + 1) / 4 = 9th value, i.e. 51
Q.D. = (51 - 29) / 2 = 11
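The positional rule used in this example can be sketched in Python as follows. This is a minimal illustration, not part of the original slides: the function name `quartile_deviation` is hypothetical, and the linear interpolation for fractional positions is an assumption (the slide's example only needs whole-number positions).

```python
def quartile_deviation(values):
    """Return (Q1, Q3, Q.D.) using the (n + 1)/4 positional rule."""
    data = sorted(values)
    n = len(data)

    def value_at(position):
        # position is 1-based; fractional positions are interpolated
        # linearly between neighbours (an assumption, not from the slides)
        lower = int(position)
        fraction = position - lower
        if lower < 1:
            return data[0]
        if lower >= n:
            return data[-1]
        return data[lower - 1] + fraction * (data[lower] - data[lower - 1])

    q1 = value_at((n + 1) / 4)
    q3 = value_at(3 * (n + 1) / 4)
    return q1, q3, (q3 - q1) / 2


q1, q3, qd = quartile_deviation([20, 25, 29, 30, 35, 39, 41, 48, 51, 60, 70])
print(q1, q3, qd)  # 29.0 51.0 11.0
```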
10. Mean Deviation
• Range and Quartile Deviation give a good idea of dispersion but do not
measure how far the values are from their average.
• Mean Deviation and Standard Deviation are two measures of dispersion
that are based on the deviation of values from their average.
• Mean deviation is simply the arithmetic mean of the absolute differences of the
values from their average (the average used is either the arithmetic mean or the
median)
11. Objective is to find a location to establish a college so
that the average distance travelled by students is
minimum
Students will have to travel more, on average, if the college is situated at town A or E, but if the
college is somewhere in the middle, students are likely to travel less.
If situated at town A : average distance travelled by students will be 9.94 km
If situated at town C : average distance travelled by students will be 5.9 km
12. Calculating Mean Deviation
For ungrouped data:
• From Arithmetic Mean
• From Median
For grouped data:
• From Arithmetic Mean
• From Median
13. Mean Deviation from Arithmetic Mean for ungrouped data
Steps :
1. Calculate the Arithmetic Mean (A.M.) of the values
2. Calculate the absolute difference between each value and the A.M. (called deviations
and denoted by d)
3. Calculate the A.M. of these differences (Mean Deviation).
14. Example: Calculate the Mean Deviation of the following values by using the Arithmetic
Mean
2, 4, 7, 8, 9
1. A.M. of values = 30/5 = 6
2. Deviations from the A.M. (ignoring sign) = 4, 2, 1, 2, 3; sum = 12
3. A.M. of differences (deviations) = Mean Deviation =
12/5 = 2.4
15. Mean Deviation from Median for ungrouped data
Steps :
1. Calculate the Median of the values
2. Calculate the absolute difference between each value and the Median (called deviations and
denoted by d)
3. Calculate the A.M. of these differences (Mean Deviation).
16. Example: Calculate the Mean Deviation of the following values by using the Median
2, 4, 7, 8, 9
1. Median of values = 7
2. Deviations from the Median (ignoring sign) = 5, 3, 0, 1, 2; sum = 11
3. A.M. of differences (deviations) = Mean
Deviation = 11/5 = 2.2
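The two ungrouped calculations (from the arithmetic mean and from the median) can be checked with a short Python sketch. The helper name `mean_deviation` is hypothetical and only illustrates the steps listed on the slides.

```python
from statistics import mean, median


def mean_deviation(values, centre):
    """Arithmetic mean of the absolute deviations of values from centre."""
    return sum(abs(x - centre) for x in values) / len(values)


values = [2, 4, 7, 8, 9]
md_from_mean = mean_deviation(values, mean(values))      # centre = 6
md_from_median = mean_deviation(values, median(values))  # centre = 7
print(md_from_mean, md_from_median)  # 2.4 2.2
```

For this data set the mean deviation from the median (2.2) is smaller than the one from the mean (2.4).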
17. Mean Deviation from Arithmetic Mean for grouped data
Steps :
1. Find the mid point of each class (m.p.)
2. Calculate the sum of frequencies (Ʃ f)
3. Calculate the absolute difference (deviation) of each class mid point from the mean
(d)
4. Multiply each d value by its corresponding frequency to get f(d)
values.
5. Sum all f(d) values.
6. Apply the following formula: M.D. = Ʃ f(d) / Ʃ f
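18. C.I      f     m.p.   d = |m.p. - 40|   f x d
10-20    5     15     25                125
20-30    8     25     15                120
30-50    16    40     0                 0
50-70    8     60     20                160
70-80    3     75     35                105
         Ʃ f = 40                       Ʃ f x d = 510
M.D. = 510 / 40 = 12.75
(Deviations in this table are taken from 40. The exact mean of the distribution is Ʃ f x m.p. / Ʃ f = 1620 / 40 = 40.5, which gives M.D. = 519 / 40 ≈ 12.98.)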
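The grouped-data steps can be sketched in Python as below. The function name `grouped_mean_deviation` and its `centre` parameter are hypothetical. Passing `centre=40` reproduces the figure worked out on the slide; leaving it out uses the exact arithmetic mean of the class mid points (40.5), which gives a slightly different result.

```python
# (class interval, frequency) pairs from the table on the slide
classes = [((10, 20), 5), ((20, 30), 8), ((30, 50), 16), ((50, 70), 8), ((70, 80), 3)]


def grouped_mean_deviation(classes, centre=None):
    """Mean deviation of grouped data about centre (default: arithmetic mean)."""
    midpoints = [(low + high) / 2 for (low, high), _ in classes]
    freqs = [f for _, f in classes]
    total = sum(freqs)
    if centre is None:
        # arithmetic mean computed from the class mid points
        centre = sum(f * m for f, m in zip(freqs, midpoints)) / total
    return sum(f * abs(m - centre) for f, m in zip(freqs, midpoints)) / total


md_slide = grouped_mean_deviation(classes, centre=40)
md_exact = grouped_mean_deviation(classes)
print(md_slide)  # 12.75
print(md_exact)  # 12.975
```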
20. • Mean Deviation is based on all values, so a change in any one value will affect it.
• It is least when calculated from the Median.
• It is higher (or at best equal) when calculated from the Mean
21. Standard Deviation
Standard Deviation is the positive square root of the mean of the squared
deviations from the mean.
Steps: If there are five values X1, X2, X3, X4, X5
1. The mean of these values is calculated
2. The deviations of the values from the mean are calculated
3. These deviations are then squared
4. The mean of these squared deviations is the variance.
5. The positive square root of the variance is the standard deviation.
S.D. = √( Ʃ d² / n )
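23. Calculating Standard Deviation for ungrouped data by Actual Mean Method
Calculate the SD of the following values : 5, 10, 25, 30, 50
Mean of values = Ʃ X / n = 120 / 5 = 24
d = X - mean is calculated for each value: -19, -14, 1, 6, 26
Ʃ d² = 361 + 196 + 1 + 36 + 676 = 1270
S.D. = √( Ʃ d² / n ) = √(1270 / 5) = √254 ≈ 15.94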
24. Calculating Standard Deviation for ungrouped data by Direct Method
Calculate the SD of the following values : 5, 10, 25, 30, 50
S.D. = √( Ʃ X² / n - (Ʃ X / n)² ) = √(4150 / 5 - 24²) = √254 ≈ 15.94
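Both ungrouped methods can be compared with a small Python sketch. The slide's own formula for the direct method did not survive extraction, so the expression used here, √(ƩX²/n - (ƩX/n)²), is the standard textbook form and is an assumption about what the slide showed.

```python
from math import sqrt

values = [5, 10, 25, 30, 50]
n = len(values)

# Actual mean method: square root of the mean of squared deviations from the mean
mean = sum(values) / n
sd_actual = sqrt(sum((x - mean) ** 2 for x in values) / n)

# Direct method (assumed formula): works on the raw values without deviations
sd_direct = sqrt(sum(x * x for x in values) / n - (sum(values) / n) ** 2)

print(round(sd_actual, 2), round(sd_direct, 2))  # 15.94 15.94
```

Both routes reduce to √254, so they agree up to floating-point rounding.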
26. 1. Calculate the mean of the distribution
2. Calculate the deviation (d) of the class mid values from the mean
3. Multiply the deviations by their corresponding frequencies (f) to get f x d
4. Calculate the f x d² values by multiplying f x d by d
5. Apply the formula S.D. = √( Ʃ f d² / Ʃ f )
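These steps can be sketched in Python as follows. No data table accompanies this slide, so the sketch reuses the grouped frequency table from the mean deviation example as an assumption. The function name `grouped_sd` is hypothetical, and the formula √(Ʃ f d² / Ʃ f) is the standard one for grouped data.

```python
from math import sqrt

# (class interval, frequency) pairs reused from the mean deviation example
classes = [((10, 20), 5), ((20, 30), 8), ((30, 50), 16), ((50, 70), 8), ((70, 80), 3)]


def grouped_sd(classes):
    """Return (mean, standard deviation) of a grouped frequency distribution."""
    midpoints = [(low + high) / 2 for (low, high), _ in classes]
    freqs = [f for _, f in classes]
    total = sum(freqs)
    mean = sum(f * m for f, m in zip(freqs, midpoints)) / total
    # f x d^2 summed over classes, then divided by the total frequency
    variance = sum(f * (m - mean) ** 2 for f, m in zip(freqs, midpoints)) / total
    return mean, sqrt(variance)


mean, sd = grouped_sd(classes)
print(mean, round(sd, 3))  # 40.5 17.168
```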
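27. • SD is the most widely used measure of dispersion
• SD is based on all values, so a change in any one value affects the value of
SD.
• It is independent of origin but not of scale: adding a constant to every value leaves SD unchanged, while multiplying every value by a constant multiplies SD by the size of that constant.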
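The "independent of origin but not of scale" property can be illustrated with Python's `statistics.pstdev`, which computes the population standard deviation (dividing by n, as on these slides). The shift of 100 and the factor of 3 are arbitrary values chosen for illustration.

```python
from statistics import pstdev

values = [5, 10, 25, 30, 50]
shifted = [x + 100 for x in values]  # change of origin
scaled = [x * 3 for x in values]     # change of scale

sd_original = pstdev(values)
sd_shifted = pstdev(shifted)
sd_scaled = pstdev(scaled)

# sd_shifted matches sd_original; sd_scaled is three times sd_original
print(sd_original, sd_shifted, sd_scaled)
```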
29. Frequency Distribution
(Classifying the raw data of quantitative variables)
Observed:
Distributions obtained by actual
observations or experiments
Ex. Frequency distribution of marks
obtained by 50 students in a test.
Theoretical / Probability:
Distributions not obtained by actual
experiments
Ex. What are the possibilities of getting
heads if I toss a coin 150 times?
1. Binomial distribution
2. Poisson distribution
3. Normal distribution
30. Normal Distribution
1. It is a frequency curve based upon a large number of observations and small
class intervals
2. Also known as the Gaussian curve or probability curve
3. The normal distribution is the most commonly encountered distribution in
statistics.
4. It describes the empirical distribution of certain measurements such as the weight
and height of individuals, blood pressure etc.
31. Characteristics of normal distribution
1. Continuous
2. Bilaterally symmetrical
3. Smooth, bell-shaped curve
4. Mean = Median = Mode
5. The central part is convex, with two points of inflection; the two tails never touch
the base line, since the curve is based on an infinite number of observations.
6. The shape of the normal curve is determined by the Mean and SD
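Some of these characteristics can be checked numerically with `statistics.NormalDist` from the Python standard library (Python 3.8 or later). The parameters `mu=100` and `sigma=15` are hypothetical values chosen only for illustration.

```python
from statistics import NormalDist

# A normal curve is fully specified by its mean and SD (hypothetical values)
curve = NormalDist(mu=100, sigma=15)

# Mean = Median = Mode
centre_values = (curve.mean, curve.median, curve.mode)

# Bilateral symmetry: equal heights at equal distances either side of the mean
left_height = curve.pdf(100 - 20)
right_height = curve.pdf(100 + 20)

# Half of the area under the curve lies below the mean
area_below_mean = curve.cdf(100)

# The tails approach but never reach the base line: density stays positive
tail_height = curve.pdf(100 + 5 * 15)

print(centre_values, left_height, right_height, area_below_mean, tail_height)
```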
33. Example : Summer job income of 16 students
Income (INR)   Frequency (No. of students)
500            1
1000           2
1500           3
2000           4
2500           3
3000           2
3500           1
[Figure: frequency plot of the data above; x-axis: income (INR), 500 to 3500; y-axis: frequency, 1 to 4]
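As a quick check that this income example behaves like a symmetric, bell-shaped distribution, the following Python sketch expands the frequency table into individual incomes and compares the three averages. The variable names are hypothetical.

```python
from statistics import mean, median, mode

# Income (INR) -> number of students, from the table in this example
income_frequency = {500: 1, 1000: 2, 1500: 3, 2000: 4, 2500: 3, 3000: 2, 3500: 1}

# Expand the table into one entry per student
incomes = [income for income, count in income_frequency.items() for _ in range(count)]

income_mean = mean(incomes)
income_median = median(incomes)
income_mode = mode(incomes)

# All three averages coincide at 2000 for this symmetric distribution
print(income_mean, income_median, income_mode)
```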
34. Skewness
• An asymmetric distribution is also known as a skewed distribution
• Skewness represents an imbalance and asymmetry about the mean of a data
distribution.
• It occurs when the frequency curve is distorted towards either the left or the right
side.
• Left side skewness: frequency curve is distorted to the left side (Left tail)
• Right side skewness: frequency curve is distorted to the right side (Right tail).
35. Right side Skewness
Mean lies furthest to the right side of the curve
Mean > Median > Mode
Tail lies at the right side of the curve
Left side Skewness
Mean lies furthest to the left side of the
curve
Mean < Median < Mode
Tail lies at the left side of the curve
36. Measurement of skewness:
1. With Mean and Median
Skewness = 3 (Mean - Median) / Standard Deviation
2. With Mean and Mode
Skewness = (Mean - Mode) / Standard Deviation
3. With Quartile values
Skewness = (Q3 + Q1 - 2 x Median) / (Q3 - Q1)
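The three skewness measures can be written as small Python functions. The function names are hypothetical. The usage line applies the quartile-based measure to the eleven observations from the quartile deviation example (Q1 = 29, Q3 = 51, with the 6th value, 39, as the median).

```python
def skewness_mean_median(mean, median, sd):
    """Skewness = 3 (Mean - Median) / SD."""
    return 3 * (mean - median) / sd


def skewness_mean_mode(mean, mode, sd):
    """Skewness = (Mean - Mode) / SD."""
    return (mean - mode) / sd


def skewness_quartiles(q1, median, q3):
    """Skewness = (Q3 + Q1 - 2 x Median) / (Q3 - Q1)."""
    return (q3 + q1 - 2 * median) / (q3 - q1)


quartile_skew = skewness_quartiles(q1=29, median=39, q3=51)
print(round(quartile_skew, 3))  # 0.091
```

A positive result indicates right side skewness and a negative result left side skewness.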
37. Calculate the skewness for the following observations
Skewness = (Mean - Mode) / SD
Mean = 40.5
SD = 17.168
Mode = ?
Mode = L1 + [(F1 - F0) / (2F1 - F0 - F2)] x 10 = 30 + (8 / 16) x 10 = 35
L1: Lower limit of the modal class interval, i.e. 30
F1: Highest frequency, i.e. 16
F0: Frequency of the class preceding the modal class, i.e. 8
F2: Frequency of the class following the modal class, i.e. 8
Skewness = (40.5 - 35) / 17.168 = 0.32
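The worked example can be reproduced with the Python sketch below. The function name `grouped_mode` is hypothetical, and the `width` argument is set to 10 because that is the multiplier used in the slide's calculation.

```python
def grouped_mode(lower_limit, f1, f0, f2, width):
    """Mode = L1 + (F1 - F0) / (2F1 - F0 - F2) x width."""
    return lower_limit + (f1 - f0) / (2 * f1 - f0 - f2) * width


# Values taken from the slide; width=10 follows the slide's own calculation
mode = grouped_mode(lower_limit=30, f1=16, f0=8, f2=8, width=10)
skewness = (40.5 - mode) / 17.168

print(mode, round(skewness, 2))  # 35.0 0.32
```

The positive value points to a mild right side skewness.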
38. Z Score (Standard Score)
• Deviation from the mean in a normal distribution or curve is called the relative or
standard normal deviate.
• It is denoted by the symbol Z.
• It is measured in terms of SDs and indicates how much an observation is bigger
or smaller than the mean in units of SD.
Formula: Z = (Observation - Mean) / SD
• Z score = +ve value : it is above the mean (Right side)
• Z score = -ve value : it is below the mean (Left side)
39. Example:
The results of a practice exam for student A and for a group of 100 students are given below.
Practice exam result of student A
Paper                      Score
Techniques                 15
Research & Biostatistics   12
Subject                    34
Practice exam result of group of 100 students
Paper                      Mean   SD
Techniques                 18     4.8
Research & Biostatistics   9      1.2
Subject                    32     1.8
Z score = (Observation - Mean) / SD
Z score for Techniques : (15 - 18) / 4.8 = -0.625 (Below the mean)
for Research & Biostatistics : (12 - 9) / 1.2 = 2.5 (Above the mean)
for Subject : (34 - 32) / 1.8 = 1.1 (Above the mean)
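The Z score calculation for student A can be sketched in Python as follows. The dictionary layout and the function name `z_score` are hypothetical. The group figures are read as (mean, SD) pairs, which is the interpretation the slide's own calculation uses.

```python
def z_score(observation, mean, sd):
    """Z = (Observation - Mean) / SD."""
    return (observation - mean) / sd


student_a = {"Techniques": 15, "Research & Biostatistics": 12, "Subject": 34}
# (mean, SD) of the group of 100 students for each paper
group = {
    "Techniques": (18, 4.8),
    "Research & Biostatistics": (9, 1.2),
    "Subject": (32, 1.8),
}

z_scores = {}
for paper, score in student_a.items():
    group_mean, group_sd = group[paper]
    z_scores[paper] = z_score(score, group_mean, group_sd)

for paper, z in z_scores.items():
    if z > 0:
        position = "above the mean"
    elif z < 0:
        position = "below the mean"
    else:
        position = "at the mean"
    print(f"{paper}: Z = {z:.3f} ({position})")
```

Because Z is expressed in units of SD, the three papers can be compared directly even though their raw score scales differ.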