2. Preview Questions
• What are commonly used measures of
central tendency? What do they tell you?
• How do variance and standard deviation
measure data spread? Why is this
important?
3. Central Tendency
• In general terms, central tendency is a statistical measure that
determines a single value that accurately describes the center or middle
point of a distribution.
• Measures of central tendency are also called measures of location.
• By identifying the "average score," central tendency allows researchers
to summarize or condense a large set of data into a single value
• Thus, central tendency serves as a descriptive statistic because it allows
researchers to describe or present a set of data in a very simplified,
concise form.
• In addition, it is possible to compare two (or more) sets of data by simply
comparing the average score (central tendency) for one set versus the
average score for another set.
4. Dispersion
• Dispersion is the spread of data in a
distribution, that is, the extent to which the
observations are scattered.
5. Types of Average
• Mathematical Averages :
– Arithmetic Mean
• Computed by three methods :
a) Direct Method
b) Assumed Mean Method
c) Step Deviation Method
– Weighted Mean
– Geometric mean
– Harmonic Mean
• Positional Averages
– Median
– Mode
6. Mean, Median, Mode Concepts
• The “mean” is the “average” you’re used to, where you add up all
the numbers and then divide by the number of numbers.
• The “median” is the “middle” value in the list of numbers. To
find the median, your numbers have to be listed in numerical order
from smallest to largest, so you may have to rewrite your list
before you can find the median.
• The “mode” is the value that occurs most often. If no number in
the list is repeated, then there is no mode for the list.
• The “range” of a list of numbers is just the difference between the
largest and smallest values. Let us understand the concepts better
through some examples.
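The four definitions above can be sketched in a few lines of Python. The list of scores is hypothetical and chosen only for illustration; it is not taken from the slides.

```python
import statistics

# Hypothetical scores, chosen only to illustrate the four definitions.
scores = [13, 18, 13, 14, 13, 16, 14, 21, 13]

mean_value = sum(scores) / len(scores)     # add everything up, divide by the count
median_value = statistics.median(scores)   # middle value once the list is sorted
mode_value = statistics.mode(scores)       # value that occurs most often
range_value = max(scores) - min(scores)    # largest minus smallest

print(mean_value, median_value, mode_value, range_value)  # 15.0 14 13 8
```

Note that statistics.median sorts the data internally, so the list does not have to be rewritten in order first.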
7. • The position of the median in the ordered list is given by “([the number of data
points] + 1) ÷ 2”, but we don’t have to use this formula. We can just count in
from both ends of the list until we meet in the middle, especially if the list
is short. The formula gives a whole-number position when the number of terms
in the series is odd. When the number of terms is even, the median is the
average of the two middle numbers in the list.
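The position rule can be written out as a small function. This is a minimal sketch; the function name median_by_position and the sample lists are hypothetical, not part of the slides.

```python
def median_by_position(values):
    """Median via the (n + 1) / 2 position rule, using 1-based positions."""
    ordered = sorted(values)             # the rule only works on ordered data
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty list is undefined")
    if n % 2 == 1:
        position = (n + 1) // 2          # whole-number middle position
        return ordered[position - 1]     # convert to a 0-based index
    # Even count: (n + 1) / 2 falls between two positions, so average them.
    lower = ordered[n // 2 - 1]
    upper = ordered[n // 2]
    return (lower + upper) / 2


print(median_by_position([7, 3, 9, 1, 5]))   # 5   (3rd of 5 ordered values)
print(median_by_position([1, 2, 3, 4]))      # 2.5 (average of 2 and 3)
```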
• MEAN VALUE: Mean value refers to the average of a set of values. The
simplest way to find the mean is to divide the sum of all the values in the
set by the total number of values in the set.
• Mean = Sum of all values/total number of values
Example 1: Suppose we have the marks of students in a class test of 50 marks as :
12, 23, 32, 45, 46, 33, 35, 27, 23, 28, 27, 27, 35, 41, 43, 27, 15, 18, 27, 29, 27.
The mean marks, or the arithmetic mean, is computed as: Mean =
(12+23+32+45+46+33+35+27+23+28+27+27+35+41+43+27+15+18+27+29+27)/21
= 620 / 21 ≈ 29.52 marks
When the data include the frequency of each value, the formula changes to
Mean = ∑FiXi / ∑Fi ,
Where, Fi = frequency of the ith value of the distribution,
Xi = ith value of the distribution
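As a check on Example 1, the sketch below recomputes the total from the listed marks and then applies the frequency formula Mean = ∑FiXi / ∑Fi to the same data. Building the frequency table with collections.Counter is an illustrative choice, not something the slides prescribe.

```python
from collections import Counter

# Marks from Example 1 (class test out of 50 marks).
marks = [12, 23, 32, 45, 46, 33, 35, 27, 23, 28, 27,
         27, 35, 41, 43, 27, 15, 18, 27, 29, 27]

# Direct method: sum of all values / total number of values.
direct_mean = sum(marks) / len(marks)

# Frequency method: each distinct value Xi paired with its frequency Fi.
freq = Counter(marks)
frequency_mean = sum(f * x for x, f in freq.items()) / sum(freq.values())

print("total:", sum(marks), "count:", len(marks))
print("direct mean:", round(direct_mean, 2))
print("frequency mean:", round(frequency_mean, 2))
```

Both methods give the same mean, because multiplying a value by its frequency is just a shorter way of adding it repeatedly.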
8. Merits and Demerits of Arithmetic Mean
Merits :
• It is rigidly defined.
• It is based on all the observations.
• It is also least affected by the fluctuations of sampling.
Demerits :
• It is very much affected by the values at extremes.
• Its value may not coincide with any of the given values.
• Unlike the median and mode, it cannot be located on the frequency
curve, nor can it be obtained by inspection.
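The first demerit, sensitivity to extreme values, is easy to see numerically. The two lists below are hypothetical; they differ only in their last value.

```python
import statistics

# Hypothetical data: identical except that the last value is an extreme.
typical = [20, 22, 24, 26, 28]
with_extreme = [20, 22, 24, 26, 280]

typical_mean = sum(typical) / len(typical)
extreme_mean = sum(with_extreme) / len(with_extreme)

print(typical_mean, statistics.median(typical))        # 24.0 24
print(extreme_mean, statistics.median(with_extreme))   # 74.4 24
```

One extreme value roughly triples the mean, while the median does not move.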
9. MEDIAN
• When all the observations are arranged in ascending or
descending order of magnitude, the value in the middle is
known as the median.
10. Merits and Demerits of Median
Merits :
• It can be readily calculated and rigidly defined.
• It can be easily and readily obtained even if the extreme values are not
known.
• The median remains the same whatever method of computation is
applied.
Demerits :
• It is not a satisfactory average when there is great variation
among the items of the population.
• It cannot be precisely expressed when it falls between two values.
• It is more likely to be affected by fluctuation of sampling.
11. MODE
• The mode is that value of the variable which occurs most
frequently or whose frequency is maximum.
• Also, if several samples are drawn from a population, the
value which appears repeatedly in all the samples is called the mode.
12. Merits and Demerits of Mode
Merits :
• It can be obtained simply by inspection.
• The extreme values are not needed in its computation, nor is it
affected by them.
• As it is the item of the maximum frequency, the same item is the
mode in every sample of the population. This is the peculiarity
which is present only in mode and not in any other average.
Demerits :
• In many cases, there is no single and well defined mode.
• When there is more than one mode in the series, it becomes
difficult and time-consuming to compute.
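The first demerit can be illustrated with statistics.multimode (available in Python 3.8 and later), which returns every value tied for the highest frequency. The three lists are hypothetical.

```python
import statistics

# Hypothetical data sets showing one mode, two modes, and no repeated value.
one_mode = [2, 3, 3, 4, 5]
two_modes = [1, 1, 2, 3, 3]
no_repeat = [4, 5, 6]

print(statistics.multimode(one_mode))    # [3]
print(statistics.multimode(two_modes))   # [1, 3]
print(statistics.multimode(no_repeat))   # [4, 5, 6]
```

In the last case every value is tied at a frequency of one, which corresponds to the situation described earlier as having no mode.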
13. Weighted Mean
• In the calculation of the arithmetic mean, every item is given
equal importance, that is, every item is equally weighted. Sometimes,
however, not all items are of equal importance.
• In that case the items are given weights according to their
relative importance, and the average calculated
on the basis of these weights is called the weighted average or
weighted mean.
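A minimal sketch of the weighted mean, computed as the sum of weight × value divided by the sum of the weights. The function name, the marks, and the credit-hour weights are all hypothetical.

```python
def weighted_mean(values, weights):
    """Sum of (weight * value) divided by the sum of the weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for x, w in zip(values, weights)) / total_weight


# Hypothetical marks in three subjects, weighted by credit hours.
marks = [60, 70, 80]
credits = [1, 2, 3]

print(sum(marks) / len(marks))                  # 70.0, every subject counted equally
print(round(weighted_mean(marks, credits), 2))  # 73.33, heavier subjects count more
```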
14. Applications of Weighted Mean
It is especially useful in the following cases:-
1. When the number of individuals in the different classes of a group
varies widely.
2. When the importance of all the items in a series is not the same.
3. When ratios, percentages or rates (e.g. quintals per hectare,
rupees per kilogram, or rupees per meter etc.) are to be averaged.
4. When the mean of a series or group is to be obtained from the
means of its component parts.
5. Weighted mean is particularly used in calculating birth rates,
death rates, index numbers, average yield, etc.
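Case 4 can be sketched as follows: the mean of the whole group is the weighted mean of the component means, with the component sizes as weights. The two class sections and their figures are hypothetical.

```python
# Hypothetical: two sections of a class with different sizes and mean marks.
section_means = [50.0, 60.0]
section_sizes = [10, 30]

# Weight each section mean by the number of students it represents.
combined_mean = (
    sum(n * m for m, n in zip(section_means, section_sizes)) / sum(section_sizes)
)
simple_average = sum(section_means) / len(section_means)

print(combined_mean)    # 57.5, the mean of all 40 students
print(simple_average)   # 55.0, which ignores the unequal section sizes
```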
19. Measures of Dispersion
• To study a series fully, it is essential to examine the extent of
scatter (dispersion) of the observations along with the central
tendency, in order to throw more light on the nature of the series.
• Simply put, dispersion (also called variability, scatter, or
spread) is the extent to which a distribution is
stretched or squeezed.
20. Different Measures of Dispersion
• Range
• Mean Deviation
• Standard Deviation
• Variance
• Quartile Deviation
• Coefficient of Variation
21. Range
• Range is the simplest measure of dispersion.
• It is the difference between the highest and the lowest terms of a
series of observations.
• Range = XH – XL
Where, XH = Highest variate value and
XL = Lowest variate value
• Its value usually increases with the increase in the size of the
sample.
• It is a very rough measure of dispersion and is unsuitable
for precise and accurate studies.
• The only merits possessed by ‘Range’ are that it is (i) simple, (ii)
easy to understand (iii) quickly calculated.
22. Mean Deviation
• Deviations taken without regard to their plus or minus signs are known as
absolute deviations.
• The mean of these absolute deviations is called the mean
deviation.
• If the deviations are calculated from the mean, the measure
of dispersion is called mean deviation about the mean.
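A minimal sketch of the mean deviation about the mean: take each deviation from the mean, drop its sign, and average the results. The function name and the data are hypothetical.

```python
def mean_deviation_about_mean(values):
    """Average of the absolute deviations from the arithmetic mean."""
    n = len(values)
    if n == 0:
        raise ValueError("at least one value is required")
    center = sum(values) / n
    return sum(abs(x - center) for x in values) / n


data = [2, 4, 6, 8, 10]   # hypothetical observations; the mean is 6
# Absolute deviations are 4, 2, 0, 2, 4, which sum to 12.
print(round(mean_deviation_about_mean(data), 2))   # 2.4
```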
23. Standard Deviation
• Its calculation is also based on the deviations from the arithmetic
mean. The sum of the deviations from the arithmetic mean is always zero;
the mean deviation gets around this difficulty by ignoring the plus or
minus signs of the deviations.
• The standard deviation instead squares the deviations, averages the
squares, and takes the square root of that average.
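The steps can be followed one at a time in code. This sketch divides by the number of observations, matching the "average of the squares" wording above (the population form); the data are hypothetical.

```python
import math
import statistics

data = [2, 4, 6, 8, 10]   # hypothetical observations

mean_value = sum(data) / len(data)
deviations = [x - mean_value for x in data]

# The signed deviations cancel out, which is the difficulty described above.
print(sum(deviations))

# Squaring removes the signs; average the squares, then take the square root.
squared = [d * d for d in deviations]
std_dev = math.sqrt(sum(squared) / len(data))

print(round(std_dev, 4))
print(round(statistics.pstdev(data), 4))   # library population S.D. for comparison
```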
24. Characteristics and Uses of S.D.
Characteristics :
• It is rigidly defined.
• Its computation is based on all the observations.
• If all the variate values are the same, S.D.=0
Uses :
• It is used in computing different statistical quantities like
regression coefficients, correlation coefficient, etc.
25. Variance
• Variance is the square of the standard deviation.
• Variance = (S.D.)²
• This term is now being used very extensively in the
statistical analysis of the results from experiments.
• The variance of a population is generally represented
by the symbol σ² and its unbiased estimate calculated
from the sample, by the symbol s².
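The difference between σ² and s² is only the divisor: N for the population variance and N − 1 for the unbiased sample estimate. A short sketch with hypothetical data, using Python's statistics module:

```python
import statistics

data = [2, 4, 6, 8, 10]   # hypothetical observations

population_variance = statistics.pvariance(data)   # sigma squared: divide by N
sample_variance = statistics.variance(data)        # s squared: divide by N - 1

print(population_variance, sample_variance)

# Variance is the square of the standard deviation in both forms.
print(round(statistics.pstdev(data) ** 2, 6), round(statistics.stdev(data) ** 2, 6))
```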
26. Coefficient of Variation
• This is a relative measure of dispersion. It is especially important
because it combines the most widely used measures of central tendency and
dispersion, i.e., the arithmetic mean and the standard deviation.
• It is given by the formula C.V. = (S.D. / Mean) × 100
• It is expressed as a percentage and is used to compare the
variability of two or more series.
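A minimal sketch comparing the variability of two series, using the usual definition C.V. = (S.D. / Mean) × 100 with the population standard deviation. The function name and both series are hypothetical.

```python
import statistics


def coefficient_of_variation(values):
    """Population S.D. as a percentage of the arithmetic mean."""
    mean_value = statistics.fmean(values)
    if mean_value == 0:
        raise ValueError("C.V. is undefined when the mean is zero")
    return statistics.pstdev(values) / mean_value * 100


# Hypothetical series measured on very different scales.
series_a = [48, 50, 52]
series_b = [5, 10, 15]

cv_a = coefficient_of_variation(series_a)
cv_b = coefficient_of_variation(series_b)
print(round(cv_a, 2), round(cv_b, 2))
```

Because the C.V. is a pure percentage, the two series can be compared directly; here series_b is relatively far more variable than series_a.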
28. Calculate the Variance (σ²),
Standard deviation (σ), and
Coefficient of Variation
from the data given:
Class      Frequency
2 – 4      3
4 – 6      4
6 – 8      2
8 – 10     1
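One way to work this exercise is sketched below. It assumes that each class is represented by its midpoint and that the population form (dividing by the total frequency) is intended, since the question asks for σ² and σ; neither assumption is stated on the slide.

```python
import math

# (lower limit, upper limit, frequency) for each class in the table.
classes = [(2, 4, 3), (4, 6, 4), (6, 8, 2), (8, 10, 1)]

# Assumption: every observation in a class sits at the class midpoint.
midpoints = [(low + high) / 2 for low, high, _ in classes]
freqs = [f for _, _, f in classes]
n = sum(freqs)

mean_value = sum(f * x for x, f in zip(midpoints, freqs)) / n
variance = sum(f * (x - mean_value) ** 2 for x, f in zip(midpoints, freqs)) / n
std_dev = math.sqrt(variance)
cv = std_dev / mean_value * 100

print("mean:", round(mean_value, 2))
print("variance:", round(variance, 2))
print("S.D.:", round(std_dev, 2))
print("C.V. (%):", round(cv, 2))
```

Under these assumptions the mean is 5.2, the variance is 3.56, the standard deviation is about 1.89, and the coefficient of variation is about 36.3%.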