2. Median, mode, mean, range &
standard deviation
•The median is the middle value when the values are
placed in order of size.
•The mode is the most commonly occurring value, the
value that appears the most times/shows the greatest
frequency.
•The mean is the sum of the values divided by the
number of values.
•The range is the difference between the minimum and
the maximum value.
•The standard deviation is a measure of the spread of
values around the mean.
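The five definitions above can be checked with a minimal sketch using Python's standard `statistics` module. The data set is hypothetical (it is not taken from the slides), and both forms of the standard deviation are shown because calculators usually report both.

```python
import statistics

# Hypothetical data set, chosen only to illustrate the definitions.
values = [2, 3, 3, 5, 7]

median = statistics.median(values)        # middle value when sorted
mode = statistics.mode(values)            # most frequently occurring value
mean = statistics.mean(values)            # sum of values / number of values
value_range = max(values) - min(values)   # maximum minus minimum
sd_population = statistics.pstdev(values) # divides by n
sd_sample = statistics.stdev(values)      # divides by n - 1

print(median, mode, mean, value_range)
print(round(sd_population, 2), round(sd_sample, 2))
```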
3. The range or SD of data is
shown as error bars on a
graphical presentation
4. Calculating stats using the GDC
Player A
x: 12, 16, 10, 20, 22, 17, 15

Finding standard deviation with a GDC:
In STAT mode, enter the values into a list, in this case List1.
(CALC)
(SET) Ensure XList: List1 and 1Var Freq: 1
(1Var)
x̄ : 16
∑x : 112
∑x² : 1898
xσn : 3.89
xσn−1 : 4.20
n : 7
Either xσn or xσn−1 will be accepted.
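As a cross-check on the calculator, the same one-variable statistics for Player A can be reproduced in Python. This is only a sketch using the standard `statistics` module; the variable names are illustrative and are not GDC terms.

```python
import statistics

# Player A's values as entered into List1.
player_a = [12, 16, 10, 20, 22, 17, 15]

n = len(player_a)                            # number of values
sum_x = sum(player_a)                        # sum of x
sum_x_squared = sum(x * x for x in player_a) # sum of x squared
mean_x = sum_x / n                           # mean
sd_n = statistics.pstdev(player_a)           # divides by n
sd_n_minus_1 = statistics.stdev(player_a)    # divides by n - 1

print(n, sum_x, sum_x_squared, mean_x)
print(round(sd_n, 2), round(sd_n_minus_1, 2))
```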
Player B
x: 7, 9, 12, 31, 22, 22, 9

Use a GDC to find the standard deviation for player B.
xσn : 8.38
6. Two data sets can have
the same mean value but
different SDs
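The two players' scores illustrate this point. The sketch below, which assumes the Player A and Player B values listed on the GDC slide, shows that both sets have the same mean while Player B's values are spread far more widely.

```python
import statistics

player_a = [12, 16, 10, 20, 22, 17, 15]
player_b = [7, 9, 12, 31, 22, 22, 9]

mean_a = statistics.mean(player_a)
mean_b = statistics.mean(player_b)

# Population form (divides by n), matching the GDC's x-sigma-n value.
sd_a = statistics.pstdev(player_a)
sd_b = statistics.pstdev(player_b)

print(mean_a, mean_b)
print(round(sd_a, 2), round(sd_b, 2))
```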
7. Comparing two sets of data:
when is a difference significant
and when is it not?
•A difference is NOT regarded as significant when
it can be attributed to chance variation.
•In statistics the assumption is initially made that
any differences are due to chance. This is called
the null hypothesis.
•Where the null hypothesis is rejected, a
difference is regarded as significant i.e. the
differences are not just due to chance but to an
actual factor causing the difference.
8. A simple rule to evaluate the significance of
difference between two data sets:
Significant difference unlikely if the standard deviations
are greater than the difference between the means (left
diagram) BUT likely if the standard deviations are
smaller than the difference between the means (right
diagram).
9. Example
In a study of heights, two separate human populations
were sampled:
•Population A had a mean height of 1.65 m and
population B a mean height of 1.72 m.
•SD of population A was 0.09 m and SD of population B
0.1 m.
•Evaluate the data to assess if there is likely to be a
significant difference between the heights of the two
populations.
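One way to work through this example is to code the simple rule directly. The function below is a hypothetical helper, not part of any library, and the "inconclusive" branch is an assumption added for the case the rule does not cover (one SD larger and one smaller than the difference).

```python
def simple_rule(mean_1, sd_1, mean_2, sd_2):
    """Apply the rough SD-versus-difference-of-means rule."""
    difference = abs(mean_1 - mean_2)
    if sd_1 > difference and sd_2 > difference:
        return "significant difference unlikely"
    if sd_1 < difference and sd_2 < difference:
        return "significant difference likely"
    # Assumption: the slides do not say what to conclude in this case.
    return "inconclusive"

# Heights in metres: population A (mean 1.65, SD 0.09),
# population B (mean 1.72, SD 0.10).
verdict = simple_rule(1.65, 0.09, 1.72, 0.10)
print(verdict)
```

Here the difference between the means is 0.07 m, which is smaller than both standard deviations, so the rule suggests a significant difference is unlikely.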
10. Student’s t-test
•A statistical test to find more reliably if there is a significant
difference between two sets of normally distributed data with
ten or more values.
•You are not expected to calculate the value of t, but the
calculation uses the difference between the means and the size
of the standard deviations.
•The test requires the calculation of degrees of freedom, d.f. =
n₁ + n₂ − 2, where n₁ and n₂ are the numbers of values in the two data sets.
•A table of values is used to find the level of significance using
the t value and degrees of freedom (use the one tailed test).
•If the level of significance is 5% or below, reject the null
hypothesis; if it is above 5%, accept that the differences are due to
chance, i.e. there is no significant difference.
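The slides do not give the formula for t. For reference only, one common form is the pooled-variance (equal-variance) Student's t-test shown below; it is an assumption that this is the version intended. It shows how the difference between the means and the sizes of the standard deviations enter the calculation. The symbols are illustrative: the sample means are \bar{x}_1 and \bar{x}_2, the sample standard deviations are s_1 and s_2, and the sample sizes are n_1 and n_2.

```latex
t = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}},
\qquad
s_p^2 = \frac{(n_1 - 1)\,s_1^2 + (n_2 - 1)\,s_2^2}{n_1 + n_2 - 2},
\qquad
\text{d.f.} = n_1 + n_2 - 2
```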
11. Example: a study to compare shell diameters in two different
populations of Periwinkle (a marine mollusc). Use the data below to
carry out a t-test to determine the level of significance.
                         Population A    Population B
n (number in sample)     15              12
Mean shell diameter      1.35 mm         1.55 mm
Standard deviation       0.15 mm         0.24 mm
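A minimal sketch of how this example might be worked from the summary values follows. It makes three assumptions: the pooled-variance form of Student's t-test is used, the quoted standard deviations are sample (n − 1) values, and the critical value 1.708 is the standard t-table entry for a one-tailed test at the 5% level with 25 degrees of freedom.

```python
import math

# Summary data for the two periwinkle populations (diameters in mm).
n_a, mean_a, sd_a = 15, 1.35, 0.15
n_b, mean_b, sd_b = 12, 1.55, 0.24

degrees_of_freedom = n_a + n_b - 2

# Pooled variance: weighted average of the two sample variances.
pooled_variance = (
    (n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2
) / degrees_of_freedom

standard_error = math.sqrt(pooled_variance * (1 / n_a + 1 / n_b))
t_value = abs(mean_a - mean_b) / standard_error

# Assumed table value: one-tailed, 5% level, 25 degrees of freedom.
CRITICAL_T = 1.708
significant = t_value > CRITICAL_T

print(degrees_of_freedom, round(t_value, 2), significant)
```

Under these assumptions t is about 2.65 with 25 degrees of freedom. That exceeds the critical value, so the null hypothesis would be rejected at the 5% level.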
12. Types of correlation
(note the outliers)
[Figure: three scatter plots with axes labelled variable 1 and variable 2, titled "No Correlation", "Negative Correlation" and "Positive Correlation"]
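The three types of correlation can also be described numerically with a correlation coefficient, which the slides do not cover. The sketch below computes Pearson's r by hand for three hypothetical data sets (none of the values come from the slides). An r near +1 indicates a positive correlation, near −1 a negative correlation, and near 0 no correlation.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    co_deviation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return co_deviation / (spread_x * spread_y)

# Hypothetical values for variable 1 and three versions of variable 2.
variable_1 = [1, 2, 3, 4, 5, 6]
positive = [2, 3, 5, 6, 8, 9]
negative = [9, 8, 6, 5, 3, 2]
no_trend = [5, 2, 6, 2, 6, 3]

r_positive = pearson_r(variable_1, positive)
r_negative = pearson_r(variable_1, negative)
r_none = pearson_r(variable_1, no_trend)

print(round(r_positive, 2), round(r_negative, 2), round(r_none, 2))
```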
13. Correlations and causal
relationships
•A causal relationship is one in which one
factor/variable affects another e.g. the extension of a
spring depends on the force applied; an increase in air
humidity causes the transpiration rate to fall.
•A positive or negative correlation may suggest a causal
relationship but does not prove it.
•Proof of a causal relationship in science often requires
an experiment in which one variable is
manipulated/changed (independent variable) and this is
shown to affect another measured (dependent)
variable.
14. Some examples
•It was found that towns with a greater number of
nesting storks had more children per household than
towns with fewer storks (positive correlation).
•CAN WE THEREFORE CONCLUDE THAT THE
STORKS WERE DELIVERING THE BABIES?
•It has long been known that there is a positive
correlation between the number of cigarettes smoked
and deaths from lung cancer.
•ONLY RECENTLY HAS IT BEEN SHOWN THAT
SMOKING CAUSES LUNG CANCER.
•A strong correlation exists between rise in
atmospheric CO2 levels and rise in global temperature.
•SOME PEOPLE STILL DISPUTE THAT INCREASING
CO2 LEVELS IS WHAT CAUSES GLOBAL WARMING.