3. HISTORY
• The term "statistical significance" was coined by Ronald Fisher (1890-1962).
• Student's t-test was developed by William Sealy Gosset.
4. Introduction to Statistical inference
• Statistical inferences are based on probabilities and as such cannot be
expressed with full certainty.
• When a test shows that a difference is statistically significant, then it simply
suggests that the difference is probably not due to chance.
5.
6. Point estimation
• A single value (called a statistic) is computed from the sample to estimate the value of the population parameter, without any statement about its error. This estimate usually does not coincide exactly with the true parameter.
• E.g.: the sample standard deviation is a point estimate of the population standard deviation.
Interval estimation
• A range of values (an interval) is obtained which may be expected to cover the true value of the parameter with some definite probability.
7. NULL HYPOTHESIS(H0)
• The null hypothesis asserts that there is no real difference between the sample and the population under consideration, and that any observed difference is accidental (due to chance).
• Example- “ There is no difference in the DMF scores of the rural and urban
children”.
8. Alternative hypothesis(H1)
• If our sample does not support the null hypothesis, we should conclude that something else is true. The alternative hypothesis is what we conclude on rejecting the null hypothesis.
• Example- “there is a difference in the DMF scores of the rural and
urban children”.
9. Tests of significance???
• Test of significance is a formal procedure for comparing observed
data with a claim (also called a hypothesis) whose truth we want to
assess.
• A significance test uses data to evaluate a hypothesis by
comparing sample point estimates of parameters to values predicted
by the hypothesis.
10. Why tests of significance?
• Have the observations changed with time and/or intervention?
• Do two or more groups of observations differ from each other?
• Is there an association between different observations?
11. Stages in performing a test of significance:
1. State the null hypothesis of no or chance difference and the alternative.
2. Determine P, i.e., the probability of obtaining a difference at least as large as the one observed if the null hypothesis were true (that is, by chance alone).
3. Draw a conclusion on the basis of the P value, i.e., decide whether the observed difference can reasonably be attributed to chance or reflects the play of some external factor on the sample.
12. Various tests of significance
[Diagram: tests of significance classified as parametric and non-parametric, for qualitative and quantitative data]
14. Parametric tests vs non-parametric tests
Parametric tests:
• Use the actual values
• Cannot be used when the sample is very small
Non-parametric tests:
• Use only the order or grades, not the actual values
• Can be used when only a very small sample is available
15. Rank sum tests
a) Mann-Whitney test (U test)
b) Kruskal-Wallis test (H test)
16. Selection of the test
• Based on:
1. Type of data
2. Size of sample
3. Number of samples
17.
18. PARAMETRIC TESTS
• Used for quantitative data
• Used for continuous variables
19. When are non-parametric tests used?
• Assumptions of parametric test are violated
• Non-normal or skewed
• Data is on an ordinal scale
• Very few observations
20. STUDENT’S T-TEST
Developed by W. S. Gosset in 1908, who published statistical papers under the pen name ‘Student’. Thus the test is known as Student’s ‘t’ test.
Indications for the test:
1. When samples are small
2. When the population variance is not known
21. Uses
1. Two means of small independent samples
2. Sample mean and population mean
3. Two proportions of small independent samples
22. Assumptions made in the use of ‘t’ test
1. Samples are randomly selected
2. Data utilised are quantitative
3. The variable follows a normal distribution
4. Samples are small, mostly smaller than 30
23. A t-test compares the difference between two means of different
groups to determine whether that difference is statistically
significant.
Student’s ‘t’ test for different purposes
‘t’ test for one sample
‘t’ test for unpaired two samples
‘t’ test for paired two samples
24. ONE SAMPLE T-TEST
Used to compare the mean of a single group of observations with a specified value.
25. Two sample ‘t’ test
Unpaired two sample ‘t’ test
• Used when we wish to compare the means of two independent groups.
• Used when the two independent random samples come from normal populations with unknown but equal variances.
Paired two sample ‘t’ test
• Used when we have paired observations from one sample only, i.e., when each individual gives a pair of observations.
• The same individuals are studied more than once in different circumstances, e.g., measurements made on the same people before and after an intervention.
• E.g.: comparison of Hb values of persons estimated by 2 methods.
26. E.g.: The mean birth weights, with SDs and sample sizes, are given below by socioeconomic status. Is the difference in mean birth weight between the socioeconomic groups significant? Random samples were taken for the study.
Sl. no. | Details | High socioeconomic group | Low socioeconomic group
1 | Sample size | n1 = 15 | n2 = 10
2 | Mean birth weight | x̄1 = 2.91 | x̄2 = 2.26
3 | Standard deviation | SD1 = 0.27 | SD2 = 0.22
Null hypothesis (H0): the means x̄1 and x̄2 come from populations with the same mean
Alternative hypothesis (H1): H0 is not true
The samples are small and the variances are nearly the same (SD1² ≈ SD2²), so the t test is applied.
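27.
$$t = \frac{\bar{x}_1 - \bar{x}_2}{SE(\bar{x}_1 - \bar{x}_2)}$$
$$SE(\bar{x}_1 - \bar{x}_2) = \sqrt{\frac{(n_1 - 1)SD_1^2 + (n_2 - 1)SD_2^2}{n_1 + n_2 - 2}\left[\frac{1}{n_1} + \frac{1}{n_2}\right]} = 0.1027$$
$$t = \frac{2.91 - 2.26}{0.1027} = 6.329$$
$$DF = (n_1 - 1) + (n_2 - 1) = 23$$
For 23 degrees of freedom, the table value of t at P = 0.001 is 3.767.
The calculated t (6.329) is greater than this value, so the null hypothesis is rejected and the alternative hypothesis is accepted.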
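The calculation on this slide can be checked with a short script. This is a minimal sketch, assuming the pooled (equal-variance) form of the standard error; the function name `pooled_t` is an illustrative choice, not something from the slides.

```python
import math

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Unpaired two-sample t statistic from summary data (equal variances assumed)."""
    # Pooled variance: weighted average of the two sample variances
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)
    # Standard error of the difference between the two means
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (mean1 - mean2) / se
    dof = n1 + n2 - 2
    return t, se, dof

# Birth-weight summary data from the example slide
t_stat, se_diff, dof = pooled_t(2.91, 0.27, 15, 2.26, 0.22, 10)
print(f"SE = {se_diff:.4f}, t = {t_stat:.3f}, DF = {dof}")
```

The unrounded t differs slightly in the third decimal from the slide's 6.329, because the slide divides by the rounded standard error 0.1027.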
28. EXAMPLE 2:
• A study was carried out to evaluate the effect of a new diet on weight loss. The study population consisted of 12 people who used the diet for 2 months; their weights before and after the diet are given below.
Patient no. | Weight before diet (kg) | Weight after diet (kg)
1 | 75 | 70
2 | 60 | 54
3 | 68 | 58
4 | 98 | 93
5 | 83 | 78
6 | 89 | 84
7 | 65 | 60
8 | 78 | 77
9 | 95 | 90
10 | 80 | 76
11 | 100 | 94
29. FORMULA
$$t = \frac{\bar{d}}{SD(d)/\sqrt{n}}$$
d = difference between x1 and x2
d̄ = average of d
SD(d) = standard deviation of the differences
n = sample size (number of pairs)
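The paired formula can be applied to the diet data as follows. This is a minimal sketch: the example text mentions 12 people, but only the 11 rows that survive in the table are used here, so the result applies to those 11 pairs only. Variable names are illustrative.

```python
import math
import statistics

# Weights (kg) for the 11 patients listed in the table
before = [75, 60, 68, 98, 83, 89, 65, 78, 95, 80, 100]
after = [70, 54, 58, 93, 78, 84, 60, 77, 90, 76, 94]

# d = difference within each pair (before minus after)
diffs = [b - a for b, a in zip(before, after)]
n = len(diffs)
mean_d = statistics.mean(diffs)
sd_d = statistics.stdev(diffs)  # sample SD, divisor n - 1

# t = mean difference divided by its standard error
paired_t = mean_d / (sd_d / math.sqrt(n))
print(f"n = {n}, mean d = {mean_d:.3f}, SD(d) = {sd_d:.3f}, t = {paired_t:.3f}")
```

The statistic would then be compared with the table value of t for n - 1 degrees of freedom.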
30. Z Test
• This test is used for testing the significance of the difference between two means in large samples (n > 30).
• Assumptions to apply the Z test:
The sample must be randomly selected
Data must be quantitative
Samples should be larger than 30
Data should follow a normal distribution
Sample variances should be almost the same in both the groups of study
31. Indications for Z Test
To compare sample mean with population mean
To compare two sample means
To compare sample proportion with population proportion
To compare two sample proportions
32. One tailed and Two tailed Z tests
• Z values on each side of the mean are expressed as +Z or -Z.
• An observed value larger than the mean gives +Z, and an observed value smaller than the mean gives -Z.
33. One tailed vs two tailed
One tailed:
• Used if we want to know whether one particular drug is better than the other
• The result will lie at one end (tail) of the distribution
Two tailed:
• Used if we want to know whether the action of a particular drug is different from that of another
• The p value includes the extreme results at both ends (tails) of the distribution
34. • Find the Z value
$$Z = \frac{\text{Observed mean} - \text{Mean}}{\text{Standard error}}$$
• Compare the calculated Z value with the value in the Z table at the corresponding significance level.
• If the observed Z value is greater than the theoretical Z value, Z is significant: reject the null hypothesis and accept the alternative hypothesis.
35. Example:
Details of a study on arm circumference (cm) of American and Indian preschool children are given below. Can we infer that arm circumference is different between American and Indian children?
Details | American | Indian
No. of subjects | 625 | 625
Mean | 20.5 | 15.5
Standard deviation | 5.0 | 5.4
Null hypothesis: there is no difference between the means
Alternative hypothesis: there is a difference between the means
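$$Z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{SE_1^2 + SE_2^2}}$$
36. Standard error of $(\bar{x}_1 - \bar{x}_2)$:
$$\sqrt{SE_1^2 + SE_2^2} = \sqrt{\frac{SD_1^2}{n_1} + \frac{SD_2^2}{n_2}} = \sqrt{\frac{5^2}{625} + \frac{(5.4)^2}{625}} = 0.2944$$
$$Z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{SE_1^2 + SE_2^2}} = \frac{20.5 - 15.5}{0.2944} = 16.985$$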
This value is GREATER than Z (0.001)= 3.29, taken from the table.
Hence, difference is significant. Null hypothesis is rejected.
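The arm-circumference example can be reproduced with a few lines. This is a minimal sketch using only the summary figures from the slide; `two_sample_z` is an illustrative name.

```python
import math

def two_sample_z(mean1, sd1, n1, mean2, sd2, n2):
    """Z statistic for the difference between two large-sample means."""
    # SE of the difference: square root of the sum of squared standard errors
    se = math.sqrt(sd1**2 / n1 + sd2**2 / n2)
    return (mean1 - mean2) / se, se

# American vs Indian preschool children (arm circumference, cm)
z_stat, se_diff = two_sample_z(20.5, 5.0, 625, 15.5, 5.4, 625)
print(f"SE = {se_diff:.4f}, Z = {z_stat:.3f}")
```

The resulting Z is then compared with the table value quoted on the slide (3.29 at the 0.001 level).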
37. Analysis of Variance (ANOVA)
• Given by Sir Ronald Fisher
• The principal aim of statistical models is to explain the variation in measurements.
• The statistical model for testing the significance of the difference in mean values of a variable between two groups is Student’s ‘t’ test. If there are more than two groups, the appropriate statistical model is Analysis of Variance (ANOVA).
38. Assumptions for ANOVA
1. The sampled populations can be approximated by a normal distribution.
2. All populations have the same standard deviation.
3. Individuals in the population are selected randomly.
4. The samples are independent.
39. • ANOVA compares variances by means of a simple ratio, called the F-ratio:
$$F = \frac{\text{Variance between groups}}{\text{Variance within groups}}$$
• The resulting F statistic is then compared with the critical value of F, obtained from F tables in much the same way as was done with ‘t’.
• If the calculated value exceeds the critical value for the appropriate level of significance, the null hypothesis is rejected.
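To make the F-ratio concrete, here is a minimal sketch of a one-way ANOVA computed by hand. The three groups are hypothetical numbers chosen only for illustration; they do not come from the slides, and `one_way_f` is an illustrative name.

```python
def one_way_f(groups):
    """F ratio = between-group mean square / within-group mean square."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the observations around their own group mean
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within

# Hypothetical data: three groups of three observations each
hypothetical_groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_ratio = one_way_f(hypothetical_groups)
print(f"F = {f_ratio:.2f} with ({len(hypothetical_groups) - 1}, 6) degrees of freedom")
```

The computed F would be compared with the critical F for (k - 1, N - k) degrees of freedom.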
40. • An F test is therefore a test of the ratio of variances. F tests can also be used on their own, independently of the ANOVA technique, to test hypotheses about variances.
• In ANOVA, the F test is used to establish whether a statistically significant difference exists in the data being tested.
• ANOVA can be one-way or two-way.
41. One Way ANOVA
If the various experimental groups differ in terms of only one factor at a time, a one-way ANOVA is used.
E.g.: a study to assess the effectiveness of four different antibiotics on S. sanguis
42. Two Way ANOVA
If the various groups differ in terms of two factors at a time, a two-way ANOVA is performed.
E.g.: a study to assess the effectiveness of four different antibiotics on S. sanguis in three different age groups
43. ANCOVA (Analysis of covariance )
• If all the independent variables are categorical, ANOVA is used.
• However, if some of the independent variables are categorical and some are continuous, ANCOVA is appropriate.
• Example: a study in which the goal was to test the effects of antihypertensive drugs on systolic blood pressure (a continuous dependent variable), and the independent variables were age (continuous) and treatment (categorical).
44. Indications
1. To adjust for sources of bias in observational studies, and to remove the effect of disturbing variables
2. To study regression in multiple classifications
45. • ANCOVA is an extension of ANOVA that combines ANOVA with regression to measure the differences among group means.
• The advantages that ANCOVA has over other techniques are:
1. The ability to reduce the error variance in the outcome measure.
2. The ability to measure group differences after allowing for other differences
between subjects.
46. CORRELATION
• Relationship or association between two quantitatively measured or
continuous variables
• Eg : Height and weight , temperature and pulse , age and vital capacity , etc..
• The extent of relationship of two quantitative variables is measured by
Pearson’s correlation coefficient. It is denoted by letter ‘r’.
47. Pearson’s Correlation Coefficient
• Pearson’s correlation coefficient (r) is a measure of the strength of the association between the two variables.
Assumptions made in calculation of ‘r’
1. Subjects selected for study, with pairs of values of X and Y, are chosen by a random sampling procedure.
2. Both X and Y are continuous variables and are assumed to follow a normal distribution.
48. Steps
• The first step in studying the relationship between two continuous variables is to draw a scatter plot of the variables to check for linearity.
• Conventionally, the independent variable is plotted on the X axis and the dependent variable on the Y axis.
• The nearer the scatter of points is to a straight line, the higher the strength of association between the variables.
50. REGRESSION
• Regression describes the statistical relationship between two or more variables.
• In simple regression, we have only two variables: one (the independent variable) is taken to explain the behaviour of the other (the dependent variable).
• Regression means change in the measurements of one variable, on the positive
or negative side beyond the mean.
• Regression coefficient (b) is a measure of the change in one dependent
character (Y) with one unit change in the independent character (X).
51. • Correlation gives an idea of the degree and direction of the relationship between two variables, whereas regression analysis enables us to predict the values of one variable on the basis of the other.
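The difference between r and b can be seen by computing both from the same data. This is a minimal sketch using the usual sums-of-products formulas; the five (x, y) pairs are hypothetical and the function name is illustrative.

```python
import math

def pearson_r_and_slope(x, y):
    """Return Pearson's r and the regression coefficient b of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    r = sxy / math.sqrt(sxx * syy)  # strength and direction of association
    b = sxy / sxx                   # change in y per unit change in x
    return r, b

# Hypothetical paired observations (not from the slides)
x_values = [1, 2, 3, 4, 5]
y_values = [2, 4, 5, 4, 5]
r, b = pearson_r_and_slope(x_values, y_values)
print(f"r = {r:.3f}, b = {b:.2f}")
```

Here r lies between -1 and +1, while b is expressed in units of y per unit of x and can be used for prediction.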
52. Chi square (X2) test
• Chi-square is an important continuous probability distribution, first formulated by Helmert and then developed by Karl Pearson.
• This test is used for testing qualitative data.
• Its advantage over the Z test is:
– It can be applied to smaller samples as well as to large samples.
• Prerequisites for the chi square (X2) test to be applied:
– The sample must be a random sample
– None of the observed values must be zero
– Adequate cell size
53. Steps in calculating the X2 value
1. Make a contingency table mentioning the frequencies in all cells.
2. Determine the expected value (E) in each cell.
3. Calculate the difference between observed and expected values in each cell (O-E).
54. 4. Calculate the X2 value for each cell: (O-E)²/E.
5. Sum the X2 values of all cells to get the X2 value of the table.
6. Find the p value from the X2 table.
7. If p > 0.05, the difference is not significant and the null hypothesis is accepted;
if p < 0.05, the difference is significant and the null hypothesis is rejected.
55. Example
• Consider a study done in a hospital in which cases of breast cancer were compared with controls from the normal population with respect to a family history of Ca breast.
• 100 subjects in each group were studied for the presence of a family history.
• 25 of the cases and 15 of the controls had a positive family history.
• Comment on the significance of family history in breast cancer.
56. • The numbers suggest that a family history is about 1.67 (25/15) times more common in Ca breast.
• So is it a risk factor in population?
• We need to test for the significance of this difference.
• We shall apply X2 test.
57. Solution
Step – 1: Set up a null hypothesis
– H0: “There is no significant difference between incidence of family history among cases and controls.”
Step – 2: Define the alternative hypothesis
– Ha: “There is a significant difference between incidence of family history among cases and controls.”
58. Step – 3: Calculating X2
1. Make a contingency table mentioning the frequencies in all cells
61. Step – 4: Determine degree of freedom.
• DoF is determined by the formula:
DoF = (r-1) x (c-1)
where r and c are the number of rows and columns
respectively
• Here, r = c = 2.
• Hence, DoF = (2-1) x (2-1) = 1
62. Step – 5: Find out the corresponding P Value
– P values can be derived by using the X2 distribution tables
63. Step – 6: Accept or reject the Null hypothesis
• In given scenario,
X2 = 3.125
• This is less than 3.84 (for P = 0.05 at dof =1)
• Hence Null hypothesis is Accepted, i.e.,
“There is no significant difference between incidence of family history among
cases and controls”
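The whole calculation can be retraced in code. This is a minimal sketch that builds the 2 x 2 table from the counts stated in the example (25 of 100 cases and 15 of 100 controls with a positive family history) and applies the steps listed earlier; `chi_square` is an illustrative name and no continuity correction is applied.

```python
def chi_square(table):
    """Chi-square statistic and degrees of freedom for an r x c contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count = row total x column total / grand total
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, dof

# Rows: cases, controls; columns: family history present, absent
observed_table = [[25, 75],
                  [15, 85]]
chi2_value, dof = chi_square(observed_table)
print(f"X2 = {chi2_value:.3f}, DoF = {dof}")
```

With DoF = 1 the statistic is compared with the table value 3.84 at P = 0.05, as on the slide.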
64. The Sign test (for 2 repeated/correlated
measures)
• Sign test is used to find out the statistical significance of differences in matched
pair comparisons.
• The sign test is one of the simplest nonparametric tests.
• It is based on the + or – signs of the differences within pairs and not on their numerical magnitudes.
• For each subject, subtract the 2nd score from the 1st, and write down the sign of the difference.
• Test the null hypothesis that + and – signs are equally likely, i.e., each occurs with probability ½.
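One way to carry out the last step is an exact binomial calculation. This is a minimal sketch, assuming a two-sided test and ignoring zero differences; the counts of 7 plus signs and 4 minus signs are hypothetical, and `sign_test_p` is an illustrative name.

```python
from math import comb

def sign_test_p(n_plus, n_minus):
    """Two-sided exact p value for the sign test under H0: P(+) = P(-) = 1/2."""
    n = n_plus + n_minus
    k = min(n_plus, n_minus)
    # Probability of k or fewer of the rarer sign under Binomial(n, 1/2)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2**n
    return min(1.0, 2 * tail)

# Hypothetical counts of signs among 11 matched pairs
p_value = sign_test_p(7, 4)
print(f"p = {p_value:.4f}")
```

A p value above the chosen significance level means the null hypothesis of equally likely signs is not rejected.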
65. Wilcoxon Matched- Pairs Signed-Ranks Test
• Used to compare two related samples, e.g., matched pairs or the same subjects studied before and after an experiment.
• Used to determine both direction and magnitude of difference between
matched values.
• This test is analogous to paired t- test.
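66. • Procedure:
We first find the difference (d) within each pair, rank the differences by absolute value, attach the sign of each difference to its rank, and calculate the test statistic T as the smaller of the sums of the positive and negative ranks.
• In a tie situation, we assign ranks to the tied differences by averaging their ranks.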
67. Example
• The duration of endurance of pain by eleven mice before and after administration of a drug (adrenaline 0.4 mg/20 g body weight) is given below. Is there sufficient evidence in the data to say that the drug increases the duration of endurance of pain?
Before drug (1) | After drug (2) | Diff. (2-1) | Rank of diff. | Rank with sign
15.5 | 21.2 | +5.7 | 8 | +8
12.7 | 20.1 | +7.4 | 11 | +11
14.8 | 17.2 | +2.4 | 7 | +7
16.7 | 22.7 | +6.0 | 9 | +9
20.1 | 20.0 | -0.1 | 1 | -1
22.0 | 19.8 | -2.2 | 6 | -6
20.2 | 19.8 | -0.4 | 3 | -3
18.1 | 18.8 | +0.7 | 5 | +5
17.6 | 17.9 | +0.3 | 2 | +2
17.4 | 24.3 | +6.9 | 10 | +10
19.1 | 18.6 | -0.5 | 4 | -4
68. • Sum of the negative ranks = 14
• Sum of the positive ranks = 52
• The null hypothesis that the drug has no effect is tested using the smaller of the
sums of positive and of negative ranks, which in this case is 14.
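• The table value of T at the five percent level of significance when n = 11 is 11.
• In this test the null hypothesis is rejected only when the calculated T is less than or equal to the table value. Since the value of T in our example (14) is more than 11, we cannot reject the null hypothesis: the data do not provide sufficient evidence that the drug increases the duration of endurance of pain.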
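The rank sums for the mice example can be recomputed directly from the before and after values. This is a minimal sketch; the helper `average_ranks` is an illustrative name and gives tied values their average rank, as the procedure describes.

```python
def average_ranks(values):
    """Rank values from smallest to largest; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of positions i+1 .. j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg
        i = j + 1
    return ranks

before = [15.5, 12.7, 14.8, 16.7, 20.1, 22.0, 20.2, 18.1, 17.6, 17.4, 19.1]
after = [21.2, 20.1, 17.2, 22.7, 20.0, 19.8, 19.8, 18.8, 17.9, 24.3, 18.6]

# Differences (after minus before), rounded to remove floating-point noise
diffs = [round(a - b, 1) for b, a in zip(before, after)]
nonzero = [d for d in diffs if d != 0]  # zero differences are dropped
ranks = average_ranks([abs(d) for d in nonzero])

t_plus = sum(r for r, d in zip(ranks, nonzero) if d > 0)
t_minus = sum(r for r, d in zip(ranks, nonzero) if d < 0)
t_stat = min(t_plus, t_minus)
print(f"T+ = {t_plus}, T- = {t_minus}, T = {t_stat}")
```

T is then compared with the table value for n = 11; the null hypothesis is rejected only if T is at or below that value.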
69. McNemar Test
• One of the important non parametric tests often used when the data happen to be
nominal and relate to two related samples.
• This test is useful with before-after measurement of the same subjects
• This test tries to judge the significance of any observed change in views of the same
subject before and after the treatment.
71. Rank Sum Tests
• Commonly used rank sum tests include the U test and the H test.
• The U test is popularly known as the Wilcoxon-Mann-Whitney test.
• The H test is also known as the Kruskal-Wallis test.
72. Mann- Whitney Test or (U test)
• This test is used to determine whether two independent samples have been drawn
from the same population
• It analyses the degree of separation (or the amount of overlap) between the
Experimental and Control groups.
• It is an alternative to Student’s t-test.
• It requires at least an ordinal level of measurement.
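73. The Mann-Whitney U-test formula is given by
$$U = n_1 n_2 + \frac{n_1(n_1 + 1)}{2} - R_1$$
where n1 and n2 are the sample sizes, and R1 and R2 are the sums of ranks assigned to groups I and II. The corresponding statistic for group II uses n2(n2+1)/2 and R2 in place of n1(n1+1)/2 and R1.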
74. Procedure
All the observations in the two samples are ranked numerically from smallest to largest, without regard to group
Then identify the observations belonging to samples I and II
The sums of ranks for samples I and II are determined separately
Take the difference of the two sums, T = R1 - R2
75. Example
Data for two independent groups:
Sample 1: 53, 38, 69, 57, 46, 39, 73, 48, 73, 74, 60 and 78
Sample 2: 44, 40, 61, 52, 32, 44, 70, 41, 67, 72, 53 and 72
Test, at the 10% level, the hypothesis that they come from populations with the same mean.
76. Illustration of direct calculation of the U statistic
Value (ascending order) | Rank | Sample (A = sample 1, B = sample 2)
32 | 1 | B
38 | 2 | A
39 | 3 | A
40 | 4 | B
41 | 5 | B
44 | 6.5 | B
44 | 6.5 | B
46 | 8 | A
48 | 9 | A
52 | 10 | B
53 | 11.5 | B
53 | 11.5 | A
57 | 13 | A
60 | 14 | A
61 | 15 | B
67 | 16 | B
69 | 17 | A
70 | 18 | B
72 | 19.5 | B
72 | 19.5 | B
73 | 21.5 | A
73 | 21.5 | A
74 | 23 | A
78 | 24 | A
R1 = 2+3+8+9+11.5+13+14+17+21.5+21.5+23+24 = 167.5
R2 = 1+4+5+6.5+6.5+10+11.5+15+16+18+19.5+19.5 = 132.5
$$U = n_1 n_2 + \frac{n_1(n_1 + 1)}{2} - R_1 = 12 \times 12 + \frac{12(12 + 1)}{2} - 167.5 = 144 + 78 - 167.5 = 54.5$$
77.
$$\mu_U = \frac{n_1 n_2}{2} = \frac{12 \times 12}{2} = 72$$
$$\sigma_U = \sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}} = \sqrt{\frac{12 \times 12 \times (12 + 12 + 1)}{12}} = 17.32$$
• Upper limit = μU + 1.64 σU = 72 + 1.64(17.32) = 100.40
• Lower limit = μU – 1.64 σU = 72 – 1.64(17.32) = 43.60
• As the observed value of U is 54.5, which lies in the acceptance region, the null hypothesis is accepted.
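The U example can be reproduced end to end, from ranking to the acceptance limits. This is a minimal sketch using the normal approximation and the 1.64 multiplier from the slide; the helper `average_ranks` is an illustrative name.

```python
import math

def average_ranks(values):
    """Rank values from smallest to largest; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg
        i = j + 1
    return ranks

sample1 = [53, 38, 69, 57, 46, 39, 73, 48, 73, 74, 60, 78]
sample2 = [44, 40, 61, 52, 32, 44, 70, 41, 67, 72, 53, 72]
n1, n2 = len(sample1), len(sample2)

# Rank all observations together, then split the ranks back by sample
ranks = average_ranks(sample1 + sample2)
r1 = sum(ranks[:n1])
r2 = sum(ranks[n1:])

u = n1 * n2 + n1 * (n1 + 1) / 2 - r1
mu_u = n1 * n2 / 2
sigma_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
lower, upper = mu_u - 1.64 * sigma_u, mu_u + 1.64 * sigma_u
print(f"R1 = {r1}, R2 = {r2}, U = {u}, limits = ({lower:.2f}, {upper:.2f})")
```

If U falls between the two limits, the null hypothesis is retained at the 10% level.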
78. Kruskal-Wallis Test (or H test)
• This test is used to test the null hypothesis that ‘k’ independent random samples come from identical universes, against the alternative hypothesis that they do not.
• This test is analogous to one-way ANOVA.
• It does not require the assumption that the samples come from normal populations or from universes having the same standard deviation.
$$H = \frac{12}{N(N + 1)} \sum_{j=1}^{k} \frac{R_j^2}{n_j} - 3(N + 1)$$
where N = n1 + n2 + … + nk, and Rj is the sum of ranks in the j-th sample of size nj
79. Example
• Use the Kruskal-Wallis test at the 5% level of significance to test the null hypothesis that a professional bowler performs equally well with the four bowling balls.
Bowling results in five games:
With ball A | 271 | 282 | 257 | 248 | 262
With ball B | 252 | 275 | 302 | 268 | 276
With ball C | 260 | 255 | 239 | 246 | 266
With ball D | 279 | 242 | 297 | 270 | 258
80.
81. Bowling results with different balls and corresponding ranks
$$H = \frac{12}{20(20 + 1)}\left[\frac{52^2}{5} + \frac{37^2}{5} + \frac{75^2}{5} + \frac{46^2}{5}\right] - 3(20 + 1)$$
$$= (0.02857)(2362.8) - 63 = 67.51 - 63 = 4.51$$
82. • To test the null hypothesis that the bowler performs equally well with the four balls, the table value of x2 for (k-1) = 4-1 = 3 degrees of freedom at the 5% level of significance is 7.82.
• Since the calculated value of H is only 4.51 and does not exceed the x2 table value, the null hypothesis is accepted.
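The H statistic for the bowling example can be recomputed from the raw scores. This is a minimal sketch that ranks from lowest to highest score; the rank sums on the slide (52, 37, 75, 46) correspond to ranking from highest to lowest, and H comes out the same either way. The scores contain no ties, so plain ranking is enough here.

```python
scores = {
    "A": [271, 282, 257, 248, 262],
    "B": [252, 275, 302, 268, 276],
    "C": [260, 255, 239, 246, 266],
    "D": [279, 242, 297, 270, 258],
}

# Rank all 20 scores together (1 = lowest); valid because there are no ties
all_scores = sorted(s for games in scores.values() for s in games)
rank_of = {s: i + 1 for i, s in enumerate(all_scores)}

n_total = len(all_scores)
rank_sums = {ball: sum(rank_of[s] for s in games) for ball, games in scores.items()}

# H = 12 / (N(N+1)) * sum(Rj^2 / nj) - 3(N+1)
h_stat = 12 / (n_total * (n_total + 1)) * sum(
    rank_sums[ball] ** 2 / len(games) for ball, games in scores.items()
) - 3 * (n_total + 1)
print(f"rank sums = {rank_sums}, H = {h_stat:.2f}")
```

H is compared with the x2 table value for k - 1 = 3 degrees of freedom (7.82 at the 5% level).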
83. Spearman’s rank correlation coefficient
• Developed by Charles Spearman.
• Rank correlation is a measure of correlation that exists between two sets of ranks.
• The procedure consists of ranking the two sets of values X and Y, and computing the difference d of each pair. The d’s are then squared and added.
$$r_s = 1 - \frac{6\sum d^2}{n(n^2 - 1)}$$
where d = difference in ranks and n = number of pairs
85.
$$r_s = 1 - \frac{6 \times 56}{10 \times 99} = 1 - 0.339 = 0.661$$
Referring to the table for n = 10 at the 5% level of significance, we find the calculated value greater than the table value.
Hence the null hypothesis is rejected, and it is concluded that fasting blood sugar and systolic blood pressure are correlated.
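The arithmetic above is easy to check in code. This is a minimal sketch that starts from the sum of squared rank differences (56) and the number of pairs (10) given on the slide, since the underlying ranks are not shown; the function name is illustrative.

```python
def spearman_from_sum_d2(sum_d2, n):
    """Spearman's rank correlation from the sum of squared rank differences."""
    return 1 - 6 * sum_d2 / (n * (n**2 - 1))

# Values taken from the worked example: sum of d^2 = 56, n = 10 pairs
rs = spearman_from_sum_d2(56, 10)
print(f"rs = {rs:.3f}")
```

This shortcut formula is exact only when there are no tied ranks.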
86. Kendall’s coefficient of concordance
• It is represented by symbol W.
• It determines the degree of association among several (k) sets of ranking of N
objects or individuals.
• When there are only two sets of rankings of N objects, we generally work out Spearman’s coefficient of correlation, but Kendall’s coefficient of concordance (W) is considered an appropriate measure for studying the degree of association among three or more sets of rankings.
$$W = \frac{s}{\frac{1}{12}K^2(N^3 - N)}, \qquad s = \sum (R_j - \bar{R}_j)^2$$
87. Example
Seven individuals have been assigned ranks by four judges at a certain music competition.
Is there a significant agreement in ranking assigned by different judges? Test at 5 % level.
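88. Solution
$$\bar{R}_j = \frac{\sum R_j}{N} = \frac{112}{7} = 16$$
$$s = 332, \qquad W = \frac{s}{\frac{1}{12}K^2(N^3 - N)} = \frac{332}{\frac{1}{12}(4)^2(7^3 - 7)} = \frac{332}{\frac{16}{12}(336)} = \frac{332}{448} = 0.741$$
The null hypothesis is rejected: there is significant agreement in the rankings assigned by the different judges at the 5% level of significance.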
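The value of W can be verified from the figures quoted in the solution. This is a minimal sketch that takes s = 332, K = 4 judges and N = 7 individuals as given, because the table of ranks itself is not reproduced in the slides; the function name is illustrative.

```python
def kendalls_w(s, k, n):
    """Kendall's coefficient of concordance from the sum of squared deviations s."""
    # Denominator is the largest value s can take: k^2 (n^3 - n) / 12
    max_s = k**2 * (n**3 - n) / 12
    return s / max_s

# Figures from the worked example: s = 332, K = 4 judges, N = 7 individuals
w = kendalls_w(332, 4, 7)
print(f"W = {w:.3f}")
```

W ranges from 0 (no agreement) to 1 (perfect agreement among the judges).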
89. Limitations of tests of significance
Testing of a hypothesis is not decision making in itself, but it helps in decision making.
Tests do not explain why a difference exists; they say nothing about the reason causing the difference.
Tests are based on probabilities and as such cannot be expressed with full certainty.
Statistical inferences based on significance tests cannot be regarded as entirely conclusive evidence concerning the truth of the hypothesis.
91. REFERENCES
• Rao KV. Biostatistics, A manual of statistical method for use in Health, Nutrition and
Anthropology. Jaypee Publications
• Mahajan BK. Methods in biostatistics. 7 th edition. Jaypee publications
• Text book of Research methodology by C.R.Kothari.
• K. Park. Park’s text book of Preventive and Social Medicine. 19th edition.
• Essentials of preventive and community dentistry, 3rd edition by Soben Peter.
Editor's Notes
Statistical inference is the branch of statistics which is concerned with using probability concept to deal with uncertainty in decision making.
It is the process of drawing various conclusions about the population characteristics (called parameter) from the sample values.
The estimates differ for different samples drawn from the same population- this difference is called as sampling variability
The null hypothesis represents the hypothesis we are trying to reject.
The alternative hypothesis is usually the one we wish to prove.
It represents all other possibilities of a hypothesis
P values are used to assess the degree of dissimilarity between two or more sets of measurements, or between one set of measurements and a standard. If the calculated test statistic is greater than the table value at the chosen significance level (i.e., the p value is smaller than that level), the null hypothesis is rejected and the alternative hypothesis is accepted.
PARAMETRIC TESTS –depends on the parameter or parametric characteristics
Values are independent
Normally distributed
Equal variances- population
Measured @ interval level hence use of arithmetic operation
NON- PARAMETRIC TESTS- when conditions of parametric are not met.
Size of sample- small
Normality of distribution- doubtful
Measurement- ordinal or nominal form
Qualitative data: information that cannot actually be measured
E.g.: colour of eyes.
Quantitative data: information that can be measured and written down with numbers.
E.g.: size, length, height
Used when data are measured on approximate interval or ratio scales of measurement.Data should follow normal distribution
Variance very high in relative to mean
Sample variances are mostly same in both the groups under the study
In the one sample t-test, we know the population mean. We draw a random sample from the population, compare the sample mean with the population mean, and make a statistical decision as to whether or not the sample mean is different from the population mean
SD: standard deviation of the sample
SE: standard error
n: sample size
x1 - x2: difference between the 2 means
The degrees of freedom are the number of independent members in the sample
If the calculated value is greater than the table value at the given p level, the null hypothesis is rejected and the alternative hypothesis is accepted.
If the SD of the populations is known, a Z test can be applied even if the sample is smaller than 30
-1 ≤ r ≤ +1
The correlation coefficient should not be calculated if the relationship is not linear
For correlation only purposes, it does not matter on which axis the variables are plotted
Types of regression analysis include:
Linear
Logistic, polynomial, stepwise, ridge, lasso (least absolute shrinkage and selection operator)
Elastic net
If the dependent variable and all of the independent variables are continuous, the correct type of multivariable analysis is multiple linear regression
Subtracting E from O, squaring the result and dividing it by the expected value gives the X2 value of each cell; summing over all cells gives the X2 value of the table.
As we are to test the null hypothesis that the two archeaologists x and y are equally good and if that is so, the no of pluses and minuses should be equal and as such p = ½ and q= ½ . Hence, the std error of prop of success, given the null hypothesis and size of sample, we have
Mean and the standard deviation. Z value for 0.45 area of curve is 1.64
Based on the area under normal curve at 10% level of significance, the upper and lower limits are assigned.
K no of judges
N no. of objects ranked
Rj sum of ranks
But the worked out value is 332, which is higher than table value, which shows that 0.741 is significant. hence , we reject null hypothesis.
Choice of an appropriate statistical test is very important as it decides the fate of outcome of the study.
They are only the tools for analyzing data & should never be used as a substitute for knowledgeable interpretation of outcomes.