2. Introduction
Definition
Degree of Freedom
The contingency table
Types of test
Goodness of fit test
Null and alternate hypothesis
Characteristics of chi-square test
Application of chi-square test
Determination of chi-square test with example
Conclusion
Reference
3. The chi-square test is one of the most commonly used non-parametric
tests. When the null hypothesis is true, the sampling distribution of its
test statistic is a chi-square distribution.
The chi-square test is a useful measure for comparing experimentally
obtained results with those expected theoretically on the basis of a
hypothesis.
It can be applied when there are few or no assumptions about
the population parameters.
It can be applied to categorical or qualitative data using a
contingency table.
4. A chi-square statistic measures how expectations compare to
actual observed data.
Null Hypothesis - Observed = Expected
Alternate Hypothesis - Observed is not equal to Expected
It was first used by Karl Pearson in the year 1900.
It is denoted by the Greek symbol χ² (chi squared).
5. Following is the formula:
χ² = Σ (O − E)² / E
It is a mathematical expression comparing the experimentally obtained
result (O) with the theoretically expected result (E) based on a certain
hypothesis. It uses data in the form of frequencies (i.e., the number of
occurrences of an event).
For each class, the square of the deviation between the observed and
expected frequencies is divided by the expected frequency; chi-square is
the sum of these values over all classes.
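The calculation can be sketched in a few lines of Python: for each class, square the deviation O − E, divide by E, and add up the results. This is a minimal illustration, not part of the original slides; the function name chi_square_statistic and the three-class frequencies are hypothetical.

```python
def chi_square_statistic(observed, expected):
    """Return the sum over all classes of (O - E)^2 / E."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same number of classes")
    if any(e <= 0 for e in expected):
        raise ValueError("every expected frequency must be positive")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical frequencies for three classes (not taken from the slides).
observed = [18, 22, 20]
expected = [20, 20, 20]
statistic = chi_square_statistic(observed, expected)
print(f"chi-square = {statistic:.2f}")
```

A statistic close to zero means the observed frequencies sit close to the expected ones; larger values mean larger overall deviation.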
6. In a χ² test, when comparing the calculated value of χ² with the table
value, we have to calculate the degrees of freedom. The degrees of
freedom are calculated from the number of classes.
The number of degrees of freedom in a χ² test is equal to the
number of classes/categories minus one.
If there are two, three, or four classes, the degrees of freedom are
2-1, 3-1, and 4-1, respectively. In a contingency table,
the degrees of freedom are calculated in a different manner:
d.f. = (r-1)(c-1)
where r = number of rows in the table,
c = number of columns in the table.
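Both rules for the degrees of freedom can be written as small helper functions. The names df_for_classes and df_for_contingency_table are hypothetical and chosen only for this sketch.

```python
def df_for_classes(num_classes):
    """Degrees of freedom for a test on k classes: k - 1."""
    if num_classes < 2:
        raise ValueError("at least two classes are needed")
    return num_classes - 1


def df_for_contingency_table(rows, columns):
    """Degrees of freedom for an r x c contingency table: (r - 1)(c - 1)."""
    if rows < 2 or columns < 2:
        raise ValueError("the table needs at least two rows and two columns")
    return (rows - 1) * (columns - 1)


for k in (2, 3, 4):
    print(f"{k} classes -> d.f. = {df_for_classes(k)}")
print(f"2 x 2 table -> d.f. = {df_for_contingency_table(2, 2)}")
```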
7. The term "contingency table" was first used by Karl Pearson.
A contingency table is a type of table in a matrix format that displays
the frequency distribution of the variables.
Contingency tables are heavily used in survey research, business
intelligence, engineering and scientific research. They provide a basic
picture of the interrelation between two variables and can help find
interactions between them.
The table value of χ² depends on the number of classes (that is, on the
number of degrees of freedom) and on the critical level of probability.
2×2 table: when there are only two samples, each divided into two classes,
a 2×2 contingency table is prepared. It is also known as a fourfold or
four-cell table.
               Column 1   Column 2   Row total
Row 1          +          +          RT 1
Row 2          +          +          RT 2
Column total   CT 1       CT 2
Degrees of freedom = (r-1)(c-1)
= (2-1)(2-1)
= 1 × 1
= 1
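As an illustration of the fourfold table, the sketch below fills the four cells with hypothetical counts, then derives the row totals (RT), column totals (CT) and degrees of freedom. The last step uses the standard rule for a test of independence, expected count = row total × column total / grand total; that rule is an assumption added here and is not spelled out in the slides.

```python
# Hypothetical 2 x 2 table of counts (not taken from the slides).
table = [
    [30, 20],  # Row 1
    [10, 40],  # Row 2
]

row_totals = [sum(row) for row in table]            # RT 1, RT 2
column_totals = [sum(col) for col in zip(*table)]   # CT 1, CT 2
grand_total = sum(row_totals)

r, c = len(table), len(table[0])
degrees_of_freedom = (r - 1) * (c - 1)

# Expected counts if the row and column variables were independent
# (standard convention, assumed here rather than stated in the slides).
expected = [
    [rt * ct / grand_total for ct in column_totals]
    for rt in row_totals
]

print("Row totals:", row_totals)
print("Column totals:", column_totals)
print("d.f.:", degrees_of_freedom)
print("Expected counts:", expected)
```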
8. Chi-square performs two types of test:
1) The goodness of fit test (a single categorical variable)
2) The test of independence (between multiple categorical
variables)
9. The goodness of fit test is a statistical hypothesis test to see how well
sample data fit the distribution expected under a hypothesis about the
population.
EXAMPLE -
We sampled the employees of an office and obtained the following counts:
         Observed   Expected   (Observed -   (Observed -   (Observed -
                               Expected)     Expected)²    Expected)² / E
MALE     13         10         3             9             0.9
FEMALE   7          10         -3            9             0.9
10. Null Hypothesis: a 50:50 ratio of male to female employees exists in
the office.
Alternate Hypothesis: a 50:50 ratio does not exist in the office, that is,
either male employees < female employees or male employees > female
employees.
11. o Degrees of freedom in our example => Number of classes/categories - 1
=> 2 - 1 = 1
o Calculated chi-square value = 0.9 + 0.9 = 1.8
o Critical value from the chart (5% level, 1 d.f.) = 3.84
o If the calculated chi-square value is less than or equal to the chart
value, we cannot reject the null hypothesis. Here 1.8 < 3.84, so the null
hypothesis of a 50:50 ratio is not rejected.
o Null hypothesis - Observed = Expected
o Alternate hypothesis - Observed is not equal to Expected
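The office example can be reproduced step by step. This is a minimal sketch, assuming the tabulated 5% critical value for 1 d.f. is 3.84 (3.8 when rounded to one decimal). The cross-check at the end uses the fact that, for 1 d.f., the probability of exceeding a value x is erfc(sqrt(x / 2)); the variable names are hypothetical.

```python
import math

observed = {"male": 13, "female": 7}
total = sum(observed.values())

# Under the null hypothesis of a 50:50 ratio, each class expects half the total.
expected = {group: total / len(observed) for group in observed}

statistic = sum(
    (observed[g] - expected[g]) ** 2 / expected[g] for g in observed
)
degrees_of_freedom = len(observed) - 1

critical_value = 3.84  # table value at the 5% level for 1 d.f.
reject_null = statistic > critical_value

# Cross-check of the table value: tail probability beyond it for 1 d.f.
tail_at_critical = math.erfc(math.sqrt(critical_value / 2))

print(f"chi-square = {statistic:.1f}, d.f. = {degrees_of_freedom}")
print(f"reject null hypothesis: {reject_null}")
print(f"tail probability at critical value = {tail_at_critical:.3f}")
```

Because the statistic stays below the critical value, the data give no grounds to reject the 50:50 hypothesis.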
12. The chi-square distribution has some important characteristics:
i. This test is based on frequencies, whereas in other theoretical
distributions the test is based on the mean and standard deviation.
ii. The other distributions can be used for testing the significance of the
difference between a single expected value and an observed proportion.
However, this test can be used for testing the difference between the
entire set of expected and observed frequencies.
iii. A new chi-square distribution is formed for every increase in the
number of degrees of freedom.
iv. This test is applied for testing hypotheses but is not useful for
estimation.
13. The chi-square test is applicable to varied problems in agriculture,
biology and medical science:
A. To test the goodness of fit.
B. To test the independence of attributes.
C. To test the homogeneity of independent estimates of the
population variance.
D. To detect linkage.
14. Example - Two varieties of snapdragon, one with red flowers and the other with
white flowers, were crossed. The results obtained in the F2 generation are: 22 red,
52 pink, and 23 white flower plants. It is desired to ascertain whether these figures
show that segregation occurs in the simple Mendelian ratio of 1:2:1.
Solution -
Null hypothesis - H0: The genes carrying the red colour and white colour
characters are segregating in the simple Mendelian ratio of 1:2:1.
Expected frequencies -
Red = 1/4 × 97 = 24.25
Pink = 2/4 × 97 = 48.50
White = 1/4 × 97 = 24.25
                         Red      Pink     White    Total
Observed frequency (O)   22       52       23       97
Expected frequency (E)   24.25    48.50    24.25    97
Deviation (O-E)          -2.25    3.50     -1.25
χ² = Σ (O − E)² / E
= 5.06/24.25 + 12.25/48.50 + 1.56/24.25
= 0.21 + 0.25 + 0.06
= 0.53 Ans. (0.526 before the individual terms are rounded)
15. The calculated chi-square value (0.53) is less than the tabulated
chi-square value (5.99) at the 5% level of probability for 2 d.f. The null
hypothesis is therefore not rejected: the observed segregation is in
agreement with the 1:2:1 ratio.
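The snapdragon calculation can be checked with a short script. This is a sketch under one stated assumption: for 2 d.f. the probability of exceeding a value x is exp(-x / 2), so the 5% critical value can be computed as -2 ln(0.05) instead of being read from a table. The variable names are hypothetical.

```python
import math

observed = {"red": 22, "pink": 52, "white": 23}
ratio = {"red": 1, "pink": 2, "white": 1}  # Mendelian 1:2:1

total = sum(observed.values())
ratio_sum = sum(ratio.values())

# Expected frequency of each class = its share of the ratio times the total.
expected = {colour: total * ratio[colour] / ratio_sum for colour in observed}

statistic = sum(
    (observed[c] - expected[c]) ** 2 / expected[c] for c in observed
)
degrees_of_freedom = len(observed) - 1

# Closed form valid only for 2 d.f.: P(chi-square > x) = exp(-x / 2).
critical_value = -2 * math.log(0.05)

print("Expected:", expected)
print(f"chi-square = {statistic:.2f}, d.f. = {degrees_of_freedom}")
print(f"critical value at 5% = {critical_value:.2f}")
print("reject null hypothesis:", statistic > critical_value)
```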
16. I. Notes provided by subject teacher Mrs. Maya Shedpure.
II. Khan and Khanum, Fundamentals of Biostatistics.
III. Search engines (Google, YouTube, websites).