The document discusses Chi Square distribution and analysis of frequency using Fisher's exact test and McNemar's test. It provides the assumptions, data arrangement, hypotheses, test statistic, decision rules, and example application of both tests for categorical paired and unpaired data. Fisher's exact test is used for 2x2 contingency tables when sample sizes are small, while McNemar's test analyzes paired nominal data to evaluate hypotheses about proportions between pairs.
Chi square(hospital admin)
1. Chi Square distribution and analysis of frequency
Dr Inn Kynn Khaing
M.B.,B.S, MPH, MSc (Healthcare Administration) (Japan)
Lecturer, Department of Biostatistics
3. The chi-square test is not an appropriate method of analysis if minimum
expected frequency requirements are not met.
For example, if n is less than 20, or if n is between 20 and 40 and one of the
expected frequencies is less than 5, the chi-square test should be avoided.
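The rule of thumb above can be checked mechanically. The following Python sketch computes the expected frequencies of a 2 x 2 table from its margins and applies the rule exactly as it is worded on this slide. The function names and the example counts are illustrative assumptions, not part of the original slides.

```python
def expected_frequencies(table):
    """Expected counts under independence: row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi_square_should_be_avoided(table):
    """Rule quoted above: n < 20, or 20 <= n <= 40 with any expected count < 5."""
    n = sum(sum(row) for row in table)
    smallest = min(min(row) for row in expected_frequencies(table))
    return n < 20 or (20 <= n <= 40 and smallest < 5)


# Illustrative counts: rows = samples, columns = with / without the characteristic.
observed = [[8, 4], [2, 7]]
print(expected_frequencies(observed))
print(chi_square_should_be_avoided(observed))
```

With these counts n = 21 and the smallest expected frequency is 90/21 (about 4.3), so the rule advises against the chi-square test.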
In such cases the Fisher exact test may be used instead. It is called exact
because, if desired, it permits us to calculate the exact probability of
obtaining the observed results or results that are more extreme.
Assumptions
1. The data consist of A sample observations from population 1 and B
sample observations from population 2.
2. The samples are random and independent.
3. Each observation can be categorized as one of two mutually exclusive
types.
4. Data Arrangement
When we use the Fisher exact test, we arrange the data in the form of a
2 x 2 contingency table.
We arrange the frequencies in such a way that A > B and choose the
characteristic of interest so that a/A > b/B, where a and b are the numbers
of observations with the characteristic of interest in samples 1 and 2,
respectively. Some theorists believe that Fisher's exact test is appropriate
only when both marginal totals of the table are fixed by the experiment.
5. Hypotheses
Two-sided
H0: The proportion with the characteristic of interest is the same in
both populations; i.e., p1 = p2
HA: The proportion with the characteristic of interest is not the same
in both populations; p1 ≠ p2
One-sided
H0: The proportion with the characteristic of interest in population 1
is less than or the same as the proportion in population 2; p1 ≤ p2.
HA: The proportion with the characteristic of interest is greater in
population 1 than in population 2; p1 > p2.
6. Test Statistic
The test statistic is b, the number in sample 2 with the characteristic of
interest.
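To make the role of b concrete, here is a sketch of the probability model behind the test, assuming both marginal totals are treated as fixed. The symbols X (the random count in sample 2 with the characteristic) and m (the total number with the characteristic in both samples) are introduced here for illustration and do not appear in the slides.

```latex
\[
P(X = k) \;=\; \frac{\binom{B}{k}\,\binom{A}{m-k}}{\binom{A+B}{m}},
\qquad m = a + b, \quad \max(0,\, m - A) \le k \le \min(B,\, m)
\]
\[
p_{\text{one-sided}} \;=\; \sum_{k \,\le\, b} P(X = k)
\]
```

Because the data are arranged so that a/A > b/B, small values of b are the ones that count as more extreme, which is why the sum runs over the lower tail.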
Decision Rule
1. Two-sided test
If the observed value of b is less than or equal to the integer in a given
column of Appendix Table J, reject H0 at a level of significance equal to
twice the significance level shown at the top of that column.
For example, suppose A = 8, B = 7, a = 7, and the observed value of b is 1.
We can reject the null hypothesis at the 2(.05) = .10, the 2(.025) = .05, and
the 2(.01) = .02 levels of significance, but not at the 2(.005) = .01 level.
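The example can be cross-checked by computing the lower-tail hypergeometric probability directly instead of looking it up. The sketch below assumes both margins are fixed and uses a hypothetical helper, `lower_tail_probability`. It illustrates the arithmetic only; the slides themselves use the table lookup.

```python
from math import comb


def lower_tail_probability(A, B, a, b):
    """P(X <= b), where X is the count with the characteristic in sample 2."""
    m = a + b  # total number with the characteristic in both samples
    k_min = max(0, m - A)  # smallest count sample 2 can hold given the margins
    favourable = sum(comb(B, k) * comb(A, m - k) for k in range(k_min, b + 1))
    return favourable / comb(A + B, m)


p_one_sided = lower_tail_probability(A=8, B=7, a=7, b=1)
print(round(p_one_sided, 5))

# Compare with the one-sided column levels; the two-sided level is twice each.
for column_level in (0.05, 0.025, 0.01, 0.005):
    decision = "reject" if p_one_sided <= column_level else "do not reject"
    print(f"two-sided level {2 * column_level:.2f}: {decision} H0")
```

The tail probability is 57/6435, which lies between .005 and .01. This agrees with rejecting at the two-sided .10, .05 and .02 levels but not at the .01 level.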
7. Decision Rule
2. One-sided test
If the observed value of b is less than or equal to the integer in a given
column, reject H0 at the level of significance shown at the top of that
column.
For example, suppose A = 16, B = 8, a = 14, and the observed value of b is 3.
We can reject the null hypothesis at the .05 and .025 levels of significance,
but not at the .01 or .005 levels.
8. The purpose of a study was to evaluate the long-term efficacy of taking
indinavir/ritonavir twice a day in combination with two nucleoside reverse
transcriptase inhibitors among HIV-positive subjects who were divided into two
groups. Group 1 consisted of patients who had no history of taking protease
inhibitors. Group 2 consisted of patients who had a previous history of taking a
protease inhibitor. The table shows whether these subjects remained on the
regimen for the 120 weeks of follow-up.
We wish to know if we may conclude that patients classified as group 1 have a
lower probability than subjects in group 2 of remaining on the regimen for 120
weeks.
9. Solution:
1. Data. The data are rearranged to conform to the layout of the given table.
Remaining on the regimen is the characteristic of interest.
2. Assumptions. We presume that the assumptions for application of the Fisher
exact test are met.
We arrange the frequencies in such a way that A > B and
choose the characteristic of interest so that a/A > b/B.
10. Solution:
3. Hypotheses.
H0: The proportion of subjects remaining 120 weeks on the regimen in a
population of patients classified as group 2 is the same as or less than the
proportion of subjects remaining on the regimen 120 weeks in a population
classified as group 1.
HA: Group 2 patients have a higher rate than group 1 patients of remaining
on the regimen for 120 weeks.
4. Test statistic. The test statistic is the observed value of b, as shown in the table.
5. Distribution of test statistic. We determine the significance of b by
consulting Appendix Table J.
6. Decision rule. Suppose we let α=.05. The decision rule, then, is to reject H0 if
the observed value of b is equal to or less than 1, the value of b in Table J for
A = 12, B = 9, a = 8, and α=.05.
11. Solution:
7. Calculation of test statistic. The observed value of b, as shown in the table, is 2.
8. Statistical decision. Since 2 > 1, we fail to reject H0.
9. Conclusion. Since we fail to reject H0, we conclude that the null hypothesis
may be true. That is, it may be true that the rate of remaining on the
regimen for 120 weeks is the same or less for the PI-experienced group (group 2)
compared to the PI-naive group (group 1).
10. p value. We see in Table J that when A = 12, B = 9, a = 8, the value of b = 2
has an exact probability of occurring by chance alone, when H0 is true,
greater than .05; that is, p > .05.
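The statement p > .05 can be made concrete by summing the hypergeometric probabilities for b = 0, 1 and 2. This is a minimal sketch that assumes both margins are fixed. The counts A, B, a and b are the ones quoted in the solution, while the function name `hypergeom_pmf` is a hypothetical helper.

```python
from math import comb

A, B = 12, 9   # sample sizes as arranged in the solution (A > B)
a, b = 8, 2    # numbers remaining on the regimen in each sample


def hypergeom_pmf(k, A, B, m):
    """P(X = k): k of the m subjects with the characteristic fall in sample 2."""
    return comb(B, k) * comb(A, m - k) / comb(A + B, m)


m = a + b
# One-sided p value: probability of the observed b or a smaller (more extreme) value.
p_value = sum(hypergeom_pmf(k, A, B, m) for k in range(max(0, m - A), b + 1))
print(f"one-sided exact p = {p_value:.4f}")
```

The sum equals 19866/352716, roughly 0.056. This is slightly above .05 and consistent with the decision not to reject H0.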
12. Test for paired nominal data
When categorical data are paired, the McNemar test is the appropriate test.
When data are paired and the outcome of interest is a proportion, the
McNemar test is used to evaluate hypotheses about the data.
Developed by Quinn McNemar in 1947.
Sometimes called the McNemar chi-square test because the test statistic
has a chi-square distribution.
                                    1st subject of pair (Variable 1)
2nd subject of pair (Variable 2)      0          1        Total
0                                     e          f        e + f
1                                     g          h        g + h
Total                               e + g      f + h        n
13. Example:
Over a period of 6 months the registrar selected every patient with this
disorder and paired them off as far as possible by reference to age, sex,
and frequency of ulceration. Finally she had 108 patients in 54 pairs. To
one member of each pair, chosen by the toss of a coin, she gave treatment
A, which she and her colleagues in the unit had hitherto regarded as the
best; to the other member she gave the new treatment, B. Both forms of
treatment are local applications, and they cannot be made to look alike.
Consequently, to avoid bias in the assessment of the results, a colleague
recorded the results of treatment without knowing which patient in each
pair had which treatment.
14.
Member of pair             Member of pair             Pairs of
receiving treatment A      receiving treatment B      patients
Responded                  Responded                  16
Responded                  Did not respond            23
Did not respond            Responded                  10
Did not respond            Did not respond             5
Total                                                 54

                                      1st subject of pair (treatment A)
2nd subject of pair (treatment B)     Responded       Did not respond
Responded                             16              10
Did not respond                       23               5
Total                                 39 (e + g)      15 (f + h)
15.
                        1st subject of pair
2nd subject of pair     Responded     Did not respond
Responded               16 (e)        10 (f)
Did not respond         23 (g)         5 (h)
X² = (f − g)² / (f + g), with 1 d.f.
X² = (10 − 23)² / (10 + 23) = 5.12
Or, with a continuity correction:
X² = (|f − g| − 1)² / (f + g), with 1 d.f.
X² = (|10 − 23| − 1)² / (10 + 23) = 4.36
For both X² values, 0.02 < P < 0.05.
Conclusion:
Treatment A gave significantly
better results than treatment B.
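The two statistics and the quoted range for P can be reproduced with a few lines of Python. This sketch uses only the standard library and relies on the fact that a chi-square variable with 1 d.f. is the square of a standard normal variable. The function name `mcnemar` is an illustrative choice, not a reference to any particular library.

```python
from math import erfc, sqrt


def mcnemar(f, g, continuity=False):
    """McNemar chi-square from the two discordant counts, with its 1-d.f. p value."""
    diff = abs(f - g) - (1 if continuity else 0)
    statistic = diff ** 2 / (f + g)
    # For 1 d.f., P(chi-square > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    p_value = erfc(sqrt(statistic / 2))
    return statistic, p_value


for corrected in (False, True):
    stat, p = mcnemar(10, 23, continuity=corrected)
    print(f"continuity correction={corrected}: X2 = {stat:.2f}, p = {p:.3f}")
```

Only the discordant pairs (f = 10 and g = 23) enter the calculation. The 16 + 5 concordant pairs carry no information about which treatment is better.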
16. The sampling distribution of the McNemar statistic is a chi-square distribution with 1 degree of freedom.
For a test with alpha = 0.05, the critical value for the McNemar statistic = 3.84.
The null hypothesis is not rejected if the McNemar statistic < 3.84.
The null hypothesis is rejected if the McNemar statistic > 3.84.
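The critical value 3.84 need not be taken on trust. For 1 degree of freedom it is the square of the two-sided standard normal critical value, as the sketch below shows using Python's standard `statistics.NormalDist`. The variable names are illustrative.

```python
from statistics import NormalDist

alpha = 0.05
# A chi-square variable with 1 d.f. is the square of a standard normal variable,
# so its upper-alpha critical value is the squared two-sided normal critical value.
z = NormalDist().inv_cdf(1 - alpha / 2)
critical_value = z ** 2
print(round(critical_value, 2))  # 3.84

# Uncorrected and continuity-corrected statistics from the worked example.
for statistic in (5.12, 4.36):
    decision = "reject H0" if statistic > critical_value else "do not reject H0"
    print(statistic, decision)
```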
17. For an (r x 1) or (1 x c) table: df = k − 1, where k = number of categories
When several restrictions are imposed: df = k − r, where r = number of restrictions
For an (r x c) table: df = (r − 1)(c − 1), where r = number of rows and c = number of columns
The simplest and most commonly applied table is the 2 x 2 table, which has two rows
and two columns.
19. 4. In a study of the relationship between gender and smoking habit, 100 couples were
interviewed. After controlling for age and socioeconomic status, it was found that:
Both male and female smokers: 24 couples
Neither male nor female smokers: 40 couples
Male but not female smokers: 26 couples
Female but not male smokers: 10 couples
Construct a cross-table from the above data.
Analyze these data by using McNemar’s test.
Find 95% confidence limits of the difference in proportions.
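One possible way to approach this exercise is sketched below. It builds the cross-table, applies the McNemar formulas used earlier in these slides, and computes a large-sample (Wald) 95% confidence interval for the difference between the paired proportions. The choice of the Wald interval and all variable names are assumptions made for illustration; the slides do not say which interval method is intended.

```python
from math import sqrt

n = 100
both, neither = 24, 40            # concordant couples
male_only, female_only = 26, 10   # discordant couples

# Cross-table: rows = male partner (smoker, non-smoker), columns = female partner.
cross_table = [[both, male_only], [female_only, neither]]

discordant = male_only + female_only
chi2 = (male_only - female_only) ** 2 / discordant
chi2_corrected = (abs(male_only - female_only) - 1) ** 2 / discordant

# Difference in paired proportions (male minus female) and its Wald standard error.
diff = (male_only - female_only) / n
se = sqrt(discordant - (male_only - female_only) ** 2 / n) / n
lower, upper = diff - 1.96 * se, diff + 1.96 * se

print(cross_table)
print(f"X2 = {chi2:.2f}, corrected X2 = {chi2_corrected:.2f}")
print(f"difference = {diff:.2f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```

Both statistics exceed the 3.84 critical value quoted earlier, and the interval for the difference excludes zero, so under these assumptions the two analyses point the same way.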