# Chi square goodness of fit test

Components of running a chi-square test

### Chi square goodness of fit test

1. Focus Fox: A statistically minded toll collector wonders if drivers are equally likely to choose each of the three lanes at his toll booth. He selects a random sample from all the cars that approach the booth when all three lanes are empty, so that the driver’s choice isn’t influenced by the cars already at the booth. Which of the following is the appropriate alternative hypothesis for addressing this question?
   - a. The observed number of cars choosing each lane is equal.
   - b. The observed number of cars choosing each lane is different from the expected number of cars.
   - c. The proportions of cars choosing each of the three lanes are equal.
   - d. The proportion of cars choosing at least one of the lanes is different from the proportion choosing the other two lanes.
   - e. The proportions of cars choosing each of the three lanes are all different.

   | Lane | Left | Center | Right |
   |---|---|---|---|
   | Number of drivers | 137 | 159 | 169 |
2. Chi-Square Test: We still have 3 conditions we must meet. Replacement condition: the Large Sample Size condition (all expected counts must be at least 5) takes the place of the Normal condition for z and t procedures. Random and Independent must still be met!
3. Chi-Square Test: To determine whether a categorical variable has a claimed distribution, perform a chi-square goodness-of-fit test.
   - $H_0$: the specified distribution of the categorical variable is correct.
   - $H_a$: the specified distribution of the categorical variable is not correct.

   Or written symbolically, using $p_i$ for the proportion in each category:
   - $H_0$: $p_1$ = ____, $p_2$ = ____, $p_3$ = ____, ...
   - $H_a$: at least one of the $p_i$'s is incorrect.

   Find the expected counts (sample size times the hypothesized proportion for each category) and calculate the chi-square statistic:

   $$\chi^2 = \sum \frac{(\text{observed} - \text{expected})^2}{\text{expected}}$$

   The P-value is the area to the right of $\chi^2$ under the density curve of the chi-square distribution with $k - 1$ degrees of freedom, where $k$ is the number of categories for the variable.
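
The recipe on this slide can be sketched in plain Python. This is only an illustration, not part of the original slides: the function names `chi_square_gof` and `chi2_sf` are made up for the example, and the data are the lane counts from the toll booth question on slide 1, tested against an assumed null hypothesis of equal proportions. The P-value helper uses the standard recurrence for the upper incomplete gamma function so that only the `math` module is needed.

```python
import math

def chi2_sf(x, df):
    """Area to the right of x under the chi-square density with df degrees of freedom."""
    z = x / 2.0
    # Closed-form starting values: df = 2 gives exp(-z), df = 1 gives erfc(sqrt(z)).
    if df % 2 == 0:
        a, q = 1.0, math.exp(-z)
    else:
        a, q = 0.5, math.erfc(math.sqrt(z))
    # Step up using Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / gamma(a + 1).
    while a < df / 2.0:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q

def chi_square_gof(observed, proportions):
    """Return expected counts, chi-square statistic, df, and P-value."""
    n = sum(observed)
    expected = [n * p for p in proportions]           # expected count = n * p_i
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1                            # k - 1 degrees of freedom
    return expected, stat, df, chi2_sf(stat, df)

# Lane counts from slide 1 (left, center, right); equal proportions is the assumed H0.
lanes = [137, 159, 169]
expected, stat, df, p_value = chi_square_gof(lanes, [1 / 3, 1 / 3, 1 / 3])
print(f"chi-square = {stat:.3f}, df = {df}, P-value = {p_value:.4f}")
```

For the lane counts this gives a statistic of about 3.46 on 2 degrees of freedom and a P-value of about 0.18.
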
4. Chi-Square Test: 3 Conditions:
   - Random: the data come from a random sample or a randomized experiment.
   - Large Sample Size: all expected counts are at least 5.
   - Independent: individual observations are independent. When sampling without replacement, the population is at least 10 times as large as the sample (10% condition).

   Cautions:
   - Make sure you are comparing counts, not proportions.
   - When checking Large Sample Size, make sure to use expected counts.
5. Chi-Square Test: Are births evenly distributed across the days of the week? The one-way table below shows the distribution of births across the days of the week in a random sample of 140 births from local records in a large city. Do these data give significant evidence that local births are not equally likely on all days of the week? SPDC (State, Plan, Do, Conclude): expected counts in Plan, graph in Do.

   | Day | Sun. | Mon. | Tues. | Wed. | Thurs. | Fri. | Sat. |
   |---|---|---|---|---|---|---|---|
   | Births | 13 | 23 | 24 | 20 | 27 | 18 | 15 |
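
As a rough sketch of the arithmetic behind the Plan and Do steps for the births data, the snippet below computes the expected count per day under the null hypothesis of equally likely days, checks the Large Sample Size condition, and adds up the chi-square statistic. The variable names are illustrative and do not come from the slides.

```python
# Observed births per day, copied from the one-way table on this slide.
births = {"Sun.": 13, "Mon.": 23, "Tues.": 24, "Wed.": 20,
          "Thurs.": 27, "Fri.": 18, "Sat.": 15}

n = sum(births.values())               # total sample size: 140 births
expected = n / len(births)             # equal split under H0: 20 births per day

# Large Sample Size condition: every expected count must be at least 5.
large_sample_ok = expected >= 5

# Chi-square statistic: sum of (observed - expected)^2 / expected over the 7 days.
chi_sq = sum((obs - expected) ** 2 / expected for obs in births.values())
df = len(births) - 1                   # 7 categories, so 6 degrees of freedom

print(f"expected per day = {expected}, condition met = {large_sample_ok}")
print(f"chi-square = {chi_sq:.2f}, df = {df}")
```

The P-value would then be the area to the right of this statistic under the chi-square curve with 6 degrees of freedom, read from a table or from technology.
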
6. Chi-Square Test: Failing to reject does NOT mean $H_0$ is correct. We can use technology to complete the “Do”:
   - Enter observed counts in L1
   - Enter expected counts in L2
   - STAT over to TESTS
   - Select χ² GOF-Test

   Calculate gives the test statistic, df, and P-value. Draw will provide the appropriate distribution with shading.

   | Color | Observed | Expected |
   |---|---|---|
   | Blue | 9 | 14.4 |
   | Orange | 8 | 12 |
   | Green | 12 | 9.6 |
   | Yellow | 15 | 8.4 |
   | Red | 10 | 7.8 |
   | Brown | 6 | 7.8 |
7. Chi-Square Test: Biologists wish to cross pairs of tobacco plants having genetic makeup Gg, indicating that each plant has one dominant gene G and one recessive gene g for color. Each offspring plant will receive one gene for color from each parent. The Punnett Square shows the possible combinations of genes received by the offspring. The Punnett Square suggests that the expected ratio of green GG to yellow-green Gg to albino gg tobacco plants should be 1:2:1. The biologists predict that 25% of the offspring will be green, 50% will be yellow-green, and 25% will be albino.

   | Parent 1 \ Parent 2 | G | g |
   |---|---|---|
   | G | GG | Gg |
   | g | Gg | gg |
8. Chi-Square Test: To test their hypothesis about the distribution of offspring, the biologists mate 84 randomly selected pairs of yellow-green parent plants. Of 84 offspring, 23 plants were green, 50 were yellow-green, and 11 were albino. Do these data differ significantly from what the biologists have predicted? Carry out an appropriate test at the α = 0.05 level to answer. SPDC: (expected counts in Plan, graph in Do)
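
One way to check the arithmetic for the tobacco example is the short sketch below. It assumes the 1:2:1 prediction from the Punnett Square as the null distribution, and it relies on a shortcut that holds only when there are 2 degrees of freedom: the area to the right of χ² is then exactly exp(−χ²/2). The variable names are illustrative.

```python
import math

observed = {"Green": 23, "Yellow-green": 50, "Albino": 11}
hypothesized = {"Green": 0.25, "Yellow-green": 0.50, "Albino": 0.25}  # 1:2:1 ratio
alpha = 0.05

n = sum(observed.values())                                   # 84 offspring
expected = {c: n * p for c, p in hypothesized.items()}       # 21, 42, 21

chi_sq = sum((observed[c] - expected[c]) ** 2 / expected[c] for c in observed)
df = len(observed) - 1                                       # 3 categories, df = 2

# Shortcut valid for df = 2 only: right-tail area is exp(-chi_sq / 2).
p_value = math.exp(-chi_sq / 2)
reject_h0 = p_value < alpha

print(f"chi-square = {chi_sq:.3f}, df = {df}, P-value = {p_value:.4f}")
print("Reject H0" if reject_h0 else "Fail to reject H0")
```

With these counts the statistic is about 6.48 and the P-value is about 0.039, which is below α = 0.05, so the data give significant evidence that the offspring distribution differs from the predicted one.
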
9. Chi-Square Test: If the sample data lead to a statistically significant result, we can conclude that our variable has a distribution different from the specified one. We need a Follow-Up Analysis (the “why”). Steps:
   - Examine which categories of the variable show large deviations between the observed and expected counts.
   - Look at the individual terms, (observed – expected)²/expected, that are summed to give χ².
   - These components show which terms contribute most to the chi-square statistic.
10. Chi-Square Test: Ex. Tobacco Plant Offspring. Biggest contributor? More or less than expected?

    | Offspring Color | Observed | Expected |
    |---|---|---|
    | Green | 23 | 21 |
    | Yellow-green | 50 | 42 |
    | Albino | 11 | 21 |

    Follow-Up Analysis: The largest contributor to the chi-square statistic is Albino offspring. There were 10 fewer Albino plants than we expected.
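
The follow-up analysis for the tobacco data can be reproduced with a few lines of Python. This is a sketch with illustrative names: it lists each category's component of the chi-square statistic and the signed deviation from the expected count, then picks out the largest component.

```python
# (color, observed, expected) rows from the tobacco offspring table.
table = [("Green", 23, 21), ("Yellow-green", 50, 42), ("Albino", 11, 21)]

# Each component is one term of the chi-square sum.
components = {color: (obs - exp) ** 2 / exp for color, obs, exp in table}
# Signed deviation: negative means fewer plants than expected.
deviations = {color: obs - exp for color, obs, exp in table}

largest = max(components, key=components.get)

for color in components:
    print(f"{color}: component = {components[color]:.2f}, "
          f"observed - expected = {deviations[color]}")
print(f"Largest contributor: {largest}")
```

The Albino term, 100/21 or about 4.76, accounts for most of the total statistic, and its deviation of −10 matches the slide's conclusion.
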