The Chi Square Test
Max Chipulu
Using the chi-square test
1.
Test of Significance: The Chi-square Statistic
The Chi-square statistic: © 2009 Max Chipulu, University of Southampton

Learning Objectives
• To introduce the Chi-square statistic as a test of statistical significance
• To apply and interpret the calculated Chi-square statistic for a practical problem, using Chi-square tables and ‘degrees of freedom’.
2.
• “When it comes to number of babies, all months are equal but some months are more equal than others.”

The Research Question = Research Hypothesis
• It is often thought that there are some ‘boom’ months of the year when the number of babies born is higher than in others…
• Can we, using data of babies who were born to hold a master’s degree, show this to be the case or not?
• The research hypothesis is that there is a difference in the number of births from month to month.
3.
The Null Hypothesis
• If there is nothing to the myth of boom months, then the distribution of numbers of births would be uniform throughout all the months of the year
• Therefore the null hypothesis: there is no difference in the number of births from month to month

Table to complete (cells are blank in the slides):
Range of Numbers   Actual births (observed frequencies)   What 'uniform' births would look like (expected frequencies)   Differences between expected and observed
Jan
Feb
Mar
Apr
May
Jun
Jul
Aug
Sep
Oct
Nov
Dec
Total
4.
How well do observed frequencies fit the uniform model?
• There are differences between the expected and observed frequencies. But these differences could be just because of the randomness of the data
• Intuitively, we know that small differences between the observed and predicted frequencies represent a ‘good’ fit
• So, overall, if we sum the differences, then a small sum of differences represents a good model
• But positive and negative differences may cancel out
• This is not so good
• So we square the differences between frequencies
• Then we add the squared differences up
• A small sum of squares is good
• To put the result into context, we divide each squared difference by the respective expected frequency
• The result is a measure of the goodness of our uniform random model
5.
$$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$$
This is the Chi-Square Statistic, where f_o is the observed frequency and f_e is the expected frequency in each category.

Table to complete (cells are blank in the slides):
Range of Numbers   Expected frequencies   Differences between expected and observed   Difference squared divided by expected (contribution to the chi-square)
Jan
Feb
Mar
Apr
May
Jun
Jul
Aug
Sep
Oct
Nov
Dec
Total
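As a minimal sketch of the statistic above, the following Python function sums the squared differences divided by the expected frequencies. The function name `chi_square_statistic` and the monthly counts are illustrative assumptions: the births table in the slides is blank, so the numbers below are made up purely to show the arithmetic.

```python
def chi_square_statistic(observed, expected):
    """Sum of (f_o - f_e)^2 / f_e over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))


# Hypothetical monthly birth counts, Jan..Dec (made up for illustration only).
observed = [28, 22, 25, 30, 27, 24, 26, 31, 29, 23, 25, 22]
# Uniform model: every month is expected to get the same share of the total.
expected = [sum(observed) / len(observed)] * len(observed)

chi2 = chi_square_statistic(observed, expected)
print(f"chi-square = {chi2:.3f}")
```

Dividing by the expected frequency is what puts each squared difference into context: a gap of 5 births matters more when 10 were expected than when 100 were.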
6.
Key Properties of the Chi-Square
• The Chi-Square is a non-parametric test: the value of the Chi-square statistic is not affected by the underlying statistical model that generates the data.
• Under the null hypothesis, the distribution of the Chi-square statistic depends only on the number of degrees of freedom: the higher the number of degrees of freedom, the higher the value of the chi-square should be.
• The number of degrees of freedom is the number of different categories that contribute to the chi-square sum minus the number of pre-determined (or intermediate) parameters

‘Degrees of freedom’
• In this example, degrees of freedom (d.f.) = k - 1, where k is the number of categories (months) that contribute to the chi-square. So d.f. = 12 - 1 = 11
• Suppose that instead of using months we use seasons as our categories. Then we would only have four categories that would contribute to the chi-square. As such, we would expect a SMALLER chi-square because there is a smaller number of contributions to the chi-square.
• But why subtract 1? Well, the total number of births for all months is a predetermined value: it depends only on the sample size. If we know the frequencies for 11 of the 12 months, and we know the total number of births, then we can work out from these two numbers the number of births in the 12th month. So, although we have in total 12 months (12 categories), there are in fact only 11 ways (degrees of freedom) that the chi-square value can vary.
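The counting argument can be sketched in a few lines of Python. Both helper names are hypothetical and exist only for this illustration; the second one shows why the last category is not free once the total is fixed.

```python
def degrees_of_freedom(categories, predetermined=1):
    """Categories contributing to the chi-square minus predetermined parameters."""
    return categories - predetermined


def missing_count(known_counts, total):
    """The last category is fixed once the others and the total are known."""
    return total - sum(known_counts)


print(degrees_of_freedom(12))  # months as categories
print(degrees_of_freedom(4))   # seasons as categories
```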
7.
Is the chi-square significant?
Percentage points of the Chi-Square distribution (υ = degrees of freedom; column headings are upper-tail percentages)

υ      50      40      30      25      20      15      10      5       2.5
5      4.351   5.132   6.064   6.626   7.289   8.115   9.236   11.07   12.83
6      5.348   6.211   7.231   7.841   8.558   9.446   10.64   12.59   14.45
7      6.346   7.283   8.383   9.037   9.803   10.75   12.02   14.07   16.01
8      7.344   8.351   9.524   10.22   11.03   12.03   13.36   15.51   17.53
9      8.343   9.414   10.66   11.39   12.24   13.29   14.68   16.92   19.02
10     9.342   10.47   11.78   12.55   13.44   14.53   15.99   18.31   20.48
11     10.34   11.53   12.90   13.70   14.63   15.77   17.28   19.68   21.92
12     11.34   12.58   14.01   14.85   15.81   16.99   18.55   21.03   23.34

Example: Goals in football
• Hypothesis: the total number of goals scored in a game of football in Europe follows a Poisson Distribution with mean 2.73
8.
In Europe we observed this distribution of football goals
[Figure: bar chart of the number of matches (vertical axis) against goals scored, 0 to 8 (horizontal axis)]
Now that we know about some distributions, it might look vaguely familiar

A Poisson Model of Goals in Football
We can think about football like this:
Let each minute that we watch a game be an experiment
The experiment is: is there a goal or not?
It is a success if there is a goal; it is a failure if there is not.
Since only 3 goals are expected after 90 mins, the probability of ‘success’ is very small.
In each minute we conduct a Bernoulli trial. There are 90 trials.
It seems reasonable to model goal scoring in Football as a Poisson Process
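To see why 90 low-probability Bernoulli trials lead to a Poisson model, the sketch below compares the binomial probabilities for 90 one-minute trials with the Poisson probabilities for the same mean. It assumes a constant chance of a goal in every minute, p = 2.73 / 90, and at most one goal per minute; both are simplifications made for the illustration.

```python
from math import comb, exp, factorial

MEAN_GOALS = 2.73          # mean goals per match, from the lecture
MINUTES = 90               # one Bernoulli trial per minute
p = MEAN_GOALS / MINUTES   # assumed constant chance of a goal in any minute


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k, mu):
    """Probability of exactly k events when the mean count is mu."""
    return exp(-mu) * mu**k / factorial(k)


for goals in range(9):
    b = binomial_pmf(goals, MINUTES, p)
    q = poisson_pmf(goals, MEAN_GOALS)
    print(f"{goals} goals: binomial {b:.4f}   Poisson {q:.4f}")

# Largest disagreement between the two models over 0..8 goals.
largest_gap = max(
    abs(binomial_pmf(k, MINUTES, p) - poisson_pmf(k, MEAN_GOALS)) for k in range(9)
)
```

The two columns agree closely, which is the usual justification for treating many rare-event trials as a Poisson process.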
9.
A Poisson Model of Goals in Football
Alternatively, we can think about football like this:
A game of football is played over 90 minutes. This is a constrained time interval
The number of goals scored in each game is a discrete random variable.
Suppose we divide the match into very small intervals, e.g. minutes, then within each small interval, it is reasonable to assume that
1. There will be at most only one goal scored;
2. The probability of observing a goal is proportional to the length of that interval of time, e.g. the probability of observing a goal in 1 minute is twice that of a goal in 30 seconds
The above are the key characteristics of a Poisson process

Poisson Probabilities of Football Goals
From our data, the expected number of goals per game is 2.73
And so,
P(zero goals) = $e^{-\mu} = e^{-2.73}$ = 0.0652
P(1 goal) = 2.73 * 0.0652 = 0.1780
P(2 goals) = 2.73/2 * 0.1780 = 0.2430
P(3 goals) = 2.73/3 * 0.2430 = 0.2212
P(4 goals) = 2.73/4 * 0.2212 = 0.1509
P(5 goals) = 2.73/5 * 0.1509 = 0.0824
P(6 goals) = 2.73/6 * 0.0824 = 0.0375
P(7 goals) = 2.73/7 * 0.0375 = 0.0146
P(8 goals) = 2.73/8 * 0.0146 = 0.0050
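The chain of probabilities above follows a simple recurrence: each probability is the previous one multiplied by the mean and divided by the number of goals. A minimal sketch, with `poisson_probabilities` as a hypothetical helper name:

```python
from math import exp


def poisson_probabilities(mu, max_goals):
    """P(0), P(1), ..., P(max_goals) via P(k) = P(k-1) * mu / k."""
    probs = [exp(-mu)]
    for k in range(1, max_goals + 1):
        probs.append(probs[-1] * mu / k)
    return probs


probs = poisson_probabilities(2.73, 8)
for goals, prob in enumerate(probs):
    print(f"P({goals} goals) = {prob:.4f}")
```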
10.
Expected Frequencies of Matches According to Poisson
According to the Poisson Model, the probability that a football match will end with zero goals is 0.0652.
If we watch 66 matches in total, how many of them should we expect to end with zero goals?
Number of games with total zero goals = 0.0652 * 66 = 4.3
We can thus work out all the expected frequencies of matches with i goals by multiplying the Poisson probabilities by the total number of matches seen

Table 11.1 Comparing Goals Predicted with Observed Goals

Goals at end of match   Poisson probability of seeing this number of goals   Number of games expected to end with this
0                       0.0652                                               4.3
1                       0.1780                                               11.8
2                       0.2430                                               16.0
3                       0.2212                                               14.6
4                       0.1509                                               10.0
5                       0.0824                                               5.4
6                       0.0375                                               2.5
7                       0.0146                                               1.0
8 or more               0.0070                                               0.5
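A short sketch of how Table 11.1 can be reproduced: compute the Poisson probabilities for 0 to 7 goals, put whatever probability is left into "8 or more", and multiply by the 66 matches. The variable names are illustrative.

```python
from math import exp

MU = 2.73       # mean goals per match (from the lecture)
MATCHES = 66    # matches observed (from the lecture)

# Poisson probabilities for 0..7 goals, then lump the tail into "8 or more".
probs = [exp(-MU)]
for k in range(1, 8):
    probs.append(probs[-1] * MU / k)
probs.append(1.0 - sum(probs))  # P(8 or more goals)

expected = [prob * MATCHES for prob in probs]
labels = [str(k) for k in range(8)] + ["8 or more"]
for label, prob, freq in zip(labels, probs, expected):
    print(f"{label:>9}  {prob:.4f}  {freq:5.1f}")
```

Using the remaining probability for the last row guarantees that the expected frequencies add up to the number of matches.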
11.
Comparison of Expected and Observed Frequencies of Matches with i goals
We also have the observed frequencies
So if scoring goals in football is really a Poisson process, then there should not be much difference between the expected and observed frequencies
Any difference between predicted and actual should be small and due to random variation only

Football: Observed Vs Poisson Frequencies
[Figure: chart of predicted and observed frequency (vertical axis) against goals, 0 to 8 (horizontal axis)]
12.
Calculate the contribution of each fi to the χ2; find the sum: χ2

Goals       Expected, fe   Observed, fi   Contributions to the χ2
0 or 1      16.1           15             0.0694
2           16.0           20             0.9774
3           14.6           11             0.8863
4           10.0           9              0.0930
5 or More   9.2            11             0.3483
Total       66.0           66             2.3744

Degrees of Freedom = k - number of predetermined parameters = 5 - 2 = 3

To test significance answer this question
• Is the calculated chi-square value so high that it is unusual to observe such a value or higher values with 3 degrees of freedom?
• Alternatively: Is the probability of observing a chi-square value of 2.37 or more with three degrees of freedom small (say 5% or less)?
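The calculation in the table can be sketched as follows. It uses the rounded expected frequencies exactly as printed, so the total comes out near 2.4 rather than exactly 2.3744, which was presumably computed from unrounded values. Reading the two predetermined parameters as the total number of matches and the mean of 2.73 estimated from the data is an assumption; the slides give only the count.

```python
# Grouped categories with expected (f_e) and observed (f_i) counts from the table.
categories = ["0 or 1", "2", "3", "4", "5 or more"]
expected = [16.1, 16.0, 14.6, 10.0, 9.2]
observed = [15, 20, 11, 9, 11]

# Each category contributes (f_i - f_e)^2 / f_e to the chi-square.
contributions = [(fi - fe) ** 2 / fe for fi, fe in zip(observed, expected)]
chi2 = sum(contributions)

# Assumed reading of the two predetermined parameters: the total number of
# matches and the mean number of goals estimated from the same data.
degrees_of_freedom = len(categories) - 2

for category, contribution in zip(categories, contributions):
    print(f"{category:>9}: {contribution:.4f}")
print(f"chi-square = {chi2:.2f}, d.f. = {degrees_of_freedom}")
```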
13.
To find the highest χ2 value observed 95% of the time, under 3 df, if H0 is true
Percentage Points of the Chi-Square Distribution

υ    50     20     15     10     5      2.5    1
1    0.45   1.64   2.07   2.71   3.84   5.02   6.63
2    1.39   3.22   3.79   4.61   5.99   7.38   9.21
3    2.37   4.64   5.32   6.25   7.81   9.35   11.34
4    3.36   5.99   6.74   7.78   9.49   11.14  13.28
5    4.35   7.29   8.12   9.24   11.07  12.83  15.09

For 3 d.f. the 5% point is 7.81. The calculated value of 2.37 is well below this (it sits at the 50% point), so the hypothesis that goals follow a Poisson distribution is not rejected.
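The table lookup can be cross-checked numerically. The sketch below uses only the Python standard library; `chi2_upper_tail` is a helper written for this illustration, not something taken from the lecture or from SPSS. It relies on the closed-form recurrence for the upper-tail probability that holds for whole-number degrees of freedom.

```python
from math import erfc, exp, gamma, sqrt


def chi2_upper_tail(x, df):
    """P(chi-square with `df` degrees of freedom >= x), for integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    # Base cases: 1 d.f. (odd) or 2 d.f. (even).
    if df % 2 == 1:
        tail, k = erfc(sqrt(half)), 1
    else:
        tail, k = exp(-half), 2
    # Step up two degrees of freedom at a time:
    # Q(x; k + 2) = Q(x; k) + (x/2)^(k/2) * e^(-x/2) / Gamma(k/2 + 1)
    while k < df:
        tail += half ** (k / 2) * exp(-half) / gamma(k / 2 + 1)
        k += 2
    return tail


print(f"P(chi-square >= 2.37 | 3 d.f.) = {chi2_upper_tail(2.37, 3):.3f}")
print(f"P(chi-square >= 7.81 | 3 d.f.) = {chi2_upper_tail(7.81, 3):.3f}")
```

A probability well above 5% for the football value means that such a chi-square is not unusual under the Poisson model.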
14.
Incidence of Disease Among Adults
Example: Contingency Table Analysis
A county council is worried about the number of adults who suffer from a particular disease and has collected the following information

AGE GROUP   SICK    HEALTHY   TOTAL
34-39       1327    15702     17029
40-44       2072    17454     19524
45-49       2456    14237     16693
50-54       3611    11519     15130
55-59       4688    9174      13862
60-64       5490    7526      13016

Can it be said that all age groups are equally likely to be affected and that the differences may be due to random variation?
Or, are some age groups more susceptible than others to acquiring the disease?
15.
Incidence of Disease Among Adults
What would the numbers in each of these cells be in a perfect world, i.e. in a world where advancing age did not mean more disease?

AGE GROUP   SICK    HEALTHY   TOTAL
34-39                         17029
40-44                         19524
45-49                         16693
50-54                         15130
55-59                         13862
60-64                         13016

Step 1: Hypothesize
Assume that age is NOT related to the incidence of the disease, i.e. the maintained hypothesis, H0, is that the incidence of the disease is independent of age.
And the alternative hypothesis, Ha, is that age IS related to the incidence of the disease
16.
Step 2: Create the statistical model such that age is independent of the incidence of the disease.
From the rules of probability, the model is as follows:
Let the event that an adult is aged 34-39 be A.
Let the event that an adult is sick be S.
Then, if incidence of the disease is independent of the age, the probability that an adult is aged 34-39 AND sick is given by the simplified multiplication rule:
$$P(A \text{ and } S) = P(A) \times P(S)$$

Step 3: Apply the simplified multiplication rule to calculate the probability of every combination of age range and sick; and age range and healthy

AGE GROUP   SICK    HEALTHY   TOTAL
34-39       0.037   0.142     0.179
40-44       0.042   0.163     0.205
45-49       0.036   0.139     0.175
50-54       0.033   0.126     0.159
55-59       0.030   0.116     0.146
60-64       0.028   0.108     0.137
Total       0.206   0.794     1.000
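Steps 2 and 3 can be sketched in Python as follows. The counts are the observed sick and healthy numbers from the council's table; row and grand totals are recomputed from those cells, so they may differ by a unit or two from the printed totals. The variable names are illustrative.

```python
# Observed counts (sick, healthy) per age group, from the council's table.
observed = {
    "34-39": (1327, 15702),
    "40-44": (2072, 17454),
    "45-49": (2456, 14237),
    "50-54": (3611, 11519),
    "55-59": (4688, 9174),
    "60-64": (5490, 7526),
}

grand_total = sum(s + h for s, h in observed.values())
p_sick = sum(s for s, _ in observed.values()) / grand_total
p_healthy = 1.0 - p_sick

# Under independence: P(age and sick) = P(age) * P(sick), likewise for healthy.
joint = {}
for age, (s, h) in observed.items():
    p_age = (s + h) / grand_total
    joint[age] = (p_age * p_sick, p_age * p_healthy)

for age, (ps, ph) in joint.items():
    print(f"{age}  {ps:.3f}  {ph:.3f}  {ps + ph:.3f}")
```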
17.
Step 4: Given these probabilities, calculate the number of adults of each age group expected to be sick, and to be healthy

AGE GROUP   SICK    HEALTHY   TOTAL
34-39       3512    13518     17029
40-44       4026    15498     19524
45-49       3443    13251     16693
50-54       3120    12010     15130
55-59       2859    11004     13862
60-64       2684    10332     13016
Total       19644   75612     95254

Step 5: Now test the hypothesis: How well does our independence model predict numbers of adults of a certain age who will be sick and who will be healthy?
Use Chi-Square to compare differences between observed and expected frequencies.
Proceed by calculating the contribution of each combination of age and sick and age and healthy to the chi-square value and summing them up.
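Step 4 amounts to multiplying each joint probability by the grand total, which simplifies to (row total × column total) / grand total. A minimal sketch, again recomputing totals from the observed cells, so results can differ from the printed table by about one unit:

```python
# Observed counts (sick, healthy) per age group, from the council's table.
observed = {
    "34-39": (1327, 15702),
    "40-44": (2072, 17454),
    "45-49": (2456, 14237),
    "50-54": (3611, 11519),
    "55-59": (4688, 9174),
    "60-64": (5490, 7526),
}

sick_total = sum(s for s, _ in observed.values())
healthy_total = sum(h for _, h in observed.values())
grand_total = sick_total + healthy_total

# Expected count = row total * column total / grand total.
expected = {}
for age, (s, h) in observed.items():
    row_total = s + h
    expected[age] = (
        row_total * sick_total / grand_total,
        row_total * healthy_total / grand_total,
    )

for age, (es, eh) in expected.items():
    print(f"{age}  {es:8.0f}  {eh:8.0f}")
```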
18.
Step 5 Cont’d: The Chi-Square value is the sum of all the contributions. It is 8531. Hmm!
What is the probability of observing a χ2 value this large or larger when the independence model holds?

Contributions to the chi-square:

AGE GROUP   SICK    HEALTHY   TOTAL
34-39       1359    353       1712
40-44       949     247       1196
45-49       283     73        356
50-54       77      20        97
55-59       1171    304       1475
60-64       2933    762       3695
Total       6771    1760      8531

Step 6: Calculate the number of degrees of freedom of the Chi-Square value
When expected values are calculated, the expected values in the last column and last row can be filled in automatically.
This is because the total number of adults, e.g. the total number of adults aged 34-39, for each column and row is fixed and known already.
Hence, the values in the last column and last row are not free, and the total number of degrees of freedom is (number of rows minus one) * (number of columns minus one) = (6-1) * (2-1) = 5
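Steps 5 and 6 can be sketched together: compute each cell's contribution, sum them, and count the degrees of freedom from the table's shape. Totals are recomputed from the observed cells, so the sum should land close to, but not necessarily exactly on, the 8531 quoted above.

```python
# Observed counts (sick, healthy) per age group, from the council's table.
observed = [
    (1327, 15702),
    (2072, 17454),
    (2456, 14237),
    (3611, 11519),
    (4688, 9174),
    (5490, 7526),
]

col_totals = [sum(row[j] for row in observed) for j in range(2)]
grand_total = sum(col_totals)

chi2 = 0.0
for row in observed:
    row_total = sum(row)
    for j, count in enumerate(row):
        # Expected count under independence, then this cell's contribution.
        expected = row_total * col_totals[j] / grand_total
        chi2 += (count - expected) ** 2 / expected

rows, cols = len(observed), len(observed[0])
degrees_of_freedom = (rows - 1) * (cols - 1)

print(f"chi-square = {chi2:.0f}, d.f. = {degrees_of_freedom}")
```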
19.
For five d.f. the tables we have created do not list values of the χ2 as high as 8531. All we can say is that the probability of values of the χ2 of 8531 or higher must be very, very small.
Alternatively, we can look at the maximum value of the χ2 that is observed 95% of the time for five d.f. This is 11.071.
Since 8531 is way beyond this, we must reject the maintained hypothesis. Incidence of the disease is not independent of the age.

d.f. \ area (%)   25       10       5        2.5      1        0.5
1                 1.3233   2.7055   3.8415   5.0239   6.6349   7.8794
2                 2.7726   4.6052   5.9915   7.3778   9.2103   10.597
3                 4.1083   6.2514   7.8147   9.3484   11.345   12.838
4                 5.3853   7.7794   9.4877   11.143   13.277   14.86
5                 6.6257   9.2364   11.071   12.833   15.086   16.75
20.
Example from SPSS Practical: What are the key factors in the value of an MBA program
THE VARIABLES
Salary: Average Salary of MBA graduates
Fees: Program Fees at the school
Age: Average age of an MBA candidate
GMAT: Average academic aptitude
Intake: Number of candidates on program
Experience: Average experience (yrs) of candidates
Country: Whether country is USA (1) or another (0)

Example from SPSS Practical: Is ‘salary’ related to ‘country’?
Chi-square test in SPSS
Open SPSS 17
Import the ‘MBA.xls’ data to SPSS as explained in the SPSS handout.
We wish to conduct a chi-square cross-tabulation (i.e. contingency table) test on ‘salary’ by ‘country’
Null Hypothesis: ‘salary’ and ‘country’ are independent
Alternative hypothesis: ‘salary’ and ‘country’ are not independent, i.e. they are related.
21.
Example from SPSS
Practical: Re-coding ‘salary’
The salary variable is not categorical, i.e. it is quantitative and not strictly suitable for cross-tabulation. So, first, recode salary into categories:
1. Go to the ‘Transform’ menu.
2. Choose ‘Visual Binning’.
3. Select ‘salary’ as the variable to bin.
4. You should see a histogram of the ‘salary’ variable.
5. In the box labelled ‘binned variable’, type a name for the new variable that the re-coding will create. I have called my new variable ‘salary_codes’.
6. Select the tab ‘make cutpoints’. There are several options for cutpoints; a good one is to divide the data by ‘equal percentiles’. For example, if you input ‘3’ in this box, the salary data will be re-coded with 3 cutpoints, giving four sections of the data: the lowest 25% of values will be re-coded as ‘1’, values in the next 25% (i.e. 25% to 50%) as ‘2’, and so on.
7. Click ‘OK’.
8. Check that a new ordinal variable representing categories of salary has been formed.

Example from SPSS
Practical: Cross-tabulation of ‘salary_codes’ by ‘country’
Now to conduct a chi-square test:
1. Go to the ‘Analyze’ menu.
2. Choose ‘Descriptive Statistics’, then ‘Crosstabs…’.
3. Input ‘salary_codes’ into the ‘row’ box and ‘country’ into the ‘column’ box.
4. Click the ‘Statistics’ button and check the ‘Chi-square’ box.
5. Click ‘OK’.
6. What do the results suggest?
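For readers without SPSS, the binning and cross-tabulation steps above can be approximated in Python. This is a sketch under stated assumptions, not the SPSS procedure: the `salary` and `country` values are made up for illustration (they are not the ‘MBA.xls’ data), and `statistics.quantiles` may place cutpoints slightly differently from SPSS's percentile rule.

```python
import statistics
from bisect import bisect_left
from collections import Counter

# Made-up illustrative values, NOT the MBA.xls data.
salary = [10, 20, 30, 40, 50, 60, 70, 80]
country = [0, 0, 0, 1, 0, 1, 1, 1]  # 1 = USA, 0 = another country

# Three cutpoints at equal percentiles give four bins of about 25% each.
cutpoints = statistics.quantiles(salary, n=4)

# Code each salary as 1..4; a value equal to a cutpoint goes to the lower bin.
salary_codes = [bisect_left(cutpoints, s) + 1 for s in salary]

# Cross-tabulate salary_codes (rows) by country (columns).
crosstab = Counter(zip(salary_codes, country))
for code in sorted(set(salary_codes)):
    print(code, [crosstab[(code, c)] for c in (0, 1)])
```

The resulting table of counts is what the chi-square test is then applied to.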
22.
3-Way Chi-square test
The results above suggest that ‘salary’ IS related to ‘country’. Suppose we think that this relationship is affected by the GMAT of the students. We can test this by creating a three-way cross-tabulation:
1. Re-code the GMAT variable into ‘GMAT_codes’, say two categories of ‘low’ and ‘high’, using ‘Visual Binning’ in the ‘Transform’ menu.
2. Repeat the chi-square test of ‘salary_codes’ by ‘country’. However, this time, enter the ‘GMAT_codes’ variable in the ‘layer’ box.
3. Run the model.

3-Way Chi-square test Result: [SPSS output table]
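What the ‘layer’ box does can be sketched as running a separate chi-square test within each GMAT category. The counts below are hypothetical, chosen only to show how an association in the pooled table can vanish within layers; they are not the SPSS output. Salary is reduced to two categories here to keep the tables small, and `chi_square` is an illustrative helper.

```python
def chi_square(table):
    """Pearson chi-square statistic and d.f. for a two-way table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df

# Hypothetical counts: rows = salary (low, high), columns = country (other, USA).
layers = {
    "low GMAT": [[40, 10], [8, 2]],
    "high GMAT": [[2, 8], [10, 40]],
}

# Adding the layers cell by cell gives the ordinary two-way table.
pooled = [[sum(cells) for cells in zip(*rows)] for rows in zip(*layers.values())]

print("pooled:", chi_square(pooled))
for name, table in layers.items():
    print(name, chi_square(table))
```

With these made-up counts the pooled statistic is large while each layer's statistic is zero, which is the pattern a three-way test is designed to reveal.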
23.
3-Way Chi-square test Result
The three-way chi-square test suggests that once we take into account the average GMAT of the students, there is no relationship between ‘salary’ and ‘country’. We can therefore conclude that the observed relationship between ‘salary’ and ‘country’ is in fact indirectly caused by the variation in ‘GMAT’.

Example: Was the Class Lottery Conducted According to the Rules?
In order to sample from the distribution of inter-arrival times at a supermarket checkout, we played a lottery. The results (shown overleaf) show that the simulated distribution looks very much like the distribution from which we are sampling. But they are not the same. So what are we looking at? Are we looking at two datasets generated by the same distribution, so that the differences can be attributed to random variation? Or are we, in fact, looking at two datasets that do not come from the same distribution, so that the differences are not random, as would be the case if the lottery were not conducted properly?
24.
Inter-arrival Distribution from Lottery

Inter-arrival   Inter-arrival   Observed    Expected
Time            Probability     Frequency   Frequency
1               0.39            76          71
2               0.17            43          31
3               0.13            18          24
4               0.09            13          16
5               0.06            11          11
6               0.05             5           9
7               0.03             6           5
8               0.02             2           4
9               0.06             9          11
TOTAL                           183         183

Analysis
• H0: The observed data from the lottery were generated by the same process as the original inter-arrival time distribution.
• Using the original probabilities, calculate the expected frequency for each inter-arrival time, out of the total of 183.
• Combine the inter-arrival time categories of 7 and 8 mins, since the expected frequency for 8 mins is small (< 5).
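The analysis steps above, followed by the usual sum of (observed − expected)² / expected, can be sketched in Python using the probabilities and observed frequencies from the table. Two choices here are assumptions rather than something the slides state: expected frequencies are left unrounded, and the tail probability comes from an illustrative helper, `upper_tail_odd_df`, which uses the closed-form series valid for odd degrees of freedom.

```python
import math

# Probabilities and observed frequencies copied from the table above.
probs = [0.39, 0.17, 0.13, 0.09, 0.06, 0.05, 0.03, 0.02, 0.06]
observed = [76, 43, 18, 13, 11, 5, 6, 2, 9]
n = sum(observed)  # 183

# Merge the 7- and 8-minute categories (list positions 6 and 7).
probs = probs[:6] + [probs[6] + probs[7]] + probs[8:]
observed = observed[:6] + [observed[6] + observed[7]] + observed[8:]

# Expected frequencies (unrounded) and the chi-square statistic.
expected = [p * n for p in probs]
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1  # goodness of fit: number of categories minus one

def upper_tail_odd_df(x, df):
    """P(chi-square >= x) for odd df, using the closed-form series."""
    total = math.erfc(math.sqrt(x / 2))
    term = math.sqrt(2 * x / math.pi) * math.exp(-x / 2)
    for k in range((df - 1) // 2):
        total += term
        term *= x / (2 * k + 3)
    return total

p_value = upper_tail_odd_df(chi2, df)
print("chi-square:", round(chi2, 2), "d.f.:", df, "p-value:", round(p_value, 3))
```

Using the rounded expected frequencies from the table instead would give a slightly different total, so the unrounded values are the safer input.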
25.
Analysis Cont’d
• Calculate the χ2. This is 9.37.
• The d.f. is k – 1, where k is the number of categories of inter-arrival time, which is 8 after combining the 7 and 8 min categories. So d.f. = 7.
• For d.f. = 7, the probability of a value of χ2 = 9.37 or higher is between 20% and 25%. This is not small.
• Decision: we cannot reject H0.
• Conclusion: the data are consistent with the lottery having been conducted according to the rules.

Further Reading
• Alan Agresti, 1996. ‘Introduction to Categorical Data Analysis’. John Wiley and Sons, London.