OBJECTIVES
• Recognise a distribution that is suitable for applying
  the chi-square test
• Conduct the goodness-of-fit test of hypothesis
• Conduct the test of independence
• Conduct a test of homogeneity



Chi square distribution
• Positively skewed
• Test done on right tail only
• Therefore all chi-square test statistics are positive, with
  one critical value only
• Basic steps of hypothesis test are the same, only
  the test statistic and distribution have changed


• The techniques used up to now analysed data
  measured on a quantitative scale.
• Results of tests can often be classified into categories
  where there is no natural order:
   – Categorical variable
   – Categories
   – Categorical data
• Categorical data can be analysed with Chi-squared
  tests:
   – Simple random sample
   – Sample size reasonably large

Example:
• Survey of job satisfaction
• Employed persons classified as satisfied, neutral,
  dissatisfied
CATEGORICAL VARIABLE – is employee satisfaction

CATEGORIES – satisfied, neutral, dissatisfied

CATEGORICAL DATA – no. of employees satisfied,
neutral or dissatisfied (also referred to as frequency of
category)
Examples
1. A person's income can be categorised as high,
   medium or low. Define the categorical variable,
   the categories and the categorical data
2. We want to investigate different types of
   industries, e.g. information technology,
   financial and transformation. Define the
   categorical variable, the categories and the
   categorical data

Example answers
1. Categorical variable is income. Categories are
   high, medium and low. Categorical data are the
   no. of people who have high, medium or low
   income
2. Categorical variable is type of industry,
   categories are information technology, financial
   and transformation. Categorical data are the no.
   of industries that are information tech, financial
   or transformation

• Chi-squared goodness-of-fit test
  – This test describes a single population of categorical data.
  – The multinomial experiment studied is an extension of the
    binomial experiment.
     • There are n independent trials.
     • The outcome of each trial can be classified into one of
       k categories.
     • The probability pi of cell i remains constant for each
       trial. Moreover, p1 + p2 + … +pk = 1.
  – The experiment records the observed number of trials that
    fall in each category.
  – These observed frequencies are denoted by f1, f2, …, fk,
    and f1 + f2 + … + fk = n.

EXAMPLE
In a box of smarties you will find 6 different colours:
brown, red, yellow, blue, orange and green. A
random sample of smarties (6918 in total) was
taken and the frequency of each colour was
counted. The distribution of colours is given below.
     Colour   Brown   Red    Yellow   Blue   Orange Green
        f     1611    1172   1308     904    921    1002


Determine whether the smarties survey fits the
description of a multinomial experiment

EXAMPLE
Answer:
See example 10.1, p350, textbook




To use the χ²-tests
•   The goodness-of-fit test
     – Used to determine if the observed counts of the
       categories agree with the probabilities specified for
       each category.
     – Observed frequencies (f) are compared with the expected
       frequencies (e).
     – All expected frequencies must be at least 5.

    Testing H0: Proportions agree with specified probabilities

     Alternative hypothesis: H1: H0 is not true
     Decision rule: Reject H0 if $\chi^2 > \chi^2_{k-1;\,1-\alpha}$
     Test statistic: $\chi^2 = \sum_{i=1}^{k} \frac{(f_i - e_i)^2}{e_i}$

• Example
      – A household detergent is marketed in three sizes:
          • 1 000 ml, 750 ml and 250 ml
      – The distributors believe that the market share of the different sizes is
        as follows:
          • 1 000 ml = 40%
          • 750 ml = 45%
          • 250 ml = 15%.
      – To study the effect of the economic climate on the sales of the
        products, 200 customers were asked to state which size they
        prefer.
 •   Survey results:
      – 82 customers preferred the 1 000 ml
      – 102 customers preferred the 750 ml
      – 16 customers preferred the 250 ml
• Solution
  –   The population investigated is the size preferences.
  –   The data are in categories.
  –   This is a multinomial experiment (three categories).
  –   The question of interest: Are p1, p2, and p3 different
      from the expected 40%, 45% and 15%?




• The hypotheses are:
         – H0: p1 = 0,40, p2 = 0,45, p3 = 0,15
         – H1: At least one pi is not equal to its specified value.
  Are the observed and the expected frequencies the same?

  Expected frequencies: ei = npi

  Size        pi     Observed fi    Expected ei
  1 000 ml    40%         82        40% of 200 = 80
  750 ml      45%        102        45% of 200 = 90
  250 ml      15%         16        15% of 200 = 30

  Expected frequencies are all ≥ 5.

• Test statistic:
  $\chi^2 = \sum \frac{(f_i - e_i)^2}{e_i} = \frac{(82-80)^2}{80} + \frac{(102-90)^2}{90} + \frac{(16-30)^2}{30}$ = 8,18
• Critical value (α = 0,05): $\chi^2_{k-1;\,1-\alpha} = \chi^2_{2;\,0{,}95}$ = 5,991
• Decision: 8,18 > 5,991, so reject H0.
• Conclusion: At 5% significance level there is
  sufficient evidence to reject the null hypothesis.
  At least one of the probabilities pi is different.
  Thus, at least two market shares have changed.

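The detergent calculation above can be reproduced in a few lines of Python. This is a minimal sketch, not a prescribed method: the function name `goodness_of_fit_statistic` is an illustrative assumption, and the critical value is simply read from the chi-square table (2 degrees of freedom, α = 0,05) rather than computed.

```python
def goodness_of_fit_statistic(observed, probabilities):
    """Return (chi-square statistic, expected frequencies) for a goodness-of-fit test."""
    n = sum(observed)
    expected = [n * p for p in probabilities]  # e_i = n * p_i
    statistic = sum((f - e) ** 2 / e for f, e in zip(observed, expected))
    return statistic, expected


observed = [82, 102, 16]            # 1 000 ml, 750 ml, 250 ml
probabilities = [0.40, 0.45, 0.15]  # market shares specified under H0
critical_value = 5.991              # table value: 2 degrees of freedom, alpha = 0.05

statistic, expected = goodness_of_fit_statistic(observed, probabilities)
print("Expected frequencies:", [round(e, 2) for e in expected])
print("Chi-square statistic:", round(statistic, 2))
print("Reject H0" if statistic > critical_value else "Do not reject H0")
```

The 250 ml category contributes most of the statistic, (16 – 30)²/30 ≈ 6,53 of the 8,18.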
Two friends were playing a board game in which a die played a
big role. One of the players believed that the die was not fair.
60 tosses of the die produced the results below. Test at 5%
significance level whether the die was fair.


Number of dots     1       2       3       4        5       6
Number of tosses   7       6       7       18       15      7




ei = npi
   = 60(1/6)
   = 10
Expected values for the six categories are: 10   10   10   10   10   10
H0: p1 = … = p6 = 1/6
H1: At least one pi ≠ 1/6
α = 0,05

$\chi^2 = \sum \frac{(f - e)^2}{e} = \frac{(7-10)^2}{10} + \dots + \frac{(7-10)^2}{10}$ = 13,2

Critical value: $\chi^2_{k-1;\,1-\alpha} = \chi^2_{5;\,0{,}95}$ = 11,07

Therefore, reject H0 (13,2 > 11,07). The probabilities of the dots are not equal and the die was not
fair.

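To see which faces of the die drive this result, the contribution (f – e)²/e of each category can be listed separately. The sketch below is only an illustration: the variable names are assumptions, and the critical value 11,07 is copied from the table value used above.

```python
observed = {1: 7, 2: 6, 3: 7, 4: 18, 5: 15, 6: 7}  # tosses per number of dots
n = sum(observed.values())                          # 60 tosses in total
expected = n / 6                                    # 10 per face if the die is fair
critical_value = 11.07                              # table value: 5 degrees of freedom, alpha = 0.05

# Contribution of each face to the chi-square statistic: (f - e)^2 / e
contributions = {face: (f - expected) ** 2 / expected for face, f in observed.items()}
statistic = sum(contributions.values())

for face, value in contributions.items():
    print(f"{face} dots: f = {observed[face]:2d}, contribution = {value:.1f}")
print(f"Chi-square statistic = {statistic:.1f}")
print("Reject H0" if statistic > critical_value else "Do not reject H0")
```

Faces 4 and 5 together account for 6,4 + 2,5 = 8,9 of the total 13,2.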
• Chi-squared test for independence
  – Cross-classify two categorical variables using a contingency table.
  – Rows represent the categories of one variable and columns
    represent the categories of the other variable.
  – Each cell value indicates the frequency of that
    cross-classification.
  – The table can have any number of rows (r) and columns (c):
     • r×c cells

CONCEPT QUESTIONS
• Questions 1 – 3 , p356




• Chi-squared test for independence
   – Contingency tables describe the relationship between two
     categorical variables.
   – H0: the two variables are independent – there is no relationship.
   – H1: the two variables are dependent – there is a relationship.

 For a 2×2 contingency table (observed frequencies):
                B
  A       B1        B2    Total
  A1      f11       f12    r1
  A2      f21       f22    r2
  Total   c1        c2     n

 For each observed frequency an expected frequency must be calculated:
   e = (row total × column total) / n

   $e_{11} = \frac{r_1 c_1}{n}$ ;  $e_{12} = \frac{r_1 c_2}{n}$
   $e_{21} = \frac{r_2 c_1}{n}$ ;  $e_{22} = \frac{r_2 c_2}{n}$

• Chi-squared test for independence

   Testing H0: Variables are independent

   Alternative hypothesis: H1: Variables are dependent
   Decision rule: Reject H0 if $\chi^2 > \chi^2_{(r-1)(c-1);\,1-\alpha}$
   Test statistic: $\chi^2 = \sum \frac{(f - e)^2}{e}$

• Example
 – A household detergent is marketed in three sizes:
    • 1 000 ml, 750 ml and 250 ml
 – The market for potential buyers is divided into three
   age groups:
    • < 30 years old
    • 30–50 years old
    • > 50 years old
 – Market researchers believe that there is a relationship
   between the age of a buyer and the size of the
   packaging.

• Solution
  – The data are summarised in a 3×3 contingency table.
  – H0: Size and age are independent.
  – H1: Size and age are dependent.

  Observed frequencies:
                     Age groups
  Size         < 30     30–50      > 50      Total
  1 000 ml      27        41        14         82
  750 ml        39        18        45        102
  250 ml         8         2         6         16
  Total         74        61        65        200

• Solution
  – Calculate the expected frequency for each cell:
    (row total × column total)/n
  – For example, for 1 000 ml and < 30: (74×82)/200 = 30,34

  Observed frequencies, with expected frequencies in brackets:
                        Age groups
  Size         < 30          30–50         > 50          Total
  1 000 ml     27 (30,34)    41 (25,01)    14 (26,65)      82
  750 ml       39 (37,74)    18 (31,11)    45 (33,15)     102
  250 ml        8 (5,92)      2 (4,88)      6 (5,20)       16
  Total        74            61            65             200

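The expected frequencies in this table follow mechanically from the row and column totals. A minimal sketch, assuming the observed table is stored as a list of rows (the function name `expected_frequencies` is illustrative only):

```python
def expected_frequencies(table):
    """Expected cell frequencies under independence: row total * column total / n."""
    row_totals = [sum(row) for row in table]
    column_totals = [sum(column) for column in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in column_totals] for r in row_totals]


# Rows: 1 000 ml, 750 ml, 250 ml; columns: < 30, 30-50, > 50
observed = [
    [27, 41, 14],
    [39, 18, 45],
    [8, 2, 6],
]

expected = expected_frequencies(observed)
for row in expected:
    print([round(e, 2) for e in row])

# Flag cells that fall below the "at least 5" guideline for expected frequencies
small_cells = [round(e, 2) for row in expected for e in row if e < 5]
print("Expected frequencies below 5:", small_cells)
```

One cell (250 ml, 30–50 years) has an expected frequency of 4,88, marginally below the guideline of 5 stated for the χ²-tests; the worked example proceeds with the test regardless.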
• The hypotheses are:
  – H0: Size and age are independent
  – H1: Size and age are dependent
• Test statistic:
  $\chi^2 = \sum \frac{(f - e)^2}{e} = \frac{(27-30{,}34)^2}{30{,}34} + \frac{(41-25{,}01)^2}{25{,}01} + \dots + \frac{(6-5{,}20)^2}{5{,}20}$ = 28,95
• Critical value (α = 0,05): $\chi^2_{(r-1)(c-1);\,1-\alpha} = \chi^2_{(3-1)(3-1);\,0{,}95}$ = 9,49
• Decision: 28,95 > 9,49, so reject H0.
• Conclusion: At 5% significance level there is
  sufficient evidence to reject the null hypothesis.
  There is a relationship between the size of detergent
  that people prefer and their age.

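The whole independence test for the detergent data can be sketched as one small function that returns the statistic and the degrees of freedom (r – 1)(c – 1). The function name is an assumption for illustration, and the critical value 9,49 is the table value quoted above, not something the code derives.

```python
def chi_square_independence(table):
    """Return (statistic, degrees of freedom) for an r x c contingency table."""
    row_totals = [sum(row) for row in table]
    column_totals = [sum(column) for column in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, f in enumerate(row):
            e = row_totals[i] * column_totals[j] / n  # expected frequency of cell (i, j)
            statistic += (f - e) ** 2 / e
    degrees_of_freedom = (len(row_totals) - 1) * (len(column_totals) - 1)
    return statistic, degrees_of_freedom


# Rows: 1 000 ml, 750 ml, 250 ml; columns: < 30, 30-50, > 50
observed = [
    [27, 41, 14],
    [39, 18, 45],
    [8, 2, 6],
]
critical_value = 9.49  # table value: 4 degrees of freedom, alpha = 0.05

statistic, df = chi_square_independence(observed)
print(f"Chi-square statistic = {statistic:.2f} with {df} degrees of freedom")
print("Reject H0" if statistic > critical_value else "Do not reject H0")
```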
A recent survey of marketing managers in four different industries provided
the data in the table below, which gives managers' attitudes to market
research and its value in marketing decision making:

                                        INDUSTRY TYPE
Perceived value       Consumer      Industrial       Retail &     Finance &
of market research    businesses    organisations    wholesale    insurance
Little value               9             22              13            9
Moderate value            29             41               6           17
Great value               26             28               6           27
TOTAL                     64             91              25           53

Test at the 1% level of significance whether managers' perception of the value
of market research is dependent on the type of industry in which a
marketing manager is employed.

                                        Industry type
Perceived value of    Consumer      Industrial       Retail and    Finance and
market research       businesses    organisations    wholesale     insurance      Total
Little value           9 (14,56)    22 (20,7)        13 (5,69)      9 (12,06)       53
Moderate value        29 (25,55)    41 (36,32)        6 (9,98)     17 (21,15)       93
Great value           26 (23,9)     28 (33,98)        6 (9,33)     27 (19,79)       87
Total                 64            91               25            53              233

H0: Managers' perception is independent of industry type.
H1: Managers' perception is dependent on industry type.
α = 0,01

$\chi^2 = \sum \frac{(f - e)^2}{e} = \frac{(9-14{,}56)^2}{14{,}56} + \dots + \frac{(27-19{,}79)^2}{19{,}79}$ = 20,895

Critical value: $\chi^2_{(r-1)(c-1);\,1-\alpha} = \chi^2_{6;\,0{,}99}$ = 16,81

Therefore, reject H0 (20,895 > 16,81). Managers' perception is dependent on the industry type.

Questions 4 – 6, p361, textbook




• Chi-squared Test of Homogeneity
  – Tests whether two or more populations are homogeneous (similar)
    with regard to a certain characteristic.
  – H0: The proportions of elements with a certain characteristic in
    two or more different populations are the same.
  – H1: The proportions of elements with a certain characteristic in
    two or more different populations are not the same.
  – The rest of the test is the same as the test for
    independence.

An immigration attorney was investigating which industries to
target for obtaining new clients who might have problems with
changes in the immigration laws. The lawyer selected five
industries; twenty workers were randomly selected in each
industry and their visa statuses were verified.

  VISA STATUS                    INDUSTRY
                       A      B      C      D      E
  Illegal resident     8     10      5     10      1
  Legal resident       4      2      6      4      9
  SA citizen           8      8      9      6     10

Test at a 1% level of significance whether the 5 industries are
homogeneous with respect to the visa status of their workers.

                                    Industry
Visa status          A          B          C          D          E          Total
Illegal resident     8 (6,8)    10 (6,8)   5 (6,8)    10 (6,8)   1 (6,8)      34
Legal resident       4 (5)      2 (5)      6 (5)      4 (5)      9 (5)        25
SA citizen           8 (8,2)    8 (8,2)    9 (8,2)    6 (8,2)    10 (8,2)     41
Total                20         20         20         20         20          100

H0: Five industries are homogeneous with respect to the visa status of their
workers.
H1: Five industries are heterogeneous with respect to the visa status of their
workers.
α = 0,01

$\chi^2 = \sum \frac{(f - e)^2}{e} = \frac{(8-6{,}8)^2}{6{,}8} + \dots + \frac{(10-8{,}2)^2}{8{,}2}$ = 15,32

Critical value: $\chi^2_{(r-1)(c-1);\,1-\alpha} = \chi^2_{8;\,0{,}99}$ = 20,09

Therefore, do not reject H0 (15,32 < 20,09). The five industries are homogeneous with respect to
the visa status of their workers.

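Under H0 every industry shares the same visa-status proportions, so each expected frequency is the industry's sample size times the pooled proportion for that status, which is the same number as (row total × column total)/n. The sketch below makes that explicit; the variable names are illustrative assumptions and the critical value 20,09 is the table value quoted above.

```python
observed = {
    "Illegal resident": [8, 10, 5, 10, 1],  # industries A, B, C, D, E
    "Legal resident": [4, 2, 6, 4, 9],
    "SA citizen": [8, 8, 9, 6, 10],
}
critical_value = 20.09  # table value: 8 degrees of freedom, alpha = 0.01

sample_sizes = [sum(column) for column in zip(*observed.values())]  # 20 workers per industry
n = sum(sample_sizes)

statistic = 0.0
pooled = {}
for status, counts in observed.items():
    pooled[status] = sum(counts) / n  # pooled proportion of this status under H0
    for f, size in zip(counts, sample_sizes):
        e = size * pooled[status]     # equals row total * column total / n
        statistic += (f - e) ** 2 / e

for status, proportion in pooled.items():
    print(f"{status}: pooled proportion = {proportion:.2f}")
print(f"Chi-square statistic = {statistic:.2f}")
print("Reject H0" if statistic > critical_value else "Do not reject H0")
```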
CLASSWORK/HOMEWORK
1. Activity 1,2,3,4 – p168 – 174, Module
   Manual
2. Revision exercises 1, 2, 3, 4 – p174–176,
   Module Manual
3. Self Review Test – 1 – 4, p 368, textbook
4. Supplementary Exercises 1 – 11, p370,
   textbook


                                               35

Mais conteúdo relacionado

Mais procurados

MTH120_Chapter9
MTH120_Chapter9MTH120_Chapter9
MTH120_Chapter9Sida Say
 
C2 st lecture 11 the t-test handout
C2 st lecture 11   the t-test handoutC2 st lecture 11   the t-test handout
C2 st lecture 11 the t-test handoutfatima d
 
law of large number and central limit theorem
 law of large number and central limit theorem law of large number and central limit theorem
law of large number and central limit theoremlovemucheca
 
Probability Distributions for Continuous Variables
Probability Distributions for Continuous VariablesProbability Distributions for Continuous Variables
Probability Distributions for Continuous Variablesgetyourcheaton
 
Statistical inference: Probability and Distribution
Statistical inference: Probability and DistributionStatistical inference: Probability and Distribution
Statistical inference: Probability and DistributionEugene Yan Ziyou
 
Hypothesis Testing-Z-Test
Hypothesis Testing-Z-TestHypothesis Testing-Z-Test
Hypothesis Testing-Z-TestRoger Binschus
 
Probability Distribution & Modelling
Probability Distribution & ModellingProbability Distribution & Modelling
Probability Distribution & ModellingNakshita1704
 
Introduction To Data Science Using R
Introduction To Data Science Using RIntroduction To Data Science Using R
Introduction To Data Science Using RANURAG SINGH
 
Week7 Quiz Help 2009[1]
Week7 Quiz Help 2009[1]Week7 Quiz Help 2009[1]
Week7 Quiz Help 2009[1]Brent Heard
 
Tbs910 sampling hypothesis regression
Tbs910 sampling hypothesis regressionTbs910 sampling hypothesis regression
Tbs910 sampling hypothesis regressionStephen Ong
 

Mais procurados (19)

MTH120_Chapter9
MTH120_Chapter9MTH120_Chapter9
MTH120_Chapter9
 
Goodness of fit test
Goodness of fit testGoodness of fit test
Goodness of fit test
 
C2 st lecture 11 the t-test handout
C2 st lecture 11   the t-test handoutC2 st lecture 11   the t-test handout
C2 st lecture 11 the t-test handout
 
Chap09 hypothesis testing
Chap09 hypothesis testingChap09 hypothesis testing
Chap09 hypothesis testing
 
law of large number and central limit theorem
 law of large number and central limit theorem law of large number and central limit theorem
law of large number and central limit theorem
 
Les5e ppt 06
Les5e ppt 06Les5e ppt 06
Les5e ppt 06
 
Les5e ppt 05
Les5e ppt 05Les5e ppt 05
Les5e ppt 05
 
Les5e ppt 04
Les5e ppt 04Les5e ppt 04
Les5e ppt 04
 
6. point and interval estimation
6. point and interval estimation6. point and interval estimation
6. point and interval estimation
 
Probability Distributions for Continuous Variables
Probability Distributions for Continuous VariablesProbability Distributions for Continuous Variables
Probability Distributions for Continuous Variables
 
Statistical inference: Probability and Distribution
Statistical inference: Probability and DistributionStatistical inference: Probability and Distribution
Statistical inference: Probability and Distribution
 
Probability distributions & expected values
Probability distributions & expected valuesProbability distributions & expected values
Probability distributions & expected values
 
Msb12e ppt ch06
Msb12e ppt ch06Msb12e ppt ch06
Msb12e ppt ch06
 
Hypothesis Testing-Z-Test
Hypothesis Testing-Z-TestHypothesis Testing-Z-Test
Hypothesis Testing-Z-Test
 
Probability Distribution & Modelling
Probability Distribution & ModellingProbability Distribution & Modelling
Probability Distribution & Modelling
 
Introduction To Data Science Using R
Introduction To Data Science Using RIntroduction To Data Science Using R
Introduction To Data Science Using R
 
Testing of hypothesis
Testing of hypothesisTesting of hypothesis
Testing of hypothesis
 
Week7 Quiz Help 2009[1]
Week7 Quiz Help 2009[1]Week7 Quiz Help 2009[1]
Week7 Quiz Help 2009[1]
 
Tbs910 sampling hypothesis regression
Tbs910 sampling hypothesis regressionTbs910 sampling hypothesis regression
Tbs910 sampling hypothesis regression
 

Destaque

Destaque (8)

Chi square test
Chi square test Chi square test
Chi square test
 
Chi square test
Chi square testChi square test
Chi square test
 
Chi square2012
Chi square2012Chi square2012
Chi square2012
 
Chi sequare
Chi sequareChi sequare
Chi sequare
 
Presentation week 8
Presentation week 8Presentation week 8
Presentation week 8
 
Chi square[1]
Chi square[1]Chi square[1]
Chi square[1]
 
Chi square test final
Chi square test finalChi square test final
Chi square test final
 
Chi Square Worked Example
Chi Square Worked ExampleChi Square Worked Example
Chi Square Worked Example
 

Semelhante a Statistics lecture 10(ch10)

P value in proportion
P value in proportionP value in proportion
P value in proportionNadeem Uddin
 
Tests of significance
Tests of significanceTests of significance
Tests of significanceTanay Tandon
 
Hypothesis testing1
Hypothesis testing1Hypothesis testing1
Hypothesis testing1HanaaBayomy
 
Intro to tests of significance qualitative
Intro to tests of significance qualitativeIntro to tests of significance qualitative
Intro to tests of significance qualitativePandurangi Raghavendra
 
8. testing of hypothesis for variable &amp; attribute data
8. testing of hypothesis for variable &amp; attribute  data8. testing of hypothesis for variable &amp; attribute  data
8. testing of hypothesis for variable &amp; attribute dataHakeem-Ur- Rehman
 
Statr session 17 and 18 (ASTR)
Statr session 17 and 18 (ASTR)Statr session 17 and 18 (ASTR)
Statr session 17 and 18 (ASTR)Ruru Chowdhury
 
Statr session 17 and 18
Statr session 17 and 18Statr session 17 and 18
Statr session 17 and 18Ruru Chowdhury
 
Topic 7 stat inference
Topic 7 stat inferenceTopic 7 stat inference
Topic 7 stat inferenceSizwan Ahammed
 
Malimu statistical significance testing.
Malimu statistical significance testing.Malimu statistical significance testing.
Malimu statistical significance testing.Miharbi Ignasm
 
Midterm 3 review
Midterm 3 reviewMidterm 3 review
Midterm 3 reviewdrahkos1
 
Test of-significance : Z test , Chi square test
Test of-significance : Z test , Chi square testTest of-significance : Z test , Chi square test
Test of-significance : Z test , Chi square testdr.balan shaikh
 
08 test of hypothesis large sample.ppt
08 test of hypothesis large sample.ppt08 test of hypothesis large sample.ppt
08 test of hypothesis large sample.pptPooja Sakhla
 

Semelhante a Statistics lecture 10(ch10) (20)

Data analysis
Data analysisData analysis
Data analysis
 
Session 14
 Session 14 Session 14
Session 14
 
P value in proportion
P value in proportionP value in proportion
P value in proportion
 
Tests of significance
Tests of significanceTests of significance
Tests of significance
 
Hypothesis testing1
Hypothesis testing1Hypothesis testing1
Hypothesis testing1
 
Intro to tests of significance qualitative
Intro to tests of significance qualitativeIntro to tests of significance qualitative
Intro to tests of significance qualitative
 
8. testing of hypothesis for variable &amp; attribute data
8. testing of hypothesis for variable &amp; attribute  data8. testing of hypothesis for variable &amp; attribute  data
8. testing of hypothesis for variable &amp; attribute data
 
Statr session 17 and 18 (ASTR)
Statr session 17 and 18 (ASTR)Statr session 17 and 18 (ASTR)
Statr session 17 and 18 (ASTR)
 
Statr session 17 and 18
Statr session 17 and 18Statr session 17 and 18
Statr session 17 and 18
 
Topic 7 stat inference
Topic 7 stat inferenceTopic 7 stat inference
Topic 7 stat inference
 
Malimu statistical significance testing.
Malimu statistical significance testing.Malimu statistical significance testing.
Malimu statistical significance testing.
 
Sample size
Sample sizeSample size
Sample size
 
Midterm 3 review
Midterm 3 reviewMidterm 3 review
Midterm 3 review
 
Test of-significance : Z test , Chi square test
Test of-significance : Z test , Chi square testTest of-significance : Z test , Chi square test
Test of-significance : Z test , Chi square test
 
08 test of hypothesis large sample.ppt
08 test of hypothesis large sample.ppt08 test of hypothesis large sample.ppt
08 test of hypothesis large sample.ppt
 
Hypothesis - Biostatistics
Hypothesis - BiostatisticsHypothesis - Biostatistics
Hypothesis - Biostatistics
 
Chapter10 Revised
Chapter10 RevisedChapter10 Revised
Chapter10 Revised
 
Chapter10 Revised
Chapter10 RevisedChapter10 Revised
Chapter10 Revised
 
Chapter10 Revised
Chapter10 RevisedChapter10 Revised
Chapter10 Revised
 
Test of hypothesis 1
Test of hypothesis 1Test of hypothesis 1
Test of hypothesis 1
 

Mais de jillmitchell8778

Mais de jillmitchell8778 (20)

Revision workshop 17 january 2013
Revision workshop 17 january 2013Revision workshop 17 january 2013
Revision workshop 17 january 2013
 
Statistics lecture 13 (chapter 13)
Statistics lecture 13 (chapter 13)Statistics lecture 13 (chapter 13)
Statistics lecture 13 (chapter 13)
 
Statistics lecture 12 (chapter 12)
Statistics lecture 12 (chapter 12)Statistics lecture 12 (chapter 12)
Statistics lecture 12 (chapter 12)
 
Statistics lecture 11 (chapter 11)
Statistics lecture 11 (chapter 11)Statistics lecture 11 (chapter 11)
Statistics lecture 11 (chapter 11)
 
Statistics lecture 8 (chapter 7)
Statistics lecture 8 (chapter 7)Statistics lecture 8 (chapter 7)
Statistics lecture 8 (chapter 7)
 
Qr code lecture
Qr code lectureQr code lecture
Qr code lecture
 
Poisson lecture
Poisson lecturePoisson lecture
Poisson lecture
 
Statistics lecture 7 (ch6)
Statistics lecture 7 (ch6)Statistics lecture 7 (ch6)
Statistics lecture 7 (ch6)
 
Normal lecture
Normal lectureNormal lecture
Normal lecture
 
Binomial lecture
Binomial lectureBinomial lecture
Binomial lecture
 
Statistics lecture 6 (ch5)
Statistics lecture 6 (ch5)Statistics lecture 6 (ch5)
Statistics lecture 6 (ch5)
 
Project admin lu3
Project admin   lu3Project admin   lu3
Project admin lu3
 
Priject admin lu 2
Priject admin   lu 2Priject admin   lu 2
Priject admin lu 2
 
Project admin lu 1
Project admin   lu 1Project admin   lu 1
Project admin lu 1
 
Lu5 how to assess a business opportunity
Lu5   how to assess a business opportunityLu5   how to assess a business opportunity
Lu5 how to assess a business opportunity
 
Lu4 – life cycle stages of a business
Lu4 – life cycle stages of a businessLu4 – life cycle stages of a business
Lu4 – life cycle stages of a business
 
Statistics lecture 5 (ch4)
Statistics lecture 5 (ch4)Statistics lecture 5 (ch4)
Statistics lecture 5 (ch4)
 
Lu 3
Lu 3Lu 3
Lu 3
 
Learning unit 2
Learning unit 2Learning unit 2
Learning unit 2
 
Lu 3
Lu 3Lu 3
Lu 3
 

Statistics lecture 10(ch10)

  • 1. 1
  • 2. OBJECTIVES
       • Recognise a suitable distribution to apply the chi-square test to
       • Conduct the goodness-of-fit test of hypothesis
       • Conduct the test of independence
       • Conduct a test of homogeneity
  • 3. Chi-square distribution
       • Positively skewed
       • Test done on the right tail only
       • Therefore all chi-square test statistics are positive, with one critical value only
       • The basic steps of the hypothesis test are the same; only the test statistic and the distribution have changed
  • 4. The techniques used up to now analysed data measured on a quantitative scale.
       • Results of tests can often be classified into categories where there is no natural order:
          – Categorical variable
          – Categories
          – Categorical data
       • Categorical data can be analysed with chi-squared tests, provided there is:
          – A simple random sample
          – A reasonably large sample size
  • 5. Example:
       • Survey of job satisfaction
       • Employed persons classified as satisfied, neutral, dissatisfied
       CATEGORICAL VARIABLE – employee satisfaction
       CATEGORIES – satisfied, neutral, dissatisfied
       CATEGORICAL DATA – no. of employees satisfied, neutral or dissatisfied (also referred to as the frequency of a category)
  • 6. Examples
       1. A person's income can be categorised as high, medium or low. Define the categorical variable, the categories and the categorical data.
       2. We want to investigate different types of industries, e.g. information technology, financial and transformation. Define the categorical variable, the categories and the categorical data.
  • 7. Example answers
       1. The categorical variable is income. The categories are high, medium and low. The categorical data are the no. of people who have high, medium or low income.
       2. The categorical variable is type of industry. The categories are information technology, financial and transformation. The categorical data are the no. of industries that are information technology, financial or transformation.
  • 8. Chi-squared goodness-of-fit test
       – This test describes a single population of categorical data.
       – The multinomial experiment studied is an extension of the binomial experiment.
          • There are n independent trials.
          • The outcome of each trial can be classified into one of k categories.
          • The probability $p_i$ of cell i remains constant for each trial. Moreover, $p_1 + p_2 + \dots + p_k = 1$.
       – The experiment records the observed number of trials falling in each category.
       – These counts are denoted by $f_1, f_2, \dots, f_k$, with $f_1 + f_2 + \dots + f_k = n$.
  • 9. EXAMPLE
       In a box of Smarties you will find 6 different colours: brown, red, yellow, blue, orange and green. A random sample of Smarties (6918 in total) was taken and the frequency of each colour was counted. The distribution of colours is given below.

       Colour   Brown   Red    Yellow   Blue   Orange   Green
       f        1611    1172   1308     904    921      1002

       Determine whether the Smarties survey fits the description of a multinomial experiment.
  • 11. The goodness-of-fit test
       – Used to determine if the observed counts of the categories agree with the probabilities specified for each category.
       – Observed frequencies (f) are compared with the expected frequencies (e).
       – To use the $\chi^2$-tests, all expected frequencies must be at least 5.

       Testing H0: Proportions agree with specified probabilities
       Alternative hypothesis: H1: H0 is not true
       Decision rule: Reject H0 if $\chi^2 > \chi^2_{k-1;\,1-\alpha}$
       Test statistic: $\chi^2 = \sum_{i=1}^{k} \frac{(f_i - e_i)^2}{e_i}$
  • 12. Example
       – A household detergent is marketed in three sizes:
          • 1 000 ml, 750 ml and 250 ml
       – The distributors believe that the market share of the different sizes is as follows:
          • 1 000 ml = 40%
          • 750 ml = 45%
          • 250 ml = 15%
       – To study the effect of the economic climate on the sales of the products, 200 customers were asked to state which size they would prefer.
       – Survey results:
          • 82 customers preferred the 1 000 ml
          • 102 customers preferred the 750 ml
          • 16 customers preferred the 250 ml
  • 13. Solution
       – The population investigated is the size preferences.
       – The data are in categories.
       – This is a multinomial experiment (three categories).
       – The question of interest: Are $p_1$, $p_2$ and $p_3$ different from the expected 40%, 45% and 15%?
  • 14. The hypotheses are:
       – H0: $p_1 = 0{,}40$, $p_2 = 0{,}45$, $p_3 = 0{,}15$
       – H1: At least one $p_i$ is not equal to its specified value.
       Are the observed and the expected frequencies the same?

       Observed values: 1 000 ml = 82, 750 ml = 102, 250 ml = 16
       Expected frequencies, $e_i = n p_i$:
          1 000 ml: 40% of 200 = 80
          750 ml:   45% of 200 = 90
          250 ml:   15% of 200 = 30
       Expected frequencies are all ≥ 5.
  • 15. The hypotheses are:
       – H0: $p_1 = 0{,}40$, $p_2 = 0{,}45$, $p_3 = 0{,}15$
       – H1: At least one $p_i$ is not equal to its specified value.
       Are the observed and the expected frequencies the same?

       Size       Observed (f)   Specified share   Expected ($e_i = n p_i$)
       1 000 ml   82             40%               40% of 200 = 80
       750 ml     102            45%               45% of 200 = 90
       250 ml     16             15%               15% of 200 = 30
  • 16. The hypotheses are:
       – H0: $p_1 = 0{,}40$, $p_2 = 0{,}45$, $p_3 = 0{,}15$
       – H1: At least one $p_i$ is not equal to its specified value.

       α = 0,05
       Critical value: $\chi^2_{k-1;\,1-\alpha} = \chi^2_{2;\,0{,}95} = 5{,}9915$ (accept H0 below this value, reject H0 above it)
       Test statistic:
       $\chi^2 = \sum \frac{(f_i - e_i)^2}{e_i} = \frac{(82-80)^2}{80} + \frac{(102-90)^2}{90} + \frac{(16-30)^2}{30} = 8{,}18$

       – Since 8,18 > 5,9915, reject H0.
       Conclusion: At the 5% significance level there is sufficient evidence to reject the null hypothesis. At least one of the probabilities $p_i$ is different. Thus, at least two market shares have changed.
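
The arithmetic in the detergent example can be checked with a short script. This is only a sketch in plain Python: the function name `chi_square_gof` and the variable names are illustrative assumptions, not part of the lecture, and the critical value is hard-coded from a chi-square table (2 degrees of freedom, α = 0,05) rather than computed.

```python
def chi_square_gof(observed, probabilities):
    """Return (statistic, expected) for a chi-squared goodness-of-fit test."""
    n = sum(observed)
    # Expected frequency for each category: e_i = n * p_i
    expected = [n * p for p in probabilities]
    # The slides require every expected frequency to be at least 5.
    if min(expected) < 5:
        raise ValueError("all expected frequencies must be at least 5")
    # Test statistic: sum of (f_i - e_i)^2 / e_i over all categories
    statistic = sum((f - e) ** 2 / e for f, e in zip(observed, expected))
    return statistic, expected


# Survey results for the 1 000 ml, 750 ml and 250 ml sizes
observed = [82, 102, 16]
probabilities = [0.40, 0.45, 0.15]

# Table value for k - 1 = 2 degrees of freedom at alpha = 0.05 (assumed, not computed)
CRITICAL_VALUE = 5.9915

statistic, expected = chi_square_gof(observed, probabilities)
reject_h0 = statistic > CRITICAL_VALUE
print(f"chi-square = {statistic:.2f}, reject H0: {reject_h0}")
```

The same function can be reused for any goodness-of-fit problem by changing the observed counts, the specified probabilities and the table value.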
  • 17. Two friends were playing a board game in which a die played a big role. One of the players believed that the die was not fair. 60 tosses of the die produced the results below. Test at the 5% significance level whether the die was fair.

       Number of dots     1   2   3   4    5    6
       Number of tosses   7   6   7   18   15   7
  • 18. $e_i = n p_i = 60(1/6) = 10$
       Expected values for the six categories are: 10 10 10 10 10 10

       H0: $p_1 = \dots = p_6 = 1/6$
       H1: At least one $p_i \neq 1/6$
       α = 0,05
       Test statistic:
       $\chi^2 = \sum \frac{(f - e)^2}{e} = \frac{(7-10)^2}{10} + \dots + \frac{(7-10)^2}{10} = 13{,}2$
       Critical value: $\chi^2_{k-1;\,1-\alpha} = \chi^2_{5;\,0{,}95} = 11{,}07$ (accept H0 below this value, reject H0 above it)

       Since 13,2 > 11,07, reject H0. The probabilities of the dots are not equal and the die was not fair.
  • 19. Chi-squared test for independence
       – Cross-classify two categorical variables using a contingency table.
       – Rows represent one variable and columns represent the other.
       – Each cell value indicates the frequency of that cross-classification.
       – The table can have any number of rows and columns:
          • r×c number of cells
  • 20. CONCEPT QUESTIONS
       • Questions 1 – 3, p356
  • 21. Chi-squared test for independence
       – H0: the two variables are independent – there is no relationship.
       – H1: the two variables are dependent – there is a relationship.
       For a 2×2 contingency table (observed frequencies):

                   B
       A           B1     B2     Total
       A1          f11    f12    r1
       A2          f21    f22    r2
       Total       c1     c2     n
  • 22. Chi-squared test for independence
       – Contingency tables describe the relationship between two categorical variables.
       – H0: the two variables are independent – there is no relationship.
       – H1: the two variables are dependent – there is a relationship.
       For a 2×2 contingency table:

                   B
       A           B1     B2     Total
       A1          f11    f12    r1
       A2          f21    f22    r2
       Total       c1     c2     n

       For each observed frequency an expected frequency must be calculated:
       $e = \frac{\text{row total} \times \text{column total}}{n}$
  • 23. Chi-squared test for independence
       – Contingency tables describe the relationship between two categorical variables.
       – H0: the two variables are independent – there is no relationship.
       – H1: the two variables are dependent – there is a relationship.
       For a 2×2 contingency table:

                   B
       A           B1     B2     Total
       A1          f11    f12    r1
       A2          f21    f22    r2
       Total       c1     c2     n

       $e = \frac{\text{row total} \times \text{column total}}{n}$
       $e_{11} = (r_1 \times c_1)/n$ ; $e_{12} = (r_1 \times c_2)/n$
       $e_{21} = (r_2 \times c_1)/n$ ; $e_{22} = (r_2 \times c_2)/n$
  • 24. Chi-squared test for independence
       – H0: the two variables are independent – there is no relationship.
       – H1: the two variables are dependent – there is a relationship.

       Testing H0: Variables are independent
       Alternative hypothesis: H1: Variables are dependent
       Decision rule: Reject H0 if $\chi^2 > \chi^2_{(r-1)(c-1);\,1-\alpha}$
       Test statistic: $\chi^2 = \sum \frac{(f - e)^2}{e}$
  • 25. Example
       – A household detergent is marketed in three sizes:
          • 1 000 ml, 750 ml and 250 ml
       – The market for potential buyers is divided into three age groups:
          • < 30 years old
          • 30–50 years old
          • > 50 years old
       – Market researchers believe that there is a relationship between the age of a buyer and the size of the packaging.
  • 26. Solution
       – The data are summarised in a 3×3 contingency table.
       – H0: Size and age are independent.
       – H1: Size and age are dependent.

       Observed frequencies
                   Age groups
       Size        < 30   30–50   > 50   Total
       1 000 ml    27     41      14     82
       750 ml      39     18      45     102
       250 ml      8      2       6      16
       Total       74     61      65     200
  • 27. Solution
       – Calculate the expected frequency for each cell: (row total × column total)/n
       – For example, for 1 000 ml and < 30: (74×82)/200 = 30,34

       Observed frequencies, with expected frequencies in brackets
                   Age groups
       Size        < 30         30–50        > 50         Total
       1 000 ml    27 (30,34)   41 (25,01)   14 (26,65)   82
       750 ml      39 (37,74)   18 (31,11)   45 (33,15)   102
       250 ml      8 (5,92)     2 (4,88)     6 (5,20)     16
       Total       74           61           65           200
  • 28. The hypotheses are:
       – H0: Size and age are independent
       – H1: Size and age are dependent

       α = 0,05
       Critical value: $\chi^2_{(r-1)(c-1);\,1-\alpha} = \chi^2_{(3-1)(3-1);\,0{,}95} = 9{,}49$ (accept H0 below this value, reject H0 above it)
       Test statistic:
       $\chi^2 = \sum \frac{(f - e)^2}{e} = \frac{(27-30{,}34)^2}{30{,}34} + \frac{(41-25{,}01)^2}{25{,}01} + \dots + \frac{(6-5{,}20)^2}{5{,}20} = 28{,}95$

       – Since 28,95 > 9,49, reject H0.
       Conclusion: At the 5% significance level there is sufficient evidence to reject the null hypothesis. There is a relationship between the size of detergent that people prefer and their age.
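
The expected frequencies and the test statistic for the size-by-age table can be reproduced as follows. This is a minimal sketch in plain Python; the helper names `expected_frequencies` and `chi_square_independence` are assumptions made for illustration, and the critical value 9,49 is taken from a chi-square table (4 degrees of freedom, α = 0,05) rather than computed.

```python
def expected_frequencies(table):
    """Expected count per cell: (row total * column total) / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi_square_independence(table):
    """Return (statistic, degrees of freedom, expected) for an r x c table."""
    expected = expected_frequencies(table)
    statistic = sum(
        (f - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for f, e in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df, expected


# Rows: 1 000 ml, 750 ml, 250 ml; columns: < 30, 30-50, > 50
observed = [
    [27, 41, 14],
    [39, 18, 45],
    [8, 2, 6],
]

# Table value for (3 - 1)(3 - 1) = 4 degrees of freedom at alpha = 0.05 (assumed)
CRITICAL_VALUE = 9.49

statistic, df, expected = chi_square_independence(observed)
reject_h0 = statistic > CRITICAL_VALUE
print(f"chi-square = {statistic:.2f}, df = {df}, reject H0: {reject_h0}")
```

Because the function works on any list of rows, the same sketch applies to larger tables such as a 3×4 layout; only the table value needs to change with the degrees of freedom.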
  • 29. A recent survey of marketing managers in four different industries provided the data in the table below, which gives managers' attitudes to market research and its value in marketing decision making:

                                     INDUSTRY TYPE
       Perceived value of    Consumer     Industrial      Retail &     Finance &
       market research       businesses   organisations   wholesale    insurance
       Little value          9            22              13           9
       Moderate value        29           41              6            17
       Great value           26           28              6            27
       TOTAL                 64           91              25           53

       Test at the 1% level of significance whether managers' perception of the value of market research is dependent on the type of industry in which a marketing manager is employed.
  • 30. Observed frequencies, with expected frequencies in brackets

                                     Industry type
       Perceived value of    Consumer     Industrial      Retail and   Finance and
       market research       businesses   organisations   wholesale    insurance     Total
       Little value          9 (14,56)    22 (20,7)       13 (5,69)    9 (12,06)     53
       Moderate value        29 (25,55)   41 (36,32)      6 (9,98)     17 (21,15)    93
       Great value           26 (23,9)    28 (33,98)      6 (9,33)     27 (19,79)    87
       Total                 64           91              25           53            233

       H0: Managers' perception is independent of industry type.
       H1: Managers' perception is dependent on industry type.
       α = 0,01
       Test statistic:
       $\chi^2 = \sum \frac{(f - e)^2}{e} = \frac{(9-14{,}56)^2}{14{,}56} + \dots + \frac{(27-19{,}79)^2}{19{,}79} = 20{,}895$
       Critical value: $\chi^2_{(r-1)(c-1);\,1-\alpha} = \chi^2_{6;\,0{,}99} = 16{,}81$ (accept H0 below this value, reject H0 above it)

       Since 20,895 > 16,81, reject H0. Managers' perception is dependent on the industry type.
  • 31. Questions 4 – 6, p361, textbook
  • 32. Chi-squared test of homogeneity
       – Tests whether two or more populations are homogeneous (similar) with regard to a certain characteristic.
       – H0: The proportions of elements with a certain characteristic in two or more different populations are the same.
       – H1: The proportions of elements with a certain characteristic in two or more different populations are not the same.
       – The rest of the test is the same as the test for independence.
  • 33. An immigration attorney was investigating which industries to target for obtaining new clients who might have problems with a change in the immigration laws. The lawyer selected five industries; twenty workers were randomly selected in each industry and their visa statuses were verified.

                            INDUSTRY
       VISA STATUS          A    B    C    D    E
       Illegal resident     8    10   5    10   1
       Legal resident       4    2    6    4    9
       SA citizen           8    8    9    6    10

       Test at a 1% level of significance whether the 5 industries are homogeneous with respect to the visa status of their workers.
  • 34. Observed frequencies, with expected frequencies in brackets

                            Industry
       Visa status          A         B          C         D          E          Total
       Illegal resident     8 (6,8)   10 (6,8)   5 (6,8)   10 (6,8)   1 (6,8)    34
       Legal resident       4 (5)     2 (5)      6 (5)     4 (5)      9 (5)      25
       SA citizen           8 (8,2)   8 (8,2)    9 (8,2)   6 (8,2)    10 (8,2)   41
       Total                20        20         20        20         20         100

       H0: The five industries are homogeneous with respect to the visa status of their workers.
       H1: The five industries are heterogeneous with respect to the visa status of their workers.
       α = 0,01
       Test statistic:
       $\chi^2 = \sum \frac{(f - e)^2}{e} = \frac{(8-6{,}8)^2}{6{,}8} + \dots + \frac{(10-8{,}2)^2}{8{,}2} = 15{,}32$
       Critical value: $\chi^2_{(r-1)(c-1);\,1-\alpha} = \chi^2_{8;\,0{,}99} = 20{,}09$

       Since 15,32 < 20,09, do not reject H0. The five industries are homogeneous with respect to the visa status of their workers.
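
Since the homogeneity test uses the same calculation as the test for independence, the visa-status figures can be checked with a few lines of plain Python. This is a sketch only: the variable names are illustrative assumptions, and the critical value 20,09 is copied from a chi-square table (8 degrees of freedom, α = 0,01) rather than computed.

```python
# Observed counts per visa status across industries A to E
observed = {
    "Illegal resident": [8, 10, 5, 10, 1],
    "Legal resident": [4, 2, 6, 4, 9],
    "SA citizen": [8, 8, 9, 6, 10],
}

# Sample size per industry (20 each) and overall sample size
column_totals = [sum(col) for col in zip(*observed.values())]
n = sum(column_totals)

statistic = 0.0
for status, counts in observed.items():
    row_total = sum(counts)
    for f, c in zip(counts, column_totals):
        # Expected frequency: (row total * column total) / n
        e = row_total * c / n
        statistic += (f - e) ** 2 / e

# Degrees of freedom: (rows - 1) * (columns - 1)
df = (len(observed) - 1) * (len(column_totals) - 1)

# Table value for 8 degrees of freedom at alpha = 0.01 (assumed, not computed)
CRITICAL_VALUE = 20.09

reject_h0 = statistic > CRITICAL_VALUE
print(f"chi-square = {statistic:.2f}, df = {df}, reject H0: {reject_h0}")
```

With equal sample sizes per industry, each expected frequency reduces to the row total divided by the number of industries, which is why every cell in a row shares the same expected value.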
  • 35. CLASSWORK/HOMEWORK
       1. Activity 1, 2, 3, 4 – p168 – 174, Module Manual
       2. Revision exercise 1, 2, 3, 4 – p174 – 176, Module Manual
       3. Self Review Test – 1 – 4, p368, textbook
       4. Supplementary Exercises 1 – 11, p370, textbook