2. Hypothesis
in statistics, is a claim or statement about
a property of a population
Hypothesis Testing
is to test the claim or statement
Example: A conjecture is made that “the
average starting salary for computer
science graduates is $30,000 per year”.
5. Question:
How can we justify/test this conjecture?
A. What do we need to know to justify
this conjecture?
B. Based on what we know, how should
we justify this conjecture?
6. Answer to A:
Randomly select, say 100, computer
science graduates and find out their
annual salaries
---- We need to have some sample
observations, i.e., a sample set!
7. Answer to B:
That is what we will learn in this
chapter
---- Make conclusions based on the
sample observations
8. Statistical Reasoning
Analyze the sample set in an attempt to
distinguish between results that can
easily occur and results that are highly
unlikely.
9. Statistical Decisions
Decisions about populations on the basis of sample information.
Ex) We may wish to decide on the basis of sample data whether a new
serum is really effective in curing a disease, or whether one educational
procedure is better than another
10. Definitions
Null Hypothesis (denoted H0):
is the statement being tested in a
test of hypothesis.
Alternative Hypothesis (H1):
is what is believed to be true if the
null hypothesis is false.
11. Null Hypothesis: H0
Must contain condition of equality
=, ≥, or ≤
Test the Null Hypothesis directly
Reject H0 or fail to reject H0
12. Alternative Hypothesis: H1
Must be true if H0 is false
≠, <, >
‘opposite’ of Null
Example:
H0 : µ = 30 versus H1 : µ > 30
13. Statistical Hypotheses and
Null Hypotheses
•Statistical hypotheses: Assumptions or guesses about the populations
involved. (Such assumptions may or may not be true.)
•Null hypotheses (H0): Hypothesis that there is no difference between the
procedures. We formulate it if we want to decide whether one procedure is
better than another.
•Alternative hypotheses (H1): Any hypothesis that differs from a given null
hypothesis
Example 1. If the null hypothesis is p = 0.5, possible
alternative hypotheses are p = 0.7 or p ≠ 0.5.
14. Concepts of Hypothesis Testing
(1)…
• The two hypotheses are called the null hypothesis and
the alternative (or research) hypothesis. The
usual notation is:
• H0: the ‘null’ hypothesis (pronounced “H nought”)
• H1: the ‘alternative’ or ‘research’ hypothesis
• The null hypothesis (H0) will always state that the
parameter equals the value specified in the alternative
hypothesis (H1)
15. Stating Your Own Hypothesis
If you wish to support your claim, the
claim must be stated so that it becomes
the alternative hypothesis.
16. Important Notes:
H0 must always contain equality; however some
claims are not stated using equality. Therefore
sometimes the claim and H0 will not be the
same.
Ideally, all claims should be stated so that they are the
Null Hypothesis, so that the most serious error
would be a Type I error.
17. Tests of Hypotheses and Significance
“Significant”: If on the supposition that a particular hypothesis is true we
find that results observed in a random sample differ markedly from those
expected under the hypothesis on the basis of pure chance using
sampling theory, we would say that the observed differences are
significant
•We would be inclined to reject the hypothesis if the observed differences
are significant.
• Tests of hypotheses, tests of significance, or decision rules: Procedures
that enable us to decide whether to accept or reject hypotheses or to
determine whether observed samples differ significantly from expected
results
18. Type I Error
The mistake of rejecting the null hypothesis
when it is true.
The probability of doing this is called the
significance level, denoted by α (alpha).
Common choices for α: 0.05 and 0.01
Example: rejecting a perfectly good parachute
and refusing to jump
19. Type II Error
The mistake of failing to reject the null
hypothesis when it is false.
The probability of doing this is denoted by β (beta).
Example: failing to reject a defective
parachute and jumping out of a
plane with it.
20. Table 7-2 Type I and Type II Errors
                                          True State of Nature
Decision                                  The null hypothesis is true                         The null hypothesis is false
We decide to reject the null hypothesis   Type I error (rejecting a true null hypothesis)     Correct decision
We fail to reject the null hypothesis     Correct decision                                    Type II error (failing to reject a false null hypothesis)
28. Type I and Type II Errors
•Type I error: If we reject a hypothesis when it happens to be true.
•Type II error: If we accept a hypothesis when it should be rejected.
•In order for any tests of hypotheses or decision rules to be good, they
must be designed so as to minimize errors of decision.
• An attempt to decrease one type of error is accompanied in general by an
increase in the other type of error. The only way to reduce both types of
error is to increase the sample size, which may or may not be possible.
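The trade-off between the two error types can be sketched numerically. The example below is only an illustration under stated assumptions: a right-tailed one-sample z test with a known σ, and a hypothetical true difference (`effect`) from the H0 value. The function name and all numbers are made up for the sketch.

```python
from statistics import NormalDist

def type_ii_error(alpha, effect, sigma, n):
    """Beta for a right-tailed one-sample z test (hypothetical setup).

    `effect` is the assumed true difference between the real mean and the H0 mean.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)   # critical z score for the chosen alpha
    shift = effect / (sigma / n ** 0.5)      # true difference in standard-error units
    return std_normal.cdf(z_crit - shift)    # P(fail to reject H0 | H0 is false)

# Lowering alpha raises beta; a larger sample lowers beta at the same alpha.
for alpha in (0.05, 0.01):
    for n in (25, 100):
        beta = type_ii_error(alpha, effect=2.0, sigma=10.0, n=n)
        print(f"alpha={alpha:.2f}  n={n:3d}  beta={beta:.3f}")
```

With the sample size held fixed, the smaller α gives the larger β; with α held fixed, the larger sample gives the smaller β.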
29. Significant Differences
Hypothesis testing is designed to detect
significant differences: differences that did not
occur by random chance.
In the “one sample” case: we compare a random
sample (from a large group) to a population.
We compare a sample statistic to a population
parameter to see if there is a significant
difference.
30. Level of Significance
• Level of significance: In testing a given hypothesis, the maximum
probability with which we would be willing to risk a Type I error is called the
level of significance
31. Level of Significance
•In practice a level of significance of 0.05 or 0.01 is customary, although
other values are used.
• If for example a 0.05 or 5% level of significance is chosen in designing a
test of a hypothesis, then there are about 5 chances in 100 that we would
reject the hypothesis when it should be accepted, i.e., whenever the null
hypothesis is true, we are about 95% confident that we would make the
right decision. In such cases we say that the hypothesis has been rejected
at a 0.05 level of significance, which means that we could be wrong with
probability 0.05.
33. Definition
Critical Region :
is the set of all values of the test statistic
that would cause a rejection of the null
hypothesis
36. Critical Region
[Figure: distribution curve with the critical region(s) marked]
37. Definition
Critical Value:
is the value(s) that separates the critical
region from the values that would not lead
to a rejection of H0
39. Critical Value
[Figure: the critical value (a z score) marks the boundary between the “Reject H0” region and the “Fail to reject H0” region]
40. Tests Involving the Normal Distribution
- Level of significance: 0.05
• The critical region (or region of rejection of the hypothesis, or the region
of significance): the set of z scores outside the range -1.96 to 1.96.
• The region of acceptance of the hypothesis (or the region of
nonsignificance): the set of z scores inside the range -1.96 to 1.96.
41. Tests Involving the Normal Distribution
• Decision Rule
• When the level of significance is 0.01, the value 2.58 should be used instead of 1.96.
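The values 1.96 and 2.58 can be recovered from the standard normal distribution. This is a minimal sketch using Python's standard library; the helper name `two_tailed_critical_z` is made up for illustration.

```python
from statistics import NormalDist

def two_tailed_critical_z(alpha):
    """Critical z score that leaves alpha / 2 in each tail of the standard normal curve."""
    return NormalDist().inv_cdf(1 - alpha / 2)

for alpha in (0.05, 0.01):
    print(f"alpha={alpha}: critical z = ±{two_tailed_critical_z(alpha):.2f}")
# alpha=0.05: critical z = ±1.96
# alpha=0.01: critical z = ±2.58
```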
52. Two-tailed Test
H0: µ = 200
H1: µ ≠ 200 (means less than or greater than 200)
α is divided equally between the two tails of the critical region.
[Figure: curve centered at 200, with “Reject H0” in both tails and “Fail to reject H0” in the middle; values in the tails differ significantly from 200]
54. One-Tailed and Two-Tailed Tests
•Two-tailed tests or two-sided tests: When we display interest in extreme
values of the statistic S or its corresponding z score on both sides of the
mean, i.e., in both tails of the distribution.
• One-tailed tests or one-sided tests: When we are interested only in
extreme values to one side of the mean, i.e., in one tail of the distribution,
as, for example, when we are testing the hypothesis that one process is
better than another (which is different from testing whether one process is
better or worse than the other).
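The difference between one-tailed and two-tailed tests shows up in where the critical region sits. The sketch below assumes a z test statistic; the function `rejects_h0` and the sample z score of 1.8 are hypothetical, chosen only to show that the same result can be significant in a one-tailed test but not in a two-tailed one.

```python
from statistics import NormalDist

def rejects_h0(z, alpha=0.05, tail="two"):
    """Return True if the z score falls in the critical region for the given tail."""
    std_normal = NormalDist()
    if tail == "two":
        return abs(z) > std_normal.inv_cdf(1 - alpha / 2)   # alpha split across both tails
    if tail == "right":
        return z > std_normal.inv_cdf(1 - alpha)            # all of alpha in the upper tail
    if tail == "left":
        return z < std_normal.inv_cdf(alpha)                # all of alpha in the lower tail
    raise ValueError("tail must be 'two', 'right', or 'left'")

z = 1.8  # hypothetical obtained z score
for tail in ("two", "right", "left"):
    print(tail, rejects_h0(z, alpha=0.05, tail=tail))
```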
55. P Value
• The null hypothesis H0 will be an assertion that a population
parameter has a specific value, and the alternative hypothesis H1 will be
one of the following assertions:
(i) The parameter is greater than the stated value (right-tailed test).
(ii) The parameter is less than the stated value (left-tailed test).
(iii) The parameter is either greater than or less than the stated value (two-
tailed test).
• P value of the test: The probability that a value of S in the direction(s) of
H1 and as extreme as the one that actually did occur would occur if H0
were true.
60. P Value
•Small P values provide evidence for rejecting the null hypothesis in favor of
the alternative hypothesis, and large P values provide evidence for not
rejecting the null hypothesis in favor of the alternative hypothesis.
•The P value and the level of significance do not provide criteria for
rejecting or not rejecting the null hypothesis by itself, but for rejecting or
not rejecting the null hypothesis in favor of the alternative hypothesis.
• When the test statistic S is the standard normal random variable, the
table in Appendix C is sufficient to compute the P value, but when S is one
of the t, F, or chi-square random variables, all of which have different
distributions depending on their degrees of freedom, either computer
software or more extensive tables will be needed to compute the P value.
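For the standard normal case, the P value can also be computed directly instead of read from a table. This is a sketch under the assumption that the test statistic is a z score; the function name `p_value` is made up for illustration.

```python
from statistics import NormalDist

def p_value(z, tail="two"):
    """P value of a z score for a right-tailed, left-tailed, or two-tailed test."""
    std_normal = NormalDist()
    if tail == "right":
        return 1 - std_normal.cdf(z)               # area above z
    if tail == "left":
        return std_normal.cdf(z)                   # area below z
    if tail == "two":
        return 2 * (1 - std_normal.cdf(abs(z)))    # area beyond |z| in both tails
    raise ValueError("tail must be 'two', 'right', or 'left'")

print(round(p_value(1.96, "two"), 3))    # about 0.05
print(round(p_value(1.96, "right"), 3))  # about 0.025
```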
65. Our Problem:
The education department at a university has been
accused of “grade inflation” so education majors
have much higher GPAs than students in general.
GPAs of all education majors should be compared
with the GPAs of all students.
There are 1000s of education majors, far too many to
interview.
How can this be investigated without interviewing all
education majors?
66. What we know:
The average GPA for all students is 2.70 (µ = 2.70).
This value is a parameter.
Statistical information for a random sample of education majors:
X̄ = 3.00
s = 0.70
N = 117
67. Questions to ask:
Is there a difference between the parameter
(2.70) and the statistic (3.00)?
Could the observed difference have been
caused by random chance?
Is the difference real (significant)?
68. Two Possibilities:
1. The sample mean (3.00) is the same as
the pop. mean (2.70).
The difference is trivial and caused by
random chance.
2. The difference is real (significant).
Education majors are different from all
students.
69. The Null and Alternative Hypotheses:
1. Null Hypothesis (H0)
The difference is caused by random chance.
The H0 always states there is “no significant difference.” In
this case, we mean that there is no significant difference
between the population mean and the sample mean.
2. Alternative hypothesis (H1)
“The difference is real”.
(H1) always contradicts the H0.
One (and only one) of these explanations must be true.
Which one?
70. Test the Explanations
We always test the Null Hypothesis.
Assuming that the H0 is true:
What is the probability of getting the sample
mean (3.00) if the H0 is true and all education
majors really have a mean of 2.70? In other
words, the difference between the means is
due to random chance.
If the probability associated with this difference
is less than 0.05, reject the null hypothesis.
71. Test the Hypotheses
Use the .05 value as a guideline to identify differences
that would be rare or extremely unlikely if H0 is true.
This “alpha” value delineates the “region of rejection.”
Use the Z score formula for single samples and
Appendix A to determine the probability of getting the
observed difference.
If the probability is less than .05, the calculated or
“observed” Z score will be beyond ±1.96 (the “critical”
Z score).
72. Two-tailed Hypothesis Test:
[Figure: normal curve with critical values Z = -1.96 and Z = +1.96; area C lies in each tail]
When α = .05, .025 of the area lies in each tail of the curve (area C).
The .95 in the middle section represents no significant
difference between the population and the sample mean.
The cut-off between the middle section and +/- .025 is
represented by a Z-value of +/- 1.96.
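The areas quoted above can be checked against the standard normal distribution. This is a minimal sketch with Python's standard library; the variable names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()
lower_tail = std_normal.cdf(-1.96)                       # area below Z = -1.96
upper_tail = 1 - std_normal.cdf(1.96)                    # area above Z = +1.96
middle = std_normal.cdf(1.96) - std_normal.cdf(-1.96)    # area between the cut-offs

print(round(lower_tail, 3), round(upper_tail, 3), round(middle, 2))
# 0.025 0.025 0.95
```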
73. Testing Hypotheses:
Using The Five Step Model…
1. Make Assumptions and meet test
requirements.
2. State the null hypothesis.
3. Select the sampling distribution and
establish the critical region.
4. Compute the test statistic.
5. Make a decision and interpret results.
74. Step 1: Make Assumptions and Meet
Test Requirements
Random sampling
Hypothesis testing assumes samples were selected using
random sampling.
In this case, the sample of 117 cases was randomly selected
from all education majors.
Level of Measurement is Interval-Ratio
GPA is I-R so the mean is an appropriate statistic.
Sampling Distribution is normal in shape
This is a “large” sample (N≥100).
75. Step 2 State the Null Hypothesis
H0: μ = 2.7 (in other words, H0: X̄ = μ)
You can also state H0: No difference between the sample
mean and the population parameter
(In other words, the sample mean of 3.0 is really the same as
the population mean of 2.7 – the difference is not real but
is due to chance.)
The sample of 117 comes from a population that has a
GPA of 2.7.
The difference between 2.7 and 3.0 is trivial and caused by
random chance.
76. Step 2 (cont.) State the Alternate Hypothesis
H1: μ ≠ 2.7 (or, H1: X̄ ≠ μ)
Or H1: There is a difference between the sample mean and
the population parameter
The sample of 117 comes from a population that does not have
a GPA of 2.7. In reality, it comes from a different population.
The difference between 2.7 and 3.0 reflects an actual
difference between education majors and other students.
Note that we are testing whether the sample comes from a
different population or from the same population as
the general student population.
77. Step 3 Select Sampling Distribution and
Establish the Critical Region
Sampling Distribution= Z
Alpha (α) = .05
α is the indicator of “rare” events.
Any difference with a probability less than α
is rare and will cause us to reject the H0.
78. Step 3 (cont.) Select Sampling Distribution
and Establish the Critical Region
Critical Region begins at Z= ± 1.96
This is the critical Z score associated
with α = .05, two-tailed test.
If the obtained Z score falls in the Critical
Region, or “the region of rejection,” then
we would reject the H0.
79. Step 4: Use Formula to Compute the Test
Statistic (Z for large samples, N ≥ 100)
$$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{N}}$$
80. When the Population σ is not known,
use the following formula:
$$Z = \frac{\bar{X} - \mu}{s / \sqrt{N - 1}}$$
81. Test the Hypotheses
$$Z = \frac{\bar{X} - \mu}{s / \sqrt{N - 1}} = \frac{3.0 - 2.7}{0.7 / \sqrt{117 - 1}} = 4.62$$
We can substitute the sample standard deviation
S for σ (pop. s.d.) and correct for bias by
substituting N-1 in the denominator.
Substituting the values into the formula, we
calculate a Z score of 4.62.
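The calculation can be reproduced in a few lines. The sketch below uses the slide's formula with the sample standard deviation s and N − 1 in the denominator; the function name `z_obtained` is made up for illustration.

```python
from math import sqrt

def z_obtained(sample_mean, pop_mean, s, n):
    """Z score for a single sample mean, using s and N - 1 as in the slide's formula."""
    standard_error = s / sqrt(n - 1)
    return (sample_mean - pop_mean) / standard_error

# Values from the education-majors example: X̄ = 3.00, µ = 2.70, s = 0.70, N = 117
z = z_obtained(sample_mean=3.00, pop_mean=2.70, s=0.70, n=117)
print(round(z, 2))  # 4.62
```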
82. Step 5 Make a Decision and Interpret
Results
The obtained Z score fell in the Critical Region, so we reject
the H0.
If the H0 were true, a sample outcome of 3.00 would be
unlikely.
Therefore, the evidence is against the H0 and it must be rejected.
Education majors have a GPA that is significantly different
from the general student body (Z = 4.62, α = .05).*
*Note: Always report significant statistics.
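The decision in Step 5 can be sketched as a comparison of the obtained Z score with the critical Z score. The two-tailed P value is added here as an extra check that the slides do not compute; the variable names are illustrative.

```python
from statistics import NormalDist

z_score = 4.62   # obtained Z score from Step 4
alpha = 0.05     # significance level chosen in Step 3

std_normal = NormalDist()
z_critical = std_normal.inv_cdf(1 - alpha / 2)         # about 1.96 for a two-tailed test
p_two_tailed = 2 * (1 - std_normal.cdf(abs(z_score)))  # probability of a Z this extreme if H0 is true

reject_h0 = abs(z_score) > z_critical
print(f"critical Z = ±{z_critical:.2f}, reject H0: {reject_h0}, p = {p_two_tailed:.2g}")
```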
83. Looking at the curve:
(Area C = Critical Region when α = .05)
[Figure: normal curve with critical values Z = -1.96 and Z = +1.96; the obtained Z = +4.62 falls in the upper critical region]
84. Summary:
The GPA of education majors is significantly
different from the GPA of the general student body.
In hypothesis testing, we try to identify statistically
significant differences that did not occur by random
chance.
In this example, the difference between the
parameter 2.70 and the statistic 3.00 was large and
unlikely (p < .05) to have occurred by random
chance.
85. Summary (cont.)
We rejected the H0 and concluded that the
difference was significant.
It is very likely that Education majors have
GPAs higher than the general student body
101. Using the Student’s t Distribution for
Small Samples (One Sample T-Test)
When the sample size is small
(approximately < 100) then the Student’s t
distribution should be used (see Appendix B)
The test statistic is known as “t”.
The curve of the t distribution is flatter than
that of the Z distribution but as the sample
size increases, the t-curve starts to resemble
the Z-curve (see text p. 230 for illustration)
102. Degrees of Freedom
The curve of the t distribution varies with
sample size (the smaller the size, the flatter
the curve)
In using the t-table, we use “degrees of
freedom” based on the sample size.
For a one-sample test, df = N – 1.
When looking at the table, find the t-value for
the appropriate df = N-1. This will be the
cutoff point for your critical region.
104. Example
A random sample of 26 sociology
graduates had a mean score of 458 on the GRE
advanced sociology test, with a standard
deviation of 20. Is this significantly
different from the population average
(µ = 440)?
105. Solution (using five step model)
Step 1: Make Assumptions and Meet Test
Requirements:
1. Random sample
2. Level of measurement is interval-ratio
3. The sample is small (<100)
106. Solution (cont.)
Step 2: State the null and alternate hypotheses.
H0: µ = 440 (or H0: X̄ = μ)
H1: µ ≠ 440
107. Solution (cont.)
Step 3: Select Sampling Distribution and
Establish the Critical Region
1. Small sample, I-R level, so use t
distribution.
2. Alpha (α) = .05
3. Degrees of Freedom = N-1 = 26-1 = 25
4. Critical t = ±2.060
108. Solution (cont.)
Step 4: Use Formula to Compute the Test Statistic
$$t = \frac{\bar{X} - \mu}{S / \sqrt{N - 1}} = \frac{458 - 440}{20 / \sqrt{26 - 1}} = 4.5$$
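The same calculation as a short sketch. The critical value 2.060 is taken from the slides (t table, df = 25, α = .05, two-tailed) rather than computed, since Python's standard library has no t distribution; the function name `t_obtained` is made up for illustration.

```python
from math import sqrt

def t_obtained(sample_mean, pop_mean, s, n):
    """t score for a single sample mean, using S and N - 1 as in the slide's formula."""
    return (sample_mean - pop_mean) / (s / sqrt(n - 1))

# Values from the GRE example: X̄ = 458, µ = 440, S = 20, N = 26
t = t_obtained(sample_mean=458, pop_mean=440, s=20, n=26)
degrees_of_freedom = 26 - 1
t_critical = 2.060  # from the t table for df = 25, alpha = .05, two-tailed

print(t, degrees_of_freedom, abs(t) > t_critical)  # 4.5 25 True
```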
109. Looking at the curve for the t distribution
Alpha (α) = .05
[Figure: t curve with critical values t = -2.060 and t = +2.060; the obtained t = +4.50 falls in the upper critical region]
110. Step 5 Make a Decision and Interpret
Results
The obtained t score fell in the Critical Region, so
we reject the H0 (t (obtained) > t (critical)).
If the H0 were true, a sample outcome of 458
would be unlikely.
Therefore, the evidence is against the H0 and it must be rejected.
Sociology graduates have a GRE score that is
significantly different from the general student body
(t = 4.5, df = 25, α = .05).
111. Testing Sample Proportions:
When your variable is at the nominal (or
ordinal) level the one sample z-test for
proportions should be used.
If the data are in % format, convert to a
proportion first.
The method is the same as the one sample
Z-test for means (see above)
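A minimal sketch of the one-sample test for proportions, assuming the usual form Z = (Ps − Pu) / sqrt(Pu(1 − Pu) / N), which the slides do not spell out. Here Ps is the sample proportion and Pu is the population proportion under H0. The function name and all numbers are hypothetical.

```python
from math import sqrt

def z_for_proportion(sample_prop, pop_prop, n):
    """Z score for a single sample proportion (assumed standard formula)."""
    standard_error = sqrt(pop_prop * (1 - pop_prop) / n)  # standard error under H0
    return (sample_prop - pop_prop) / standard_error

# Hypothetical data: 60% of a random sample of 100, tested against H0: Pu = 0.50
sample_prop = 60 / 100  # convert the percentage to a proportion first
z_prop = z_for_proportion(sample_prop, pop_prop=0.50, n=100)
print(round(z_prop, 2), abs(z_prop) > 1.96)  # compare with the critical Z for alpha = .05, two-tailed
```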