Basics of Hypothesis Testing

Elementary Statistics
Chapter 8: Hypothesis
Testing
8.1 Basics of Hypothesis
Testing
1

Chapter 8: Hypothesis Testing
8.1 Basics of Hypothesis Testing
8.2 Testing a Claim about a Proportion
8.3 Testing a Claim About a Mean
8.4 Testing a Claim About a Standard Deviation or Variance
2
Objectives:
• Understand the definitions used in hypothesis testing.
• State the null and alternative hypotheses.
• State the steps used in hypothesis testing.
• Test proportions, using the z test.
• Test means when  is known, using the z test.
• Test means when  is unknown, using the t test.
• Test variances or standard deviations, using the chi-square test.
• Test hypotheses, using confidence intervals.

Key Concept: Key components of a formal hypothesis test. The concepts in this section are general and
apply to hypothesis tests involving proportions, means, or standard deviations or variances.
Hypothesis: In statistics, a hypothesis is a claim or statement about a property of a population.
Hypothesis Test: A hypothesis test (or test of significance) is a procedure for testing a claim about a
property of a population.
Researchers are interested in answering many types of questions. For example,
 Is the earth warming up?
 Does a new medication lower blood pressure?
 Does the public prefer a certain color in a new fashion line?
 Is a new teaching technique better than a traditional one?
 Do seat belts reduce the severity of injuries?
These types of questions can be addressed through statistical hypothesis testing, which is a decision-
making process for evaluating claims about a population.
3

2000 consumers were asked if they are comfortable with having drones deliver
their purchases, and 54% (or 1080) of them responded with “no.” Using p to
denote the proportion of consumers not comfortable with drone deliveries, the
“majority” claim is equivalent to the claim that the proportion is greater than
half, or p > 0.5. The expression p > 0.5 is the symbolic form of the original
claim.
Example 1
The Claim: The population proportion p: p > 0.5.
Among 2000 consumers, how many do we need to get a significantly high number
who are not comfortable with drone delivery?
A result of 1020 (or 51%) is just barely more than half, so 1020 is clearly not
significantly high.
A result of 1900 (or 95%) is clearly significantly high.
What about the result of 1080 (or 54.0%)?
Is 1080 (or 54.0%) significantly high? The method of hypothesis testing allows us to
answer that key question.
4

A statistical hypothesis is a conjecture (assumption) about a population
parameter. This conjecture may or may not be true.
The null hypothesis, symbolized by H0, is a statistical hypothesis that states that
there is no difference between a parameter and a specific value, or that there is
no difference between two parameters.
(It states that the value of a population parameter such as proportion, mean, or
standard deviation is equal to some claimed value.)
The alternative hypothesis, symbolized by H1 , or Ha ,or HA , is a statistical
hypothesis that states the existence of a difference between a parameter and a
specific value, or states that there is a difference between two parameters.
(It states that the parameter has a value that somehow differs from the null
hypothesis. the symbolic form of the alternative hypothesis: <, >, ≠)
5

The critical value, C.V., separates the critical region
from the noncritical region.
The critical or rejection region is the range of values of
the test value that indicates that there is a significant
difference and that the null hypothesis should be
rejected.
The noncritical or nonrejection region is the range of
values of the test value that indicates that the difference
was probably due to chance and that the null hypothesis
should not be rejected.
Two-tailed test: The critical region is in the two extreme
regions (tails) under the curve.
Left-tailed test: The critical region is in the extreme left
region (tail) under the curve.
Right-tailed test: The critical region is in the extreme right
region (tail) under the curve.
6
8.1 Basics of Hypothesis Testing, Two-Tailed (2TT), Left-Tailed (LTT), Right-Tailed (RTT)

a. A researcher is interested in
finding out whether a new
medication will have a side
effect on the pulse rate of the
patients who take the
medication. Will the pulse
rate increase, decrease, or
remain unchanged after a
patient takes the medication?
The researcher knows that the
mean pulse rate for the
population under study is 80
beats per minute.
Example 2: Identify the null and alternative Hypotheses.
H0 : 𝜇 = 80 H1 : 𝜇 ≠ 80
This is called a two-tailed
(2TT) hypothesis test.
7
b. A chemist invents a
chemical to increase
the life of an
automobile battery.
The mean lifetime of
the automobile
battery without the
additive is 34 months.
H0 : 𝜇 = 34 H1 : 𝜇 > 34
This is called a Right-tailed
(RTT) hypothesis test.
c. A contractor wishes
to lower heating bills
by using a special
type of insulation in
houses. If the average
of the monthly
heating bills is $70,
her hypotheses about
heating costs with the
use of insulation are
H0 : 𝜇 = 70 H1 : 𝜇 < 70
This is called a Left-tailed
(LTT) hypothesis test.

1. The traditional method (Critical Value Method) (CV)
The Critical Value-Method, separates the critical region from the noncritical region.
2. The P-value method
P-Value Method: In a hypothesis test, the P-value is the probability of getting a value of the test statistic
that is at least as extreme as the test statistic obtained from the sample data, assuming that the null hypothesis
is true. (the P-value of a test is the probability that the value you see could have happened because of
sampling error if H0 is true.)
3. The confidence interval (CI)method
Because a confidence interval estimate of a population parameter contains the likely values of that parameter,
reject a claim that the population parameter has a value that is not included in the confidence interval.
Equivalent Methods: A confidence interval estimate of a proportion might lead to a conclusion different
from that of a hypothesis test.
8.1 Basics of Hypothesis Testing: Three methods used to test hypotheses:
8
Construct a confidence interval with
a confidence level selected:
Significance Level for
Hypothesis Test: α
Two-Tailed Test:
1 – α
One-Tailed
Test: 1 – 2α
0.01 99% 98%
0.05 95% 90%
0.10 90% 80%

In some texts Claim: Use the Original Claim to Create a Null Hypothesis H0 and
an Alternative Hypothesis H1
When a researcher conducts a study, he or she is generally looking for evidence to
support a claim. A claim can be stated as either the null hypothesis or the
alternative hypothesis.
After stating the hypotheses, the researcher’s next step is to design the study:
a. The researcher selects the correct statistical test: (such as z, t, χ²).
b. chooses an appropriate level of significance: α
c. and formulates a plan for conducting the study: CV method, P-value method, CI
(Confidence Interval) method
A statistical test uses the data obtained from a sample to make a decision about
whether the null hypothesis should be rejected. The numerical value obtained from a
statistical test is called the test statistic (value).
The null hypothesis may or may not be true, and a decision is made to reject or not to
reject it on the basis of the data obtained from a sample.
9

a. Finding the Critical Value for α = 0.01
(Right-Tailed Test)
b. Finding the Critical Value for α = 0.01
(Left-Tailed Test)
c. Finding the Critical Value for α = 0.01
(Two-Tailed Test)
Example 3: z-table
10
z = 2.33 for α = 0.01 (RTT)
Because of
symmetry,
z = –2.33 for
α = 0.01
(LTT)
z = ±2.575

Find the critical value(s) for each situation and draw
the appropriate figure, showing the critical region.
a. A left-tailed test with α = 0.10.
b. A two-tailed test with α = 0.02.
c. A right-tailed test with α = 0.005.
Example 4: z-table
Solution: z = –1.28
11
Solution: z = 2.575 or 2.58
Solution: z = ± 2.33

Type I and Type II Errors
Type I error (False positive): The mistake of rejecting the null hypothesis when it is actually true. The
symbol α (alpha) is used to represent the probability of a type I error. (A type I error occurs if one
rejects the null hypothesis when it is true.)
The level of significance is the maximum probability of committing a type I error: α = P(type I
error) = P(rejecting H0 when H0 is true) and Typical significance levels are: 0.10, 0.05, and 0.01
For example, when a = 0.10, there is a 10% chance of rejecting a true null hypothesis.
Type II error (False negative): The mistake of failing to reject the null hypothesis when it is actually
false. The symbol β (beta) is used to represent the probability of a type II error. (A type II error occurs
if one does not reject the null hypothesis when it is false.) β = P(type II error) = P(failing to reject H0
when H0 is false)
12
Preliminary
Conclusion
True State of Nature
Null hypothesis is true
True State of Nature
Null hypothesis is false
Reject H0 Type I error (False Positive):
Reject a true H0.
P(type I error) = α
Correct decision
(Power of the test)
1 ‒ β
Fail to
Reject H0
Correct decision
1 ‒ α
Type II error (False Negative):
Fail to reject a false H0.
P(type II error) = β

Example 5: Describing Type I and Type II Errors
Consider the claim that a medical procedure designed to increase the likelihood of a baby girl is effective, so that
the probability of a baby girl is p > 0.5. Given the following null and alternative hypotheses, write statements
describing a) type I error, and (b) a type II error.
(HINT FOR DESCRIBING TYPE I AND TYPE II ERRORS: Descriptions of a type I error and a
type II error refer to the null hypothesis being true or false, but when wording a statement representing
a type I error or a type II error, be sure that the conclusion addresses the original claim (which may
or may not be the null hypothesis).
H0: p = 0.5
H1: p > 0.5 (original claim that will be addressed in the final conclusion)
13
Solution
a. Type I Error: A type I error is the mistake of rejecting a true null hypothesis, so
the following is a type I error: In reality p = 0.5, but sample evidence leads us to
conclude that p > 0.5. (In this case, a type I error is to conclude that the medical
procedure is effective when in reality it has no effect.)
b. Type II Error: A type II error is the mistake of failing to reject the null hypothesis
when it is false, so the following is a type II error: In reality p > 0.5, but we fail to
support that conclusion. (In this case, a type II error is to conclude that the
medical procedure has no effect, when it really is effective in increasing the
likelihood of a baby girl.)

Critical Value (CV) : Procedure for Hypothesis Tests P-value
14
Step 1 State the null and alternative
hypotheses and identify the claim (H0 , H1).
Step 2 Test Statistic (TS): Compute
the test statistic value that is relevant to
the test and determine its sampling
distribution (such as normal, t, χ²).
Step 3 Critical Value (CV) :
Find the critical value(s) from the
appropriate table.
Step 4 Make the decision to
a. Reject or not reject the null
hypothesis.
b. The claim is true or false
c. Restate this decision: There is /
is not sufficient evidence to
support the claim that…
Step 1 State the null and alternative
hypotheses and identify the claim (H0 , H1).
Step 2 Test Statistic (TS): Compute
the test statistic value that is relevant to
the test and determine its sampling
distribution (such as normal, t, χ²).
Step 3 P-value Method:
Find the P-value from the appropriate
table.
Step 4 Make the decision to
a. Reject or not reject the null
hypothesis.
c. Restate this decision: There is /
is not sufficient evidence to
support the claim that…

15
1009 consumers were asked if they are comfortable with having drones deliver their
purchases, and 54% (or 545) of them responded with “no.” Use these results to test the
claim that most consumers are uncomfortable with drone deliveries. We interpret
“most” to mean “more than half” or “greater than 0.5.” (α = 0.05)
Example 6
Step 1:
State H0 , H1, Identify the claim & Tails
Step 2: TS
Calculate the test statistic (TS) that is
relevant to the test
Step 3: CV
Find the critical value /s using α
Step 4: Make the decision to
a. Reject or not H0
c. Restate this decision: There is / is not
sufficient evidence to support the claim
that…
Step 3: CV: α = 0.05 →CV: z = 1.645
Step 1: H0: p = 0.5, H1: p > 0.5, RTT, Claim
Given: BD, n = 1009, p = 0.5
→ q = 0.5, → np ≥ 5 and nq ≥
5 → Use ND, x = 545
𝑝 =
𝑥
𝑛
=
545
1009
= 0.540
Step 2: 𝑇𝑆: 𝑧 =
(545/1009)−0.5
0.5(0.5)
1009
𝑧 =
𝑝 − 𝑝
𝑝𝑞
𝑛
= 2.55
Step 4: Decision:
a. Reject H0
b. The claim is true
c. There is sufficient evidence to support the claim that the
majority of consumers are uncomfortable with drone deliveries.

Identify the test statistic that is relevant to the test and determine its
sampling distribution (such as z, t, χ²)
16
Parameter
Sampling
Distribution
Requirements Test Statistics
Proportion:
p
Normal (Z) 𝑛𝑝 ≥ 5, & 𝑛𝑞 ≥ 5
𝑍 =
𝑝 − 𝑝
𝑝𝑞/𝑛
Mean: 𝝁 t 𝝈 not known & Normally
Distributed Population or
𝑛 > 30
𝑡 =
𝑥 − 𝜇
𝑠/ 𝑛
Mean: 𝝁 Normal (Z) 𝝈 known & Normally
Distributed Population or
𝑛 > 30
𝑍 =
𝑥 − 𝜇
𝜎/ 𝑛
Standard
Deviation: 𝝈
Or Variance: 𝝈𝟐
𝝌𝟐 Normally Distributed
Population 𝜒2
=
(𝑛 − 1)𝑠2
𝜎2

Finding P-Value
17
P-value = probability of a test statistic at least as extreme as the one obtained
p = population proportion
𝑝: 𝑆𝑎𝑚𝑝𝑙𝑒 𝑃𝑟𝑜𝑝𝑜𝑟𝑡𝑖𝑜𝑛
Example 6 continued: RTT
Test statistic: z = 2.55
Right-tailed Test: p(z > 2.550) = 0.0054
( Normal distribution area to its right)
P-value of 0.0054.
Decision Criteria for the P-Value Method:
If P-value ≤ α, reject H0 (“If the P is low,
the null must go.”)
If P-value > α, fail to reject H0.

Example 7: Power of a Hypothesis (Time)
18
The power of a hypothesis test is the probability 1 − β of rejecting a false null
hypothesis. The value of the power is computed by using a particular significance level
α and a particular value of the population parameter that is an alternative to the value
assumed true in the null hypothesis.
Use a significance level of α = 0.05 and these particular value of p that is an alternative
to the value assumed in the null hypothesis H0: p = 0.5, to find the values of power of
the test.
p: 0.6, 0.7, 0.8, and 0.9.
Consider these preliminary results from the XSORT method of gender selection: There
were 13 girls among the 14 babies born to couples using the XSORT method. If we want
to test the claim that girls are more likely (p > 0.5) with the XSORT method, we have
the following null and alternative hypotheses:
H0: p = 0.5 H1: p > 0.5

Solution: The values of power in the following table were found by using
Minitab, and exact calculations are used instead of a normal approximation to the
binomial distribution. Specific Alternative
Value of p
β
Power of Test
= 1 − β
0.6 0.820 0.180
0.7 0.564 0.436
0.8 0.227 0.773
0.9 0.012 0.988
19
Example 7: Power of a Hypothesis
Interpretation: p = 0.7: There is a 0.436 probability of rejecting p = 0.5 when the true value of p is actually 0.7. It makes sense
that this test is more effective in rejecting the claim of p = 0.5 when the population proportion is actually 0.7 than when the
population proportion is actually 0.6.
In general, increasing the difference between the assumed parameter value and the actual parameter value results in an increase in
power.
Interpretation: p = 0.6: We see that this hypothesis test has
power of 0.180 (or 18.0%) of rejecting H0: p = 0.5 when the
population proportion p is actually 0.6. That is, if the true
population proportion is actually equal to 0.6, there is an 18.0%
chance of making the correct conclusion of rejecting the false null
hypothesis that p = 0.5. That low power of 18.0% is not so good.
Interpretation: p = 0.8: Just as 0.05 is a common choice for a significance level, a power of at least 0.80 is a common
requirement for determining that a hypothesis test is effective. (Some statisticians argue that the power should be higher, such as
0.85 or 0.90.) When designing an experiment, we might consider how much of a difference between the claimed value of a
parameter and its true value is an important amount of difference.
When designing an experiment, a goal of having a power value of at least 0.80 can often be used to determine the minimum
required sample size, as in the following example.

Example 8: Finding the Sample Size Required to Achieve 80% Power
A trial design assumed that with a 0.05 significance level, 153 randomly selected
subjects would be needed to achieve 80% power to detect a reduction in the
coronary heart disease rate from 0.5 to 0.4. From that statement, we know the
following:
• Before conducting the experiment, the researchers selected a significance level of 0.05
and a power of at least 0.80.
20
• The researchers decided that a reduction in the proportion of coronary heart disease
from 0.5 to 0.4 is an important difference that they wanted to detect.
• Using a significance level of 0.05, power of 0.80, and the alternative proportion of 0.4,
technology such as Minitab is used to find that the required minimum sample size is
153.
The researchers can then proceed by obtaining a sample of at least 153 randomly
selected subjects. Because of factors such as dropout rates, the researchers are
likely to need somewhat more than 153 subjects.

Example 1: Majority of Consumers are not Comfortable with
Drone Deliveries
Using Technology It is easy to obtain hypothesis-testing results using
technology. The accompanying screen displays show results from four
different technologies, so we can use computers or calculators to do all of
the computational heavy lifting.
21

Basics of Hypothesis Testing

Recomendados

Recomendados

Mais conteúdo relacionado

Mais procurados

Mais procurados (20)

Semelhante a Basics of Hypothesis Testing

Semelhante a Basics of Hypothesis Testing (20)

Mais de Long Beach City College

Mais de Long Beach City College (13)

Último

Último (20)

Basics of Hypothesis Testing