Hypothesis testing1

HYPOTHESIS TESTING
DR. HANAA ELSAYED BAYOMY
ASSOCIATE PROFESSOR OF COMMUNITY MEDICINE

CONTENTS
• HYPOTHESIS TESTING STEPS
• TESTING A CLAIM ABOUT A MEAN: σ KNOWN
• TESTING A CLAIM ABOUT A MEAN: σ NOT KNOWN
• TESTING A CLAIM ABOUT A PROPORTION
• TESTS OF SIGNIFICANCE: HOW TO MAKE A DECISION?

HYPOTHESIS TESTING
• Is also called significance testing
• Tests a claim about a parameter using evidence (data in a sample)

Hypothesis Testing Steps
A. Formulate null and alternative hypotheses
B. Test statistic
C. P-value and interpretation
D. Significance level (optional)

• Convert the research question to null and alternative hypotheses.
HA: Research (Alternative) Hypothesis
• What we aim to gather evidence of
• Typically that there is a difference/effect/relationship etc.
H0: Null Hypothesis
• What we assume is true to begin with
• Typically that there is no difference/effect/relationship etc.
The hypothesis testing procedure uses data from a sample to seek
evidence against H0 as a way of bolstering Ha (deduction)

Alternative Hypothesis as a Research Hypothesis
• It is the conclusion that the researcher hopes to support.
• The conclusion that the research hypothesis is true is made if the
sample data provide sufficient evidence to show that the null hypothesis
can be rejected.
• Research hypothesis: A new teaching method is believed to be better than the current method.
  Alternative hypothesis: The new teaching method is better.
  Null hypothesis: The new method is no better than the old method.
• Research hypothesis: A new drug is believed to lower blood pressure more than the existing drug.
  Alternative hypothesis: The new drug lowers blood pressure more than the existing drug.
  Null hypothesis: The new drug does not lower blood pressure more than the existing drug.

Null Hypothesis as an Assumption to be Challenged
• We might begin with an assumption that a statement about a population parameter is true.
• We then use a hypothesis test to challenge the assumption and
determine if there is statistical evidence to conclude that the assumption is
incorrect.
• In these situations, it is helpful to develop the null hypothesis first.
• Research hypothesis: We take a sample to check the claim that the average weight of all apples in an orchard is μ = 149 grams.
  Null hypothesis: The stated average weight is correct (μ ≥ 149 grams).
  Alternative hypothesis: The stated average weight is incorrect (μ < 149 grams).

Forms for Null and Alternative
Hypotheses about a Population Mean
• In general, a hypothesis test about the value of a population
mean μ must take one of the following three forms (where μ0 is
the hypothesized value of the population mean).
• One-tailed (lower-tail): H0: μ ≥ μ0, Ha: μ < μ0
• One-tailed (upper-tail): H0: μ ≤ μ0, Ha: μ > μ0
• Two-tailed: H0: μ = μ0, Ha: μ ≠ μ0

Type I Error
• Because hypothesis tests are based on sample data, we must
allow for the possibility of errors.
• A Type I error is rejecting H0 when it is true.
• The probability of making a Type I error when the null hypothesis
is true as an equality is called the level of significance.
• Applications of hypothesis testing that only control the Type I error
are often called significance tests.

Type II Error
• A Type II error is accepting H0 when it is false.
• It is difficult to control for the probability of making a Type II
error.
• Statisticians avoid the risk of making a Type II error by using “do
not reject H0” and not “accept H0”.

Type I and Type II Errors
                 POPULATION CONDITION
CONCLUSION       H0 TRUE             H0 FALSE
ACCEPT H0        CORRECT DECISION    TYPE II ERROR
REJECT H0        TYPE I ERROR        CORRECT DECISION

Controlling Type I and Type II Errors
• For any fixed α, an increase in the sample size n will
cause a decrease in β.
• For any fixed sample size n, a decrease in α will cause an
increase in β. Conversely, an increase in α will cause a
decrease in β.
• To decrease both α and β, increase the sample size.

Choose Level of Significance
Power of a Test
• It is necessary to balance the two types of errors.
• The power of a test is the probability (1 - β) of rejecting the null
hypothesis when it is false and should be rejected.
• Although β is unknown, it is related to α. An extremely low value of
α (e.g., α = 0.001) will result in an intolerably high β.
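
The trade-off between α, β and n can be checked numerically. The sketch below assumes an upper-tailed z-test with known σ. The function name `type_ii_error` and the scenario (μ0 = 170, true mean 180, σ = 40) are illustrative assumptions, not results from a study.

```python
from math import sqrt
from statistics import NormalDist


def type_ii_error(mu0, mu_true, sigma, n, alpha):
    """Beta for an upper-tailed z-test (H0: mu <= mu0) when the true mean is mu_true."""
    std_normal = NormalDist()
    se = sigma / sqrt(n)                    # standard error of the sample mean
    z_crit = std_normal.inv_cdf(1 - alpha)  # critical value for the chosen alpha
    # H0 is not rejected when z_stat < z_crit. Under mu_true the z statistic is
    # shifted by (mu_true - mu0) / se, so beta is the area below the shifted cutoff.
    return std_normal.cdf(z_crit - (mu_true - mu0) / se)


# Hypothetical scenario: mu0 = 170, true mean = 180, sigma = 40
for n, alpha in [(64, 0.05), (256, 0.05), (64, 0.001)]:
    beta = type_ii_error(170, 180, 40, n, alpha)
    print(f"n={n:3d}  alpha={alpha:<5}  beta={beta:.3f}  power={1 - beta:.3f}")
```

Under these assumptions, raising n from 64 to 256 lowers β, while tightening α from 0.05 to 0.001 at n = 64 raises it.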

Hypothesis Testing Steps
A. Null and alternative hypotheses
B. Test statistic
C. P-value and interpretation
D. Significance level (optional)

Test Statistic
• The test statistic is a value used in making a decision
about the null hypothesis.
• It is found by converting the sample statistic to a score under the
assumption that the null hypothesis is true.
• The critical region (or rejection region) is the set of all
values of the test statistic that cause us to reject the
null hypothesis.

TEST STATISTIC/CRITICAL REGION
• The test statistic measures how close the sample result comes to what H0 states.
• The test statistic often follows a well-known distribution (e.g.
normal, t, or chi-square). E.g. the z-statistic follows the normal
distribution.
[Figure: sampling distribution when H0 is true; the central 95% of values lies between the two "Reject H0" tail regions]

P-VALUE/ α-LEVEL/LEVEL OF SIGNIFICANCE

INTERPRETATION OF P-VALUE
• The significance level (denoted by α) is the probability that the test
statistic will fall in the critical region when the null hypothesis is
actually true.
• A critical value is any value that separates the critical region (where
we reject the null hypothesis) from the values of the test statistic that
do not lead to rejection of the null hypothesis.
• P > 0.10 → non-significant evidence against H0
• 0.05 < P ≤ 0.10 → marginally significant evidence
• 0.01 < P ≤ 0.05 → significant evidence against H0
• P ≤ 0.01 → highly significant evidence against H0
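
The conventions above amount to a simple lookup. A minimal sketch follows; the function name `interpret_p_value` is a hypothetical helper, and the cut-offs are the ones listed on this slide, which are conventions rather than fixed rules.

```python
def interpret_p_value(p):
    """Map a P-value to the verbal label used in the list above."""
    if not 0 <= p <= 1:
        raise ValueError("a P-value must lie between 0 and 1")
    if p > 0.10:
        return "non-significant evidence against H0"
    if p > 0.05:
        return "marginally significant evidence"
    if p > 0.01:
        return "significant evidence against H0"
    return "highly significant evidence against H0"


for p in (0.5486, 0.08, 0.03, 0.002):
    print(p, "->", interpret_p_value(p))
```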

Testing a Claim About a Mean: σ Known

Requirements for Testing Claims About a
Population Mean (with σ Known)
1. The sample is a simple random sample.
2. The value of the population standard deviation σ is
known.
3. Either or both of these conditions is satisfied: The
population is normally distributed or n > 30.

USE THE FOLLOWING TEST STATISTIC
$$z_{\text{stat}} = \frac{\bar{x} - \mu_0}{SE_{\bar{x}}}$$
where $\mu_0$ is the population mean assuming $H_0$ is true, and
$$SE_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$

EXAMPLE
1-FORMULATE HYPOTHESES
• The problem: In the 1970s, 20–29 year old men in the U.S. had a
mean μ body weight of 170 pounds. Standard deviation σ was 40
pounds. We test whether mean body weight in the population now
differs.
• Null hypothesis H0: μ = 170 (“no difference”)
• The alternative hypothesis can be either
Ha: μ > 170 (one-sided test) or
Ha: μ ≠ 170 (two-sided test)

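EXAMPLE
2- TEST STATISTIC
• μ0 = 170
• We know σ = 40
• Take a sample of n = 64. Therefore
$$SE_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{40}{\sqrt{64}} = 5$$
• If we found a sample mean of 173, then
$$z_{\text{stat}} = \frac{\bar{x} - \mu_0}{SE_{\bar{x}}} = \frac{173 - 170}{5} = 0.60$$
• If we found a sample mean of 185, then
$$z_{\text{stat}} = \frac{\bar{x} - \mu_0}{SE_{\bar{x}}} = \frac{185 - 170}{5} = 3.00$$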

CENTRAL LIMIT THEOREM
• NO MATTER THE SHAPE OF THE POPULATION, THE DISTRIBUTION OF
X-BARS TENDS TOWARD NORMALITY.

EXAMPLE
3- P-VALUE
• The P-value answers the question: What is the probability of the
observed test statistic or one more extreme when H0 is true?
• This corresponds to the area under the curve (AUC) in the tail of the
Standard Normal distribution beyond the zstat.
• Convert z statistics to P-value:
For Ha: μ > μ0 → P = Pr(Z > zstat) = right tail beyond zstat
For Ha: μ < μ0 → P = Pr(Z < zstat) = left tail beyond zstat
For Ha: μ ≠ μ0 → P = 2 × one-tailed P-value
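
The three conversion rules can be sketched with the standard library's `statistics.NormalDist`. The function name `p_value_from_z` and the labels `"greater"`, `"less"` and `"two-sided"` are illustrative choices, not notation from the slides.

```python
from statistics import NormalDist


def p_value_from_z(z_stat, alternative):
    """P-value for a z statistic; alternative is 'greater', 'less' or 'two-sided'."""
    cdf = NormalDist().cdf  # standard normal cumulative distribution function
    if alternative == "greater":    # Ha: mu > mu0, right tail
        return 1 - cdf(z_stat)
    if alternative == "less":       # Ha: mu < mu0, left tail
        return cdf(z_stat)
    if alternative == "two-sided":  # Ha: mu != mu0, double the one-tailed value
        return 2 * (1 - cdf(abs(z_stat)))
    raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")


print(round(p_value_from_z(0.6, "greater"), 4))  # 0.2743
print(round(p_value_from_z(3.0, "greater"), 4))  # 0.0013
```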

One-sided P-value for zstat of 0.6
• There is NOT sufficient
statistical evidence to infer
that mean body weight in
the population now differs.

One-sided P-value for zstat of 3.0
• There is sufficient statistical
evidence to infer that mean
body weight in the population
now differs.

Two-Sided P-Value
• One-sided Ha → AUC in tail beyond zstat
• Two-sided Ha → consider potential
deviations in both directions → double the
one-sided P-value
• Examples: If one-sided P = 0.0010, then two-sided P = 2 × 0.0010 = 0.0020.
If one-sided P = 0.2743, then two-sided P = 2 × 0.2743 = 0.5486.

Testing a Claim About a Mean:
σ Not Known

Requirements for Testing Claims About a Mean: σ Not
Known.
1) The sample is a simple random sample.
2) The value of the population standard deviation σ is not
known.
3) Either or both of these conditions is satisfied: The
population is normally distributed or n > 30.

Testing a Claim About a Mean: σ Not Known
1- FORMULATE HYPOTHESES
• Rejection Rule: p-Value Approach
Reject H0 if p-value < α
• Rejection Rule: Critical Value Approach
H0: μ ≥ μ0 Reject H0 if t < -tα
H0: μ ≤ μ0 Reject H0 if t > tα
H0: μ = μ0 Reject H0 if t < -tα/2 or t > tα/2

2-USE THE FOLLOWING TEST STATISTIC
$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$$
• This test statistic has a t distribution with n - 1 degrees of
freedom (df).
• The t dist. is similar to the normal distribution: bell-shaped and
symmetric.
• As the number of df increases, the t dist. approaches the
normal dist.

EXAMPLE
• A State Highway Patrol periodically samples vehicle speeds at
various locations on a particular roadway. The sample of vehicle
speeds is used to test the hypothesis H0: μ ≤ 65. The locations
where H0 is rejected are deemed the best locations for radar
traps.
• At Location F, a sample of 64 vehicles shows a mean speed of
66.2 mph with a standard deviation of 4.2 mph. Use α = .05 to
test the hypothesis.

EXAMPLE
1. Determine the hypotheses (ONE-TAILED).
• H0: μ ≤ 65
• Ha: μ > 65
2. Compute the value of the test statistic.
$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{66.2 - 65}{4.2/\sqrt{64}} = 2.286$$
3. Specify the level of significance.
• α = .05

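EXAMPLE
• Critical value approach: for df = 64 – 1 = 63, t.05 = 1.669, so reject H0 if t >
1.669. Because t = 2.286 > 1.669, we reject H0.
• p-value approach: because p-value < α = .05, we
reject H0.
• For a two-tailed test, use twice
the p-value.
• We are at least 95% confident that
the mean speed of vehicles at
Location F is greater than 65 mph.
[Figure: t distribution with the critical value t = 1.669 separating the "Do Not Reject H0" region (left) from the "Reject H0" region (right)]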
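
The Location F calculation can be checked with a short sketch. The function name `t_statistic` is an illustrative choice; the critical value 1.669 is copied from the example above rather than computed, since the Python standard library has no t distribution.

```python
from math import sqrt


def t_statistic(sample_mean, mu0, s, n):
    """t statistic for a test about a mean when sigma is not known."""
    return (sample_mean - mu0) / (s / sqrt(n))


n = 64
t = t_statistic(66.2, 65, 4.2, n)
t_crit = 1.669       # t.05 for df = 63, taken from the example (t table)
reject = t > t_crit  # upper-tailed rule: reject H0 if t > t_alpha

print(f"t = {t:.3f}, df = {n - 1}, reject H0: {reject}")
```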

Testing a Claim About a Proportion

TESTING A CLAIM ABOUT A POPULATION
PROPORTION
In general, a hypothesis test about the value of a population
proportion p must take one of the following three forms (where p0
is the hypothesized value of the population proportion).
• One-tailed (lower tail): H0: p ≥ p0, Ha: p < p0
• One-tailed (upper tail): H0: p ≤ p0, Ha: p > p0
• Two-tailed: H0: p = p0, Ha: p ≠ p0

Requirements for Testing Claims About a
Population Proportion (p)
1) The sample observations are a simple random sample.
2) The conditions for a binomial distribution are
satisfied.
3) The conditions np ≥ 5 and nq ≥ 5 are satisfied, so the
binomial distribution of sample proportions can be
approximated by a normal distribution with µ = np and σ =
√(npq).

Testing Claims About a Population Proportion
(p)
• Rejection Rule: p-Value Approach
Reject H0 if p-value < α
• Rejection Rule: Critical Value Approach
H0: p ≥ p0 Reject H0 if z < -zα
H0: p ≤ p0 Reject H0 if z > zα
H0: p = p0 Reject H0 if z < -zα/2 or z > zα/2

Testing Claims About a Population Proportion
(p)
2- TEST STATISTIC
$$z = \frac{\hat{p} - p_0}{\sqrt{p_0 q_0 / n}}$$
n = number of trials
$\hat{p} = x/n$ (sample proportion)
$p_0$ = population proportion (used in the null hypothesis)
$q_0 = 1 - p_0$

Testing Claims About a Population Proportion (p)
EXAMPLE
• The National Safety Council estimated that 500 people would be
killed and 25,000 injured on the nation’s roads. The NSC claimed
that 50% of the accidents would be caused by drunk driving.
• A sample of 120 accidents showed that 67 were caused by drunk
driving. Use these data to test the NSC’s claim with α = .05.

EXAMPLE
1. Determine the hypotheses (TWO TAILED).
• H0: p = 0.5
• Ha: p ≠ 0.5
2. Compute the value of the test statistic.
$$\sigma_{\hat{p}} = \sqrt{\frac{p_0(1 - p_0)}{n}} = \sqrt{\frac{.5(1 - .5)}{120}} = .045644$$
$$z = \frac{\hat{p} - p_0}{\sigma_{\hat{p}}} = \frac{(67/120) - .5}{.045644} = 1.28$$
3. Specify the level of significance.
• α = .05

EXAMPLE
4. Compute the p -value.
• For z = 1.28, cumulative probability = .8997
• p-value = 2(1 − .8997) = .2006
5. Determine whether to reject H0.
• Because p-value = .2006 > α = .05, we cannot reject H0.
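
The whole drunk-driving example can be reproduced in a few lines. This is a sketch under the normal-approximation requirement (np ≥ 5 and nq ≥ 5); the function name `proportion_z_test` is hypothetical. Because the code does not round z to 1.28 before looking up the probability, its p-value differs from .2006 in the third decimal place.

```python
from math import sqrt
from statistics import NormalDist


def proportion_z_test(x, n, p0):
    """Two-tailed z-test of H0: p = p0 given x successes in n trials."""
    if n * p0 < 5 or n * (1 - p0) < 5:
        raise ValueError("normal approximation requires np >= 5 and nq >= 5")
    p_hat = x / n                        # sample proportion
    se = sqrt(p0 * (1 - p0) / n)         # standard error under H0
    z = (p_hat - p0) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


z, p_value = proportion_z_test(67, 120, 0.5)
alpha = 0.05
print(f"z = {z:.2f}, p-value = {p_value:.4f}, reject H0: {p_value < alpha}")
```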

TEST ABOUT TWO CATEGORICAL VARIABLES
• The chi-squared test is used when we want to see if two categorical
variables are related
• The test statistic for the Chi-squared test is the sum, over all cells, of the squared
difference between each pair of observed (O) and expected (E) values, divided by the
expected value
$$\chi^2 = \sum_{i=1}^{n} \frac{(O_i - E_i)^2}{E_i}$$

Example: Titanic
• The ship Titanic sank in 1912 with the loss of most of its passengers:
61.8% (809 of the 1,309 passengers and crew) died.
• Research question: Did class (of travel) affect survival?
• Null: There is NO association between class and survival
• Alternative: There IS an association between class and survival

• We can use statistical software to undertake a hypothesis test e.g.
SPSS
• One part of the output is the p-value (P)
• If P < 0.05 reject H0 => Evidence of HA being true (i.e. there IS an association)
• If P > 0.05 do not reject H0 (i.e. NO evidence of an association)

Chi squared distribution
• The p-value is calculated using
the Chi-squared distribution for
this test
• Chi-squared is a skewed
distribution which varies
depending on the degrees of
freedom
• df = degrees of freedom= (no. of
rows – 1) x (no. of columns – 1)
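
To make the statistic and its degrees of freedom concrete, here is a minimal sketch that computes both from a table of observed counts. The function name `chi_squared_statistic` and the 2 × 2 table are made up for illustration; they are not the Titanic data. Expected counts are taken as row total × column total / grand total, the usual choice under the null hypothesis of no association.

```python
def chi_squared_statistic(observed):
    """Chi-squared statistic and df for a table of observed counts (list of rows)."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / total  # expected count under H0
            chi2 += (o - e) ** 2 / e
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return chi2, df


observed = [[10, 20],   # hypothetical counts, for illustration only
            [30, 40]]
chi2, df = chi_squared_statistic(observed)
print(f"chi-squared = {chi2:.4f}, df = {df}")
```

For df = 1 the tabulated critical value at the 0.05 level is 3.841, so a statistic below that would not lead to rejecting H0.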

TESTING ABOUT TWO NUMERICAL VARIABLES
• T-tests are used to compare two population means
₋ Paired data: same individuals studied at two different times or
under two conditions → PAIRED T-TEST
₋ Independent: data collected from two separate groups →
INDEPENDENT SAMPLES T-TEST

TESTING ABOUT TWO NUMERICAL VARIABLES
EXAMPLE
COMPARISON OF HOURS WORKED IN 1988 AND 2014
Paired or unpaired?
1. If the same people have reported their hours for 1988 and 2014, we
have PAIRED measurements of the same variable (hours)
Paired Null hypothesis: The mean of the paired differences = 0
2. If different people are used in 1988 and 2014, we have independent
measurements
Independent Null hypothesis: The mean hours worked in 1988 is equal
to the mean for 2014
$$H_0: \mu_{1988} = \mu_{2014}$$
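
The difference between the two designs shows up in how the t statistic is computed. The sketch below uses made-up weekly hours for six people (not survey data) and treats the same numbers first as paired and then as independent. The function names are hypothetical, and the independent version assumes equal population variances (pooled variance).

```python
from math import sqrt
from statistics import mean, stdev, variance


def paired_t(first, second):
    """t statistic and df for H0: mean of the paired differences = 0."""
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    return mean(diffs) / (stdev(diffs) / sqrt(n)), n - 1


def independent_t(group1, group2):
    """Pooled-variance t statistic and df for H0: mu1 = mu2."""
    n1, n2 = len(group1), len(group2)
    pooled_var = ((n1 - 1) * variance(group1)
                  + (n2 - 1) * variance(group2)) / (n1 + n2 - 2)
    se = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(group1) - mean(group2)) / se, n1 + n2 - 2


hours_1988 = [40, 38, 45, 50, 42, 39]  # hypothetical data
hours_2014 = [42, 37, 47, 53, 44, 40]  # hypothetical data

t_paired, df_paired = paired_t(hours_1988, hours_2014)
t_indep, df_indep = independent_t(hours_2014, hours_1988)
print(f"paired:      t = {t_paired:.3f}, df = {df_paired}")
print(f"independent: t = {t_indep:.3f}, df = {df_indep}")
```

With these made-up numbers the paired statistic is much larger, because pairing removes the person-to-person variation from the standard error.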

SPSS data entry
[Figure: SPSS data entry layouts for Paired Data and for Independent Groups]

What is the t-distribution?
• The t-distribution is similar to the standard normal distribution but
has an additional parameter called degrees of freedom (df)
₋ For a paired t-test, df = number of pairs – 1
₋ For an independent t-test, $df = n_{\text{group1}} + n_{\text{group2}} - 2$
• Used for small samples and when the population standard deviation
is not known
• For small sample sizes, the t-distribution has heavier tails

Relationship to normal
• As the sample size gets big, the t-distribution matches the normal
distribution

Broad Classification of Hypothesis Tests
Hypothesis Tests
• Tests of Association
₋ Means
₋ Proportions
• Tests of Differences
₋ Means
₋ Proportions
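Parametric and non-parametric tests (SCALE data)
Comparison                                      No. of groups/measurements  Parametric test          Non-parametric test
Comparing BETWEEN groups                        2                           Independent t-test       Mann-Whitney test
Comparing BETWEEN groups                        3+                          One way ANOVA            Kruskal-Wallis test
Comparing measurements WITHIN the same subject  2                           Paired t-test            Wilcoxon signed rank test
Comparing measurements WITHIN the same subject  3+                          Repeated measures ANOVA  Friedman test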

Categorical
• Comparing BETWEEN two groups: Z-test
• Comparing between more than two groups: Chi-square test

Assumptions in t-Tests
• Normality: Plot histograms
₋ One plot of the paired differences for any paired data
₋ Two (one for each group) for independent samples
₋ Don’t have to be perfect, just roughly symmetric
• Equal population variances: Compare sample standard deviations
₋ As a rough estimate, one should be no more than twice the other
₋ Do an F-test (Levene’s in SPSS) to formally test for differences
• However, the t-test is very robust to violations of the assumptions of Normality and equal
variances, particularly for moderate (i.e. >30) and larger sample sizes
