Estimation
Prof Hla Hla Win
Head, Biostatistics Department, UPH
Head, PSM Department, UM 2
Objectives
• Central Limit Theorem
• Confidence Interval Estimation of the
Mean (σ known)
• Interpretation of the Confidence Interval
• Confidence Interval Estimation of the
Mean (σ unknown)
• Confidence Interval Estimation for the
Proportion
• Determining Sample Size
[Diagram: Statistics divides into descriptive statistics and inferential statistics; inferential statistics divides into estimation and significance testing.]
CENTRAL LIMIT THEOREM
• When several samples are drawn randomly and with replacement from a population, a sample mean can be calculated from each sample
• Each sample mean may differ to some extent from the population mean, and the same holds for the variance and the standard deviation
• The population itself, and each sample taken from it, may or may not be Normally distributed
• When all the sample means are treated as a data set, this data set is approximately Normally distributed, provided the samples are sufficiently large *(1)
• The mean calculated from this data set (the mean of the means) equals the population mean when all possible samples are included *(2)
• The standard deviation calculated from this data set is called the standard error (S.E), which measures the deviation of a sample mean from the population mean *(3)
• S.E can be calculated by:
S.E = σ / √n, or S.E = s / √n when σ is estimated from a sample
• S.E is always smaller than s when n > 1
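[Figure: a population (µ, σ², σ, N) from which Sample 1, Sample 2, Sample 3, …, Sample k are drawn, each with its own mean x, variance s², standard deviation s, and size n. The sample means x1, x2, x3, …, xk, distributed as a data set, follow a Normal distribution; the mean of this data set equals µ and its standard deviation is the S.E.]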
Central Limit Theorem
Irrespective of the shape of the underlying distribution of the population, the distributions of sample means and sample proportions approximate normal distributions if the sample sizes are sufficiently large.
Central Limit Theorem in action:
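A minimal simulation sketch of the theorem, using only the Python standard library. The exponential population (mean 1, standard deviation 1), the sample size of 30, the number of samples, and the seed are all assumptions chosen for illustration; they are not taken from the slides.

```python
import random
import statistics
from math import sqrt

random.seed(1)  # fixed seed so the sketch is repeatable

SAMPLE_SIZE = 30          # n, the rule-of-thumb size used in these slides
NUMBER_OF_SAMPLES = 2000  # how many samples we draw (assumed)

# Hypothetical, strongly skewed population: exponential with mean 1 and SD 1.
POPULATION_MEAN = 1.0
POPULATION_SD = 1.0

# Draw many samples and keep the mean of each one.
sample_means = [
    statistics.fmean([random.expovariate(1.0) for _ in range(SAMPLE_SIZE)])
    for _ in range(NUMBER_OF_SAMPLES)
]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.stdev(sample_means)
expected_se = POPULATION_SD / sqrt(SAMPLE_SIZE)

print(f"mean of sample means: {mean_of_means:.3f} (population mean {POPULATION_MEAN})")
print(f"SD of sample means:   {sd_of_means:.3f} (sigma / sqrt(n) = {expected_se:.3f})")
```

Although the population is skewed, the mean of the sample means should land close to the population mean and their standard deviation close to σ / √n. Plotting a histogram of `sample_means` would show the bell shape.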
How large must a sample be for the
Central Limit Theorem to apply?
The required sample size varies according to the
shape of the population.
However, for our use, a sample size of
30 or larger will suffice.
Must sample sizes be 30 or larger for
populations that are normally distributed?
No. If the population is normally
distributed, the sample means are
normally distributed for sample sizes as
small as n=1.
How can I tell the shape of the underlying
population?
• CHECK FOR NORMALITY:
• Use descriptive statistics. Construct stem-and-leaf plots for small
or moderate-sized data sets and frequency distributions and
histograms for large data sets.
• Compute measures of central tendency (mean and median) and
compare with the theoretical and practical properties of the
normal distribution. Compute the interquartile range. Does it
approximate 1.33 times the standard deviation?
• How are the observations in the data set distributed? Do
approximately two thirds of the observations lie between the
mean and plus or minus 1 standard deviation? Do approximately
four-fifths of the observations lie between the mean and plus or
minus 1.28 standard deviations? Do approximately 19 out of
every 20 observations lie between the mean and plus or minus 2
standard deviations?
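The checks listed above can be sketched in a few lines of Python. The function name `normality_checks` and the simulated data set are hypothetical and only illustrate the idea; with real data you would pass in your own observations.

```python
import random
import statistics

def normality_checks(data):
    """Return the descriptive checks from the list above as a dictionary."""
    mean = statistics.fmean(data)
    sd = statistics.stdev(data)
    q1, _median, q3 = statistics.quantiles(data, n=4)  # quartile cut points

    def share_within(k):
        # Proportion of observations within k standard deviations of the mean.
        return sum(abs(x - mean) <= k * sd for x in data) / len(data)

    return {
        "mean": mean,
        "median": statistics.median(data),
        "iqr_over_sd": (q3 - q1) / sd,           # roughly 1.33 to 1.35 if normal
        "within_1_sd": share_within(1.0),        # about two thirds
        "within_1.28_sd": share_within(1.28),    # about four fifths
        "within_2_sd": share_within(2.0),        # about 19 out of 20
    }

random.seed(7)
# Hypothetical data: 5000 draws from a normal population (mean 50, SD 10).
hypothetical_data = [random.gauss(50, 10) for _ in range(5000)]
checks = normality_checks(hypothetical_data)
for name, value in checks.items():
    print(f"{name}: {value:.3f}")
```

If the mean and median differ noticeably, or the shares fall far from the values in the comments, the data are probably not normally distributed.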
Estimation
Confidence interval (C.I)
• Once the sample has been drawn, it can be used to estimate characteristics of the underlying population
• Because estimates vary from sample to sample, it is important to know how close the estimate derived from any one sample is likely to be to the underlying population value
• One way is to construct a confidence interval around the estimate
• C.I = a range of values surrounding the estimate which has a specified probability of including the true population value
• The specified probability is called the “confidence level” and the end-points of the interval are called the “confidence limits”
• The confidence level (C.L) is set in terms of a “Z” value:
* 1 for 68.26%
* 2 for 95.44%
* 3 for 99.74%
* 1.96 for 95%
* Lower limit = X-bar - (C.L × S.E) (or) X-bar - (C.L × σ / √n)
* Upper limit = X-bar + (C.L × S.E) (or) X-bar + (C.L × σ / √n)
• The width of the confidence interval depends upon:
* the confidence level specified
* the sample size
• The larger the C.L, the wider the C.I
• The smaller the C.L, the narrower the C.I
• The larger the (n), the narrower the C.I
• The smaller the (n), the wider the C.I
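The dependence of the interval width on the confidence level and the sample size can be tabulated with a short sketch. The population standard deviation of 15 and the sample sizes are assumed values for illustration only.

```python
from math import sqrt

def margin_of_error(z, sigma, n):
    """Half-width of the interval: Z times the standard error sigma / sqrt(n)."""
    return z * sigma / sqrt(n)

SIGMA = 15.0  # hypothetical population standard deviation

for z in (1.0, 1.96, 3.0):        # Z values for roughly 68%, 95%, and 99.7%
    for n in (25, 100, 400):      # hypothetical sample sizes
        width = 2 * margin_of_error(z, SIGMA, n)
        print(f"Z = {z:4.2f}, n = {n:3d}: interval width = {width:6.2f}")
```

Reading down the output, a larger Z widens the interval, while quadrupling n halves its width.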
“t” Distribution
• According to the central limit theorem, the sampling distribution of means is a Normal distribution when sample sizes are sufficiently large
• When sample sizes are not sufficiently large, the sampling distribution of means differs from the Normal distribution
• The t-distribution is like the Normal distribution, but it is somewhat more widely spread than the Normal curve
“Z” distribution versus “t” distribution
• “Z” value for “x” = (x - µ) / σ. The area within ±1 is 0.6826, within ±2 is 0.9544, and within ±3 is 0.9974; these areas do not depend on the sample size
• “t” value for “x” = (x - X-bar) / s. The area within ±1 is a if n = A, b if n = B, and c if n = C; it takes a different value for each n
• In practice, the t-distribution is used when the population standard deviation (σ) is not known to the researcher and the value estimated from the sample (s) is used in its place
Uses of t-distribution
• Estimation of the population mean
• Significance testing of a sample against the population
• Significance testing of two unpaired samples
• Significance testing of two paired samples
Estimation of population mean
• It is done by calculating the “C.I”
• Lower limit = X-bar - {C.L (for respective n) × s / √n}
• Upper limit = X-bar + {C.L (for respective n) × s / √n}
• The C.L for the respective n is shown in the t-table
Why does the shape of the distribution matter?
Because I want to use Z scores to analyze
sample means.
But to use Z scores, the data must be normally
distributed.
That’s where the Central Limit Theorem steps
in.
Recall that the Central Limit Theorem states that
sample means are normally distributed regardless
of the shape of the underlying population if the
sample size is sufficiently large.
Recall the Z score formula:
• Z = (X - µ) ÷ σ
• If sample means are normally distributed, the Z
score formula applied to sample means would
be:
• Z = (X-bar - µX-bar) ÷ σX-bar
Background
• To determine µX-bar, we would need to randomly draw
out all possible samples of the given size from the
population, compute the sample means, and average
them. This task is unrealistic. Fortunately, µX-bar equals the
population mean µ, which is easier to access.
• Likewise, to compute the value of σX-bar, we would have to
take all possible samples of a given size from a
population, compute the sample means, and determine
the standard deviation of sample means. This task is
also unrealistic. Fortunately, σX-bar can be computed by
using the population standard deviation divided by the
square root of the sample size.
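For a very small population, the "unrealistic" task can actually be carried out, which makes the two shortcuts easy to verify. The sketch below assumes a hypothetical four-value population and samples of size 2 drawn with replacement; neither comes from the slides.

```python
from itertools import product
from math import isclose, sqrt
import statistics

population = [2, 4, 6, 8]  # hypothetical population, tiny on purpose
n = 2                      # sample size

mu = statistics.fmean(population)
sigma = statistics.pstdev(population)  # population standard deviation

# Every possible ordered sample of size n, drawn with replacement.
all_samples = list(product(population, repeat=n))
sample_means = [statistics.fmean(sample) for sample in all_samples]

mu_xbar = statistics.fmean(sample_means)      # mean of all sample means
sigma_xbar = statistics.pstdev(sample_means)  # SD of all sample means

print(f"{len(all_samples)} possible samples")
print(f"mu = {mu}, mean of sample means = {mu_xbar}")
print(f"sigma / sqrt(n) = {sigma / sqrt(n):.4f}, SD of sample means = {sigma_xbar:.4f}")
print("shortcuts hold:", isclose(mu_xbar, mu) and isclose(sigma_xbar, sigma / sqrt(n)))
```

The enumeration grows as N to the power n, which is why the shortcuts matter for populations of realistic size.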
Note:
As the sample size increases,
the standard deviation of the sample means
becomes smaller and smaller
because the population standard deviation
is being divided by larger and larger
values of the square root of n.
The ultimate benefit of the central
limit theorem is a useful version of
the Z formula for sample means.
Z Formula for Sample Means:
Z = (X-bar - µ) ÷ (σ / √n)
Example:
The mean expenditure per customer at a
tire store is $85.00, with a standard
deviation of $9.00.
If a random sample of 40 customers is
taken, what is the probability that the
sample average expenditure per
customer for this sample will be
$87.00 or more?
Because the sample size is greater than 30, the central
limit theorem says the sample means are normally
distributed.
Z = (X-bar - µ) ÷ (σ / √n)
Z = ($87.00 - $85.00) ÷ ($9.00 / √40)
Z = $2.00 / $1.42 = 1.41
For Z = 1.41 in the Z distribution table, the
probability is .4207.
This represents the probability of getting a mean
between $87.00 and the population mean
$85.00.
Solving for the tail of the distribution yields
.5000 - .4207 = .0793
• This is the probability of X-bar ≥ $87.00.
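The same calculation can be sketched with Python's standard-library `statistics.NormalDist` in place of the printed Z table. The function name `prob_sample_mean_at_least` is hypothetical. Because the code does not round Z to 1.41 before the lookup, it gives a tail probability of about .080 rather than the table-based .0793.

```python
from math import sqrt
from statistics import NormalDist

def prob_sample_mean_at_least(threshold, mu, sigma, n):
    """Return (Z, P(X-bar >= threshold)) using the Z formula for sample means."""
    standard_error = sigma / sqrt(n)
    z = (threshold - mu) / standard_error
    upper_tail = 1.0 - NormalDist().cdf(z)  # area to the right of Z
    return z, upper_tail

# Tire store example: mu = $85.00, sigma = $9.00, n = 40, threshold = $87.00.
z, p = prob_sample_mean_at_least(87.00, 85.00, 9.00, 40)
print(f"Z = {z:.2f}, P(X-bar >= 87.00) = {p:.4f}")
```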
Interpretations
Therefore, 7.93% of the time, a random
sample of 40 customers from this
population will yield a mean expenditure of
$87.00 or more.
The following reading is NOT correct:
“From any random sample of 40 customers,
7.93% of them will spend on average
$87.00 or more.”
The probability describes how sample means vary
from sample to sample, not the share of individual
customers within one sample.
Solve:
Suppose that during any hour in a
large department store, the
average number of shoppers is
448, with a standard deviation
of 21 shoppers.
What is the probability that a
random sample of 49 different
shopping hours will yield a
sample mean between 441 and
446 shoppers?
Statistical Inference
Statistical Inference facilitates
decision making.
Via sample data,
we can estimate something about
our population,
such as its average value µ,
by using the corresponding
sample mean, X-bar.
Recall that µ,
the population mean to be estimated,
is a parameter,
while X-bar,
the sample mean, is a statistic.
Estimation…
• The objective of estimation is to determine
the approximate value of a population
parameter on the basis of a sample statistic.
• There are two types of estimators:
• Point Estimator
• Interval Estimator
Point Estimate
A point estimate is a statistic taken from a sample and is
used to estimate a population parameter.
However, a point estimate is only as good as the sample it
represents. If other random samples are taken from the
population, the point estimates derived from those
samples are likely to vary.
Because of variation in sample statistics, estimating a
population parameter with a confidence interval is often
preferable to using a point estimate.
Confidence Interval
A confidence interval is a range of values
within which it is estimated with some
confidence the population parameter lies.
Confidence intervals can be one or two-
tailed.
Confidence Interval to Estimate µ
• By rearranging the Z formula for sample means, a
confidence interval formula is constructed:
• X-bar +/- Zα/2 (σ / √n)
• Where:
• α = the area under the normal curve outside the
confidence interval
• α/2 = the area in one-tail of the distribution outside
the confidence interval
The confidence interval formula yields a
range (interval) within which we feel
with some confidence the population
mean is located.
It is not certain that the population mean
is in the interval unless we have a 100%
confidence interval that is infinitely
wide, so wide that it is meaningless.
Confidence interval estimates for five different
samples of n=25, taken from a population where
µ=368 and σ=15
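A simulation sketch of the situation this slide illustrates: five samples of n = 25 from a normal population with µ = 368 and σ = 15, each turned into a 95% interval. The seed and the use of `random.gauss` are assumptions; the slide's own five samples are not reproduced here.

```python
import random
import statistics
from math import sqrt

MU, SIGMA, N, Z_95 = 368.0, 15.0, 25, 1.96
random.seed(42)  # arbitrary seed so the sketch is repeatable

half_width = Z_95 * SIGMA / sqrt(N)  # same for every sample because sigma is known
intervals = []
for sample_number in range(1, 6):
    sample = [random.gauss(MU, SIGMA) for _ in range(N)]
    x_bar = statistics.fmean(sample)
    lower, upper = x_bar - half_width, x_bar + half_width
    intervals.append((lower, upper))
    covers = lower <= MU <= upper
    print(f"sample {sample_number}: {lower:.2f} to {upper:.2f}  contains mu: {covers}")
```

Every interval has the same width, 2 × 5.88, but a different center. Over many repetitions, about 95% of such intervals would contain µ.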
Common levels of confidence
intervals used by analysts are
90%, 95%, 98%, and 99%.
95% Confidence Interval
• For 95% confidence, α = .05 and α / 2 = .025. The value of Z.025 is found by looking in the standard normal table under .5000 - .025 = .4750. This area in the table is associated with a Z value of 1.96.
• An alternate method: multiply the confidence level, 95%, by ½ (since the distribution is symmetric and the intervals are equal on each side of the population mean).
• (½) (95%) = .4750 (the area on each side of the mean) has a corresponding Z value of 1.96.
In other words, of all the possible X-bar
values along the horizontal axis of the
normal distribution curve, 95% of them
should be within a Z score of 1.96 from
the mean.
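Instead of a table lookup, the Z value for any confidence level can be obtained from the inverse normal CDF. This is a small sketch using the standard library's `NormalDist.inv_cdf`; the helper name `z_critical` is hypothetical.

```python
from statistics import NormalDist

def z_critical(confidence_level):
    """Two-tailed critical Z: leaves alpha / 2 in each tail of the standard normal."""
    alpha = 1.0 - confidence_level
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)

for level in (0.90, 0.95, 0.98, 0.99):
    print(f"{level:.0%} confidence: Z = {z_critical(level):.3f}")
```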
Margin of Error = Z (σ / √n)
Example:
• A business analyst for cellular telephone
company takes a random sample of 85 bills
for a recent month and from these bills
computes a sample mean of 153 minutes. If
the company uses the sample mean of 153
minutes as an estimate for the population
mean, then the sample mean is being used
as a POINT ESTIMATE. Past history and
similar studies indicate that the population
standard deviation is 46 minutes.
• The value of Z is decided by the level of
confidence desired. A confidence level of
95% has been selected.
153 +/- 1.96 (46 / √85)
= 143.22 ≤ µ ≤ 162.78
• The confidence interval is constructed from the point
estimate, 153 minutes, and the margin of error of this
estimate, + / - 9.78 minutes.
• The resulting confidence interval is 143.22 ≤ µ ≤
162.78.
• The cellular telephone company business analyst is
95% confident that the average length of a call for
the population is between 143.22 and 162.78
minutes.
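The arithmetic of this example can be checked with a short sketch. The function name `z_interval` is hypothetical; the inputs are the values stated in the example.

```python
from math import sqrt

def z_interval(x_bar, sigma, n, z):
    """Confidence interval for the mean when sigma is known: X-bar +/- Z * sigma / sqrt(n)."""
    margin = z * sigma / sqrt(n)
    return x_bar - margin, x_bar + margin, margin

# Cellular telephone example: X-bar = 153, sigma = 46, n = 85, Z = 1.96.
lower, upper, margin = z_interval(153, 46, 85, 1.96)
print(f"margin of error = {margin:.2f}")
print(f"{lower:.2f} <= mu <= {upper:.2f}")
```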
Interpreting a Confidence Interval
• For the previous 95% confidence interval, the following
conclusions are valid:
• I am 95% confident that the average length of a call for the
population µ, lies between 143.22 and 162.78 minutes.
• If I repeatedly obtained samples of size 85, then 95% of the
resulting confidence intervals would contain µ and 5% would not.
QUESTION: Does this confidence interval [143.22 to 162.78]
contain µ? ANSWER: I don’t know. All I can say is that this
procedure leads to an interval containing µ 95% of the time.
• I am 95% confident that my estimate of µ [namely 153 minutes] is
within 9.78 minutes of the actual value of µ. RECALL: 9.78 is the
margin of error.
Confidence Interval Estimation of
the Mean (σ Unknown)
In reality, the actual standard deviation of the population, σ, is
usually unknown.
Therefore, we use “s” (sample standard deviation) to compute
the confidence interval for the population mean, µ.
However, by using “s” in place of σ, the standard normal Z
distribution no longer applies.
Fortunately, the t-distribution will work, provided the
population from which we obtain the sample is normally distributed.
Be Careful! The following statement is
NOT true:
“The probability that µ lies between
143.22 and 162.78 is .95.”
Once you have inserted your sample
results into the confidence interval
formula, the word PROBABILITY
can no longer be used to describe
the resulting confidence interval.
Assumptions necessary to use t-
distribution
• Assumes random variable x is normally
distributed
• However, if sample size is large enough ( > 30),
t-distribution can be used when σ is unknown.
• But if sample size is small, evaluate the shape
of the sample data using a histogram or stem-
and-leaf.
• As the sample size increases, the t-distribution
approaches the Z distribution.
Confidence Interval using a t-distribution
X-bar +/- t α/2, n-1 (s / √n)
α = 1 - confidence level (the total area in the two tails)
n-1 = degrees of freedom
Example:
• As a consultant I have been employed to estimate the
average amount of comp time accumulated per week for
managers in the aerospace industry.
• I randomly sample 18 managers and measure the
amount of extra time they work during a specific week
and obtain the following results (in hours). Assume a
90% confidence interval.
• AEROSPACE DATA
6 21 17 20 7 0 8 16 29
3 8 12 11 9 21 25 15 16
Solution:
To construct a 90% confidence interval to estimate the
average amount of extra time per week worked by a
manager in the aerospace industry, I assume that comp
time is normally distributed in the population.
The sample size is 18, so df = 17.
A 90% level of confidence results in an α / 2 = .05 area in
each tail.
The table t-value is t .05,17 = 1.740.
With a sample mean of 13.56 hours, and a
sample standard deviation of 7.8 hours, the
confidence interval is computed:
X-bar +/- t α/2, n-1 (s / √n)
= 13.56 +/- 1.740 (7.8 / √18) = 13.56 +/- 3.20
= 10.36 ≤ µ ≤ 16.76
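A sketch that recomputes this interval from the raw aerospace data. The t value 1.740 is taken from the table as quoted above, since the Python standard library has no t-distribution quantile function. The limits may differ from the slide in the second decimal place because the slide rounds the mean and standard deviation before combining them.

```python
import statistics
from math import sqrt

comp_hours = [6, 21, 17, 20, 7, 0, 8, 16, 29,
              3, 8, 12, 11, 9, 21, 25, 15, 16]
T_05_17 = 1.740  # table value for alpha/2 = .05 and df = 17, as quoted above

n = len(comp_hours)
x_bar = statistics.fmean(comp_hours)
s = statistics.stdev(comp_hours)      # sample standard deviation (n - 1 divisor)
margin = T_05_17 * s / sqrt(n)
lower, upper = x_bar - margin, x_bar + margin

print(f"n = {n}, df = {n - 1}")
print(f"X-bar = {x_bar:.2f}, s = {s:.2f}, margin = {margin:.2f}")
print(f"{lower:.2f} <= mu <= {upper:.2f}")
```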
Interpretation:
The point estimate for this problem is
13.56 hours, with an error of +/- 3.20
hours.
I am 90% confident that the average
amount of comp time accumulated by a
manager per week in this industry is
between 10.36 and 16.76 hours.
Recommendations:
From these figures, the aerospace
industry could attempt to build a
reward system for such extra work or
evaluate the regular 40-hour week to
determine how to use the normal work
hours more effectively and thus reduce
comp time.
Solve:
I own a large equipment rental company and I want to make
a quick estimate of the average number of days a piece of
ditch digging equipment is rented out per person per time.
The company has records of all rentals, but the amount of
time required to conduct an audit of all accounts would be
prohibitive.
I decide to take a random sample of rental invoices.
Fourteen different rentals of ditch diggers are selected
randomly from the files.
Use the following data to construct a 99% confidence
interval to estimate the average number of days that a
ditch digger is rented and assume that the number of days
per rental is normally distributed in the population.
Ditch Digger Data:
3 1 3 2 5 1 2 1 4
2 1 3 1 1
Stay-tuned
Estimating the Population Proportion
For most businesses, estimating market share (their
proportion of the market) is important because many company
decisions evolve from market share information:
• What proportion of my customers pay late?
• What proportion don’t pay at all?
• What proportion of the produced goods are
defective?
• What proportion of the population has cats/
dogs/ horses/ kids/ exercises/ reads?
Confidence Interval Estimate for the
Proportion
• ps +/- Z √[ps(1-ps) / n]
• ps - Z √[ps(1-ps) / n] ≤ p ≤ ps + Z √[ps(1-ps) / n]
• ps = sample proportion = X / n = number of successes ÷
sample size. This is the POINT ESTIMATE.
• p = population proportion
• Z = critical value from the standardized normal
distribution
• n = sample size
NOTE: This formula can be applied only
when np and n(1-p) are at least 5.
Example:
A study of 87 randomly selected companies with a
telemarketing operation revealed that 39% of
the sampled companies had used telemarketing
to assist them in order processing.
Using this information, how could a researcher
estimate the population proportion of
telemarketing companies that use their
telemarketing operation to assist them in order
processing?
Solution:
• The sample proportion = .39.
• This is the point estimate of the population
proportion, p.
• The Z value for 95% confidence is 1.96.
• The value of (1-ps) = 1 - .39 = .61.
• The confidence interval estimate is:
ps - Z √[ps(1-ps) / n] ≤ p ≤ ps + Z √[ps(1-ps) / n]
.39 - 1.96 √[(.39)(.61) / 87] ≤ p ≤ .39 + 1.96 √[(.39)(.61) / 87]
.39 - .10 ≤ p ≤ .39 + .10
.29 ≤ p ≤ .49
Interpretation:
We are 95% confident that the population
proportion of telemarketing firms that use
their operation to assist order processing
is somewhere between .29 and .49.
There is a point estimate of .39 with a
margin of error of +/- .10.
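The interval above can be reproduced with a short sketch. The function name `proportion_interval` is hypothetical, and the applicability check uses the sample proportion ps as a stand-in for the unknown p, which is an assumption of this sketch.

```python
from math import sqrt

def proportion_interval(p_s, n, z):
    """Confidence interval for a proportion: ps +/- Z * sqrt(ps * (1 - ps) / n)."""
    # Applicability check from the note: both expected counts should be at least 5.
    if n * p_s < 5 or n * (1 - p_s) < 5:
        raise ValueError("normal approximation not appropriate for this sample")
    margin = z * sqrt(p_s * (1 - p_s) / n)
    return p_s - margin, p_s + margin, margin

# Telemarketing example: ps = .39, n = 87, Z = 1.96 for 95% confidence.
lower, upper, margin = proportion_interval(0.39, 87, 1.96)
print(f"margin of error = {margin:.2f}")
print(f"{lower:.2f} <= p <= {upper:.2f}")
```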
Solve:
A clothing company produces men’s jeans. The jeans are
made and sold with either a regular cut or a boot cut.
In an effort to estimate the proportion of their men’s jeans
market in Oklahoma City that is for boot-cut jeans, the
analyst takes a random sample of 212 jeans sales from
the company’s two Oklahoma City retail outlets.
Only 34 of the sales were for boot-cut jeans.
Construct a 90% confidence interval to estimate the
proportion of the population in Oklahoma City who prefer
boot-cut jeans.
Solution:
ps = 34/212 = .16
A point estimate for boot-cut jeans is .16 or 16%.
The Z value for 90% level of confidence is 1.645.
The confidence interval estimate is:
ps - Z √[ps(1-ps) / n] ≤ p ≤ ps + Z √[ps(1-ps) / n]
.16 - 1.645 √[(.16)(.84) / 212] ≤ p ≤ .16 + 1.645 √[(.16)(.84) / 212]
.16 - .04 ≤ p ≤ .16 + .04
.12 ≤ p ≤ .20
We are 90% confident that the proportion of boot-cut jeans
is between 12 and 20 %.
Estimating Sample Size
The amount of sampling error you are
willing to accept and the level of
confidence desired determine the size
of your sample.
Sample size when Estimating µ
n = Z² σ² / e²
e = Z (σ / √n)
To determine sample size:
• Know the desired confidence level, which determines the
value of Z (the critical value from the standardized normal
distribution). Determining the confidence level is subjective.
• Know the acceptable sampling error, e: the amount of error
that can be tolerated.
• Know the standard deviation, σ. If unknown, estimate by:
• past data
• educated guess
• estimate σ: [σ = range/4] This estimate is derived from the
empirical rule stating that approximately 95% of the values
in a normal distribution are within +/- 2σ of the mean,
giving a range within which most of the values are located.
Example:
Suppose the marketing manager wishes to
estimate the population mean annual usage of
home heating oil to within +/- 50 gallons of the
true value, and he wants to be 95% confident of
correctly estimating the true mean.
On the basis of a study taken the previous year,
he believes that the standard deviation can be
estimated as 325 gallons.
Find the sample size needed.
Solution:
• With e =50, σ = 325, and 95% confidence (Z = 1.96)
• n = Z² σ² / e² = (1.96)² (325)² / (50)²
• n = 162.31
• Therefore, n = 163. As a general rule for
determining sample size, always round up to the
next integer value in order to slightly over
satisfy the criteria desired.
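The sample size formula and the round-up rule can be sketched as a small function. The name `sample_size_for_mean` is hypothetical; the inputs are those of the heating oil example.

```python
from math import ceil

def sample_size_for_mean(z, sigma, e):
    """n = Z^2 * sigma^2 / e^2, rounded up to the next integer."""
    return ceil((z ** 2) * (sigma ** 2) / (e ** 2))

# Heating oil example: Z = 1.96, sigma = 325 gallons, e = 50 gallons.
n_heating_oil = sample_size_for_mean(1.96, 325, 50)
print(f"required sample size: {n_heating_oil}")
```

Using `ceil` rather than ordinary rounding guarantees that the error criterion is met even when the fractional part is below one half.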
Solve:
Suppose you want to estimate the average age of all
Boeing 727 airplanes now in active domestic U.S.
service.
You want to be 95% confident, and you want your estimate
to be within 2 years of the actual figure.
The 727 was first placed in service about 30 years ago,
but you believe that no active 727s in the U.S. domestic
fleet are more than 25 years old.
How large a sample should you take?
Solution:
With e = 2 years, a Z value for 95% confidence of 1.96, and σ unknown, σ must be estimated by using σ ≈ range ÷ 4. As the range of ages is 0 to 25 years, σ = 25 ÷ 4 = 6.25.
n = Z² σ² / e² = (1.96)² (6.25)² / (2)² = 37.52 airplanes.
Because you cannot sample 37.52 units, the required sample size is 38.
If you randomly sample 38 planes, you can
estimate the average age of active 727s
within 2 years and be 95% confident of the
results.
Solve:
Determine the sample size necessary to
estimate µ when values range from 80 to
500, error is to be within 10, and the
confidence level is 90 %.
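With σ ≈ range ÷ 4 = (500 - 80) ÷ 4 = 105 and Z = 1.645 for 90% confidence:
n = Z² σ² / e² = (1.645)² (105)² / (10)² = 298.34
Answer: 299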
Determining sample size for proportion
n = Z² p(1-p) / e²
• p = population proportion (if unknown, analysts
use .5 as an estimate of p in the formula)
• e = error of estimation equal to (ps – p) the
difference between the sample proportion and
the parameter to be estimated, p. Represents
amount of error willing to tolerate.
Solve:
The Packer, a produce industry trade publication, wants to
survey Americans and ask whether they are eating more
fresh fruits and vegetables than they did 1 year ago.
The organization wants to be 90% confident in its results
and maintain an error within .05. How large a sample
should it take?
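One way to work this kind of problem is sketched below, under the assumption that p is unknown and .5 is therefore used, as the slide on the formula suggests. The function name `sample_size_for_proportion` is hypothetical, and Z = 1.645 is the 90% value used elsewhere in these slides.

```python
from math import ceil

def sample_size_for_proportion(z, e, p=0.5):
    """n = Z^2 * p * (1 - p) / e^2, rounded up; p = 0.5 is the conservative default."""
    return ceil((z ** 2) * p * (1 - p) / (e ** 2))

# Survey setting above: 90% confidence (Z = 1.645), error within .05, p unknown.
n_survey = sample_size_for_proportion(1.645, 0.05)
print(f"required sample size: {n_survey}")
```

Because p(1-p) is largest at p = .5, this default gives the largest sample size the formula can produce for a given Z and e, so it is safe when nothing is known about p.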
Estimation

Mais conteúdo relacionado

Mais procurados

Normal Probability Distribution
Normal Probability DistributionNormal Probability Distribution
Normal Probability Distribution
mandalina landy
 
The sampling distribution
The sampling distributionThe sampling distribution
The sampling distribution
Harve Abella
 
The Normal Probability Distribution
The Normal Probability DistributionThe Normal Probability Distribution
The Normal Probability Distribution
mandalina landy
 
Statistics-Measures of dispersions
Statistics-Measures of dispersionsStatistics-Measures of dispersions
Statistics-Measures of dispersions
Capricorn
 
The standard normal curve & its application in biomedical sciences
The standard normal curve & its application in biomedical sciencesThe standard normal curve & its application in biomedical sciences
The standard normal curve & its application in biomedical sciences
Abhi Manu
 
Estimation and hypothesis testing 1 (graduate statistics2)
Estimation and hypothesis testing 1 (graduate statistics2)Estimation and hypothesis testing 1 (graduate statistics2)
Estimation and hypothesis testing 1 (graduate statistics2)
Harve Abella
 

Mais procurados (20)

Sampling distribution
Sampling distributionSampling distribution
Sampling distribution
 
Normal Probability Distribution
Normal Probability DistributionNormal Probability Distribution
Normal Probability Distribution
 
The sampling distribution
The sampling distributionThe sampling distribution
The sampling distribution
 
6. point and interval estimation
6. point and interval estimation6. point and interval estimation
6. point and interval estimation
 
The Normal Distribution
The Normal DistributionThe Normal Distribution
The Normal Distribution
 
Standard normal distribution
Standard normal distributionStandard normal distribution
Standard normal distribution
 
Bernoulli distribution
Bernoulli distributionBernoulli distribution
Bernoulli distribution
 
Estimating population mean
Estimating population meanEstimating population mean
Estimating population mean
 
The Normal Probability Distribution
The Normal Probability DistributionThe Normal Probability Distribution
The Normal Probability Distribution
 
Confidence Intervals
Confidence IntervalsConfidence Intervals
Confidence Intervals
 
Statistics-Measures of dispersions
Statistics-Measures of dispersionsStatistics-Measures of dispersions
Statistics-Measures of dispersions
 
Chi square
Chi square Chi square
Chi square
 
Normal as Approximation to Binomial
Normal as Approximation to Binomial  Normal as Approximation to Binomial
Normal as Approximation to Binomial
 
Measures of dispersion
Measures of dispersionMeasures of dispersion
Measures of dispersion
 
The standard normal curve & its application in biomedical sciences
The standard normal curve & its application in biomedical sciencesThe standard normal curve & its application in biomedical sciences
The standard normal curve & its application in biomedical sciences
 
L10 confidence intervals
L10 confidence intervalsL10 confidence intervals
L10 confidence intervals
 
Ch4 Confidence Interval
Ch4 Confidence IntervalCh4 Confidence Interval
Ch4 Confidence Interval
 
Normal distribution
Normal distributionNormal distribution
Normal distribution
 
Chapter 4 part2- Random Variables
Chapter 4 part2- Random VariablesChapter 4 part2- Random Variables
Chapter 4 part2- Random Variables
 
Estimation and hypothesis testing 1 (graduate statistics2)
Estimation and hypothesis testing 1 (graduate statistics2)Estimation and hypothesis testing 1 (graduate statistics2)
Estimation and hypothesis testing 1 (graduate statistics2)
 

Semelhante a Estimation

Lecture 5 Sampling distribution of sample mean.pptx
Lecture 5 Sampling distribution of sample mean.pptxLecture 5 Sampling distribution of sample mean.pptx
Lecture 5 Sampling distribution of sample mean.pptx
shakirRahman10
 
Research methodology and iostatistics ppt
Research methodology and iostatistics pptResearch methodology and iostatistics ppt
Research methodology and iostatistics ppt
Nikhat Mohammadi
 

Semelhante a Estimation (20)

Inferential statistics-estimation
Inferential statistics-estimationInferential statistics-estimation
Inferential statistics-estimation
 
1.1 course notes inferential statistics
1.1 course notes inferential statistics1.1 course notes inferential statistics
1.1 course notes inferential statistics
 
Review of Chapters 1-5.ppt
Review of Chapters 1-5.pptReview of Chapters 1-5.ppt
Review of Chapters 1-5.ppt
 
Chp11 - Research Methods for Business By Authors Uma Sekaran and Roger Bougie
Chp11  - Research Methods for Business By Authors Uma Sekaran and Roger BougieChp11  - Research Methods for Business By Authors Uma Sekaran and Roger Bougie
Chp11 - Research Methods for Business By Authors Uma Sekaran and Roger Bougie
 
Lecture 5 Sampling distribution of sample mean.pptx
Lecture 5 Sampling distribution of sample mean.pptxLecture 5 Sampling distribution of sample mean.pptx
Lecture 5 Sampling distribution of sample mean.pptx
 
2. chapter ii(analyz)
2. chapter ii(analyz)2. chapter ii(analyz)
2. chapter ii(analyz)
 
descriptive data analysis
 descriptive data analysis descriptive data analysis
descriptive data analysis
 
Research methodology and iostatistics ppt
Research methodology and iostatistics pptResearch methodology and iostatistics ppt
Research methodology and iostatistics ppt
 
Chapter 7
Chapter 7 Chapter 7
Chapter 7
 
Sampling distribution
Sampling distributionSampling distribution
Sampling distribution
 
Statistics Formulae for School Students
Statistics Formulae for School StudentsStatistics Formulae for School Students
Statistics Formulae for School Students
 
lecture-2.ppt
lecture-2.pptlecture-2.ppt
lecture-2.ppt
 
Estimating a Population Proportion
Estimating a Population Proportion  Estimating a Population Proportion
Estimating a Population Proportion
 
PG STAT 531 Lecture 2 Descriptive statistics
PG STAT 531 Lecture 2 Descriptive statisticsPG STAT 531 Lecture 2 Descriptive statistics
PG STAT 531 Lecture 2 Descriptive statistics
 
determinatiion of
determinatiion of determinatiion of
determinatiion of
 
Sampling distribution.pptx
Sampling distribution.pptxSampling distribution.pptx
Sampling distribution.pptx
 
Lect w2 measures_of_location_and_spread
Lect w2 measures_of_location_and_spreadLect w2 measures_of_location_and_spread
Lect w2 measures_of_location_and_spread
 
Statistics78 (2)
Statistics78 (2)Statistics78 (2)
Statistics78 (2)
 
Basic statistics 1
Basic statistics  1Basic statistics  1
Basic statistics 1
 
Res701 research methodology lecture 7 8-devaprakasam
Res701 research methodology lecture 7 8-devaprakasamRes701 research methodology lecture 7 8-devaprakasam
Res701 research methodology lecture 7 8-devaprakasam
 

Mais de Mmedsc Hahm

Mais de Mmedsc Hahm (20)

Solid waste-management-2858710
Solid waste-management-2858710Solid waste-management-2858710
Solid waste-management-2858710
 
Situation analysis
Situation analysisSituation analysis
Situation analysis
 
Quantification of medicines need
Quantification of medicines needQuantification of medicines need
Quantification of medicines need
 
Quality in hospital
Quality in hospitalQuality in hospital
Quality in hospital
 
Patient satisfaction & quality in health care (16.3.2016) dr.nyunt nyunt wai
Patient satisfaction & quality in health care (16.3.2016) dr.nyunt nyunt waiPatient satisfaction & quality in health care (16.3.2016) dr.nyunt nyunt wai
Patient satisfaction & quality in health care (16.3.2016) dr.nyunt nyunt wai
 
Organising
OrganisingOrganising
Organising
 
Nscbl slide
Nscbl slideNscbl slide
Nscbl slide
 
Introduction to hahm 2017
Introduction to hahm 2017Introduction to hahm 2017
Introduction to hahm 2017
 
Hss lecture 2016 jan
Hss lecture 2016 janHss lecture 2016 jan
Hss lecture 2016 jan
 
Hospital management17
Hospital management17Hospital management17
Hospital management17
 
Hopital stat
Hopital statHopital stat
Hopital stat
 
Health planning approaches hahm 17
Health planning approaches hahm 17Health planning approaches hahm 17
Health planning approaches hahm 17
 
Ephs and nhp
Ephs and nhpEphs and nhp
Ephs and nhp
 
Directing and leading 2017
Directing and leading 2017Directing and leading 2017
Directing and leading 2017
 
Concepts of em
Concepts of emConcepts of em
Concepts of em
 
Access to medicines p pt 17 10-2015
Access to medicines p pt 17 10-2015Access to medicines p pt 17 10-2015
Access to medicines p pt 17 10-2015
 
The dynamics of disease transmission
The dynamics of disease transmissionThe dynamics of disease transmission
The dynamics of disease transmission
 
Study designs dr.wah
Study designs dr.wahStudy designs dr.wah
Study designs dr.wah
 
Standardization dr.wah
Standardization dr.wahStandardization dr.wah
Standardization dr.wah
 
Sdg
SdgSdg
Sdg
 

Estimation

  • 1. Estimation Prof Hla Hla Win Head, Biostatistics Department, UPH Head, PSM Department, UM 2
  • 2. Objectives • Central Limit Theorem • Confidence Interval Estimation of the Mean (σ known) • Interpretation of the Confidence Interval • Confidence Interval Estimation of the Mean (σ unknown) • Confidence Interval Estimation for the Proportion • Determining Sample Size
  • 4. CENTRAL LIMIT THEOREM • When several samples are drawn at random, with replacement, from a population, a sample mean can be calculated from each sample • Each sample mean may differ to some extent from the population mean, and the same holds for the variance and standard deviation • The population itself, and the samples taken from it, may or may not be Normally distributed • When all the sample means are collected as a data set, this data set is approximately Normally distributed, provided the samples are large enough *(1)
  • 5. • The mean calculated from this data set (the mean of the means) equals the population mean when all possible samples are taken *(2) • The standard deviation calculated from this data set is called the standard error (S.E), which measures the deviation of a sample mean from the population mean *(3) • S.E can be calculated by: S.E = s / √n or S.E = σ / √n • S.E is always smaller than (s) when n is greater than 1
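  • 6. [Diagram: a population (mean µ, variance σ², standard deviation σ, size N) from which Sample 1, Sample 2, Sample 3, ..., Sample k are drawn, each with its own mean x̄, variance s², standard deviation s and size n. The sample means x̄1, x̄2, x̄3, ..., x̄k, distributed as a data set, form a Normal distribution whose mean equals µ and whose standard deviation is the S.E.]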
  • 7. Central Limit Theorem Irrespective of the shape of the underlying distribution of the population, the distributions of sample means and sample proportions will approximate normal distributions if the sample sizes are sufficiently large.
  • 9. How large must a sample be for the Central Limit theorem to apply? The sample size varies according to the shape of the population. However, for our use, a sample size of 30 or larger will suffice.
  • 10. Must sample sizes be 30 or larger for populations that are normally distributed? No. If the population is normally distributed, the sample means are normally distributed for sample sizes as small as n=1.
  • 11. How can I tell the shape of the underlying population? • CHECK FOR NORMALITY: • Use descriptive statistics. Construct stem-and-leaf plots for small or moderate-sized data sets and frequency distributions and histograms for large data sets. • Compute measures of central tendency (mean and median) and compare with the theoretical and practical properties of the normal distribution. Compute the interquartile range. Does it approximate 1.33 times the standard deviation? • How are the observations in the data set distributed? Do approximately two-thirds of the observations lie within plus or minus 1 standard deviation of the mean? Do approximately four-fifths of the observations lie within plus or minus 1.28 standard deviations of the mean? Do approximately 19 out of every 20 observations lie within plus or minus 2 standard deviations of the mean?
  • 12. Estimation Confidence interval (C.I) • Once the sample has been drawn, it can be used to estimate characteristics of the underlying population • Because estimates vary from sample to sample, it is important to know how close the estimate derived from any one sample is likely to be to the underlying population value • One way is to construct a confidence interval around the estimate • C.I = a range of values surrounding the estimate which has a specified probability of including the true population value
  • 13. • The specified probability is called the “confidence level” and the end-points of the interval are called the “confidence limits” • The confidence level (C.L) is set in terms of a “Z” value: * 1 for 68.27% * 2 for 95.45% * 3 for 99.73% * 1.96 for 95% * Lower limit = x̄ – (C.L × S.E) (or) x̄ – (C.L × σ / √n) * Upper limit = x̄ + (C.L × S.E) (or) x̄ + (C.L × σ / √n)
  • 14. • The width of the confidence interval depends upon: * the confidence level specified * the sample size • The larger the C.L, the wider the C.I • The smaller the C.L, the narrower the C.I • The larger the (n), the narrower the C.I • The smaller the (n), the wider the C.I
  • 15. “t” Distribution • According to the central limit theorem, the sampling distribution of means is a Normal distribution when sample sizes are sufficiently large • When sample sizes are not sufficiently large, the sampling distribution of means differs from the Normal distribution • The t-distribution is like the Normal distribution, but it is somewhat more widely spread than the Normal curve [Figure: “Z” distribution and “t” distribution curves compared]
  • 16. • “Z” value for “x”: Z = (x - µ) / σ. The area within ±Z is 0.6827 if Z = 1, 0.9545 if Z = 2 and 0.9973 if Z = 3; these areas do not depend on sample size • “t” value for “x”: t = (x - x̄) / s. The area within t = 1 is a if n = A, b if n = B and c if n = C, that is, a different value for each different n • In practice, the t-distribution is used when the population standard deviation (σ) is not known to the researcher and the value estimated from the sample (s) is used in its place
  • 17. Uses of t-distribution • Estimation of population mean • Significance testing of sample to population • Significance testing of unpaired two samples • Significance testing of paired two samples Estimation of population mean • It is done by calculating the “C.I” • Lower limit = x̄ – {C.L (for respective n) × s / √n} • Upper limit = x̄ + {C.L (for respective n) × s / √n} • C.L for respective n is shown in the t-table
  • 19. Because I want to use Z scores to analyze sample means. But to use Z scores, the data must be normally distributed. That’s where the Central Limit Theorem steps in. Recall that the Central Limit Theorem states that sample means are normally distributed regardless of the shape of the underlying population if the sample size is sufficiently large.
  • 20. Recall from • Z = (X - µ) ÷ σ • If sample means are normally distributed, the Z score formula applied to sample means would be: • Z = [X-bar - µX-bar ] ÷ σ X-bar
  • 21. Background • To determine µX-bar, we would need to randomly draw out all possible samples of the given size from the population, compute the sample means, and average them. This task is unrealistic. Fortunately, µX-bar equals the population mean µ, which is easier to access. • Likewise, to compute the value of σX-bar, we would have to take all possible samples of a given size from a population, compute the sample means, and determine the standard deviation of the sample means. This task is also unrealistic. Fortunately, σX-bar can be computed by dividing the population standard deviation by the square root of the sample size.
  • 22. Note: As the sample size increases, the standard deviation of the sample means becomes smaller and smaller because the population standard deviation is being divided by larger and larger values of the square root of n.
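The two shortcuts above (µX-bar = µ and σX-bar = σ / √n) can be checked by simulation. The following is a minimal sketch, assuming a hypothetical right-skewed (exponential) population and arbitrary sizes that do not come from the slides; it draws many samples with replacement and compares the mean and standard deviation of the sample means with µ and σ / √n.

```python
import random
import statistics

def simulate_sample_means(population, n, num_samples, seed=0):
    """Draw num_samples samples of size n (with replacement) and return their means."""
    rng = random.Random(seed)
    return [statistics.fmean(rng.choices(population, k=n)) for _ in range(num_samples)]

# Hypothetical population: exponential, so clearly not normally distributed.
pop_rng = random.Random(42)
population = [pop_rng.expovariate(1.0) for _ in range(20_000)]
mu = statistics.fmean(population)
sigma = statistics.pstdev(population)

n = 40  # sample size, chosen above 30 as suggested in the slides
means = simulate_sample_means(population, n, num_samples=3_000)

print(f"population mean      : {mu:.3f}")
print(f"mean of sample means : {statistics.fmean(means):.3f}")
print(f"sigma / sqrt(n)      : {sigma / n ** 0.5:.3f}")
print(f"SD of sample means   : {statistics.stdev(means):.3f}")
```

With these settings the two pairs of printed values should be close to each other, even though the population itself is skewed.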
  • 23. The ultimate benefit of the central limit theorem is a useful version of the Z formula for sample means.
  • 24. Z Formula for Sample Means: Z = [X-bar - µ] ÷ [σ / √n]
  • 25. Example: The mean expenditure per customer at a tire store is $85.00, with a standard deviation of $9.00. If a random sample of 40 customers is taken, what is the probability that the sample average expenditure per customer for this sample will be $87.00 or more?
  • 26. Because the sample size is greater than 30, the central limit theorem says the sample means are approximately normally distributed. Z = [X-bar - µ] ÷ [σ / √n]; Z = [$87.00 - $85.00] ÷ [$9.00 / √40]; Z = $2.00 / $1.42 = 1.41
  • 27. For Z = 1.41 in the Z distribution table, the probability is .4207. This represents the probability of getting a mean between $87.00 and the population mean $85.00. Solving for the tail of the distribution yields .5000 - .4207 = .0793 • This is the probability of X-bar ≥ $87.00.
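The tire store calculation can be reproduced without a Z table. This is a small sketch using `statistics.NormalDist` from the Python standard library; the variable names are illustrative. Because the code does not round Z to 1.41 before the lookup, its tail probability is roughly .080, slightly above the table-based .0793.

```python
from statistics import NormalDist

mu, sigma, n = 85.00, 9.00, 40   # population mean, population SD, sample size
x_bar = 87.00                    # sample mean of interest

standard_error = sigma / n ** 0.5            # sigma / sqrt(n)
z = (x_bar - mu) / standard_error            # Z formula for sample means
tail_probability = 1 - NormalDist().cdf(z)   # P(X-bar >= 87)

print(f"standard error = {standard_error:.2f}")
print(f"Z              = {z:.2f}")
print(f"P(X-bar >= 87) = {tail_probability:.4f}")
```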
  • 28. Interpretations Therefore, 7.93% of the time, a random sample of 40 customers from this population will yield a mean expenditure of $87.00 or more. The probability describes the sample mean, not individual customers: it does not say that 7.93% of the customers in a given sample will spend $87.00 or more.
  • 30. Solve: Suppose that during any hour in a large department store, the average number of shoppers is 448, with a standard deviation of 21 shoppers. What is the probability that a random sample of 49 different shopping hours will yield a sample mean between 441 and 446 shoppers?
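One possible way to check an answer to this exercise is sketched below. It assumes the same approach as the tire store example (sampling distribution of the mean with standard error σ / √n); the function name is hypothetical.

```python
from statistics import NormalDist

def prob_sample_mean_between(low, high, mu, sigma, n):
    """P(low <= X-bar <= high) when X-bar is approximately Normal(mu, sigma / sqrt(n))."""
    standard_error = sigma / n ** 0.5
    sampling_distribution = NormalDist(mu, standard_error)
    return sampling_distribution.cdf(high) - sampling_distribution.cdf(low)

p = prob_sample_mean_between(441, 446, mu=448, sigma=21, n=49)
print(f"P(441 <= X-bar <= 446) = {p:.4f}")
```

Here the standard error is 21 / √49 = 3, so the two limits correspond to Z values of about -2.33 and -0.67.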
  • 33. Via sample data, we can estimate something about our population, such as its average value µ, by using the corresponding sample mean, X-bar.
  • 34. Recall that µ, the population mean to be estimated, is a parameter, while X-bar, the sample mean, is a statistic.
  • 35. Estimation • The objective of estimation is to determine the approximate value of a population parameter on the basis of a sample statistic. • There are two types of estimators: • Point Estimator • Interval Estimator
  • 36. Point Estimate A point estimate is a statistic taken from a sample and is used to estimate a population parameter. However, a point estimate is only as good as the sample it represents. If other random samples are taken from the population, the point estimates derived from those samples are likely to vary. Because of variation in sample statistics, estimating a population parameter with a confidence interval is often preferable to using a point estimate.
  • 37. Confidence Interval A confidence interval is a range of values within which it is estimated with some confidence the population parameter lies. Confidence intervals can be one- or two-tailed.
  • 38. Confidence Interval to Estimate µ • By rearranging the Z formula for sample means, a confidence interval formula is constructed: • X-bar +/- Zα/2 [σ / √n] • Where: • α = the area under the normal curve outside the confidence interval • α/2 = the area in one tail of the distribution outside the confidence interval
  • 39. The confidence interval formula yields a range (interval) within which we feel with some confidence the population mean is located. It is not certain that the population mean is in the interval unless we have a 100% confidence interval that is infinitely wide, so wide that it is meaningless.
  • 40. Confidence interval estimates for five different samples of n=25, taken from a population where µ=368 and σ=15
  • 41. Common levels of confidence intervals used by analysts are 90%, 95%, 98%, and 99%.
  • 42. 95% Confidence Interval • For 95% confidence, α = .05 and α / 2 = .025. The value of Z.025 is found by looking in the standard normal table under .5000 - .025 = .4750. This area in the table is associated with a Z value of 1.96. • An alternate method: multiply the confidence level, 95%, by ½ (since the distribution is symmetric and the intervals are equal on each side of the population mean). • (½) (95%) = .4750 (the area on each side of the mean) has a corresponding Z value of 1.96.
  • 43. In other words, of all the possible X-bar values along the horizontal axis of the normal distribution curve, 95% of them should be within a Z score of 1.96 from the mean.
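The table lookup described above can also be done numerically. This sketch uses `NormalDist().inv_cdf` from the Python standard library to find the Z value that leaves α/2 in the upper tail; the helper name `z_critical` is an assumption made for illustration.

```python
from statistics import NormalDist

def z_critical(confidence_level):
    """Z value leaving alpha/2 in each tail, where alpha = 1 - confidence level."""
    alpha = 1 - confidence_level
    return NormalDist().inv_cdf(1 - alpha / 2)

# The common confidence levels listed in the slides.
for level in (0.90, 0.95, 0.98, 0.99):
    print(f"{level:.0%} confidence: Z = {z_critical(level):.3f}")
```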
  • 44. Margin of Error = Z [σ / √n]
  • 45. Example: • A business analyst for a cellular telephone company takes a random sample of 85 bills for a recent month and from these bills computes a sample mean of 153 minutes. If the company uses the sample mean of 153 minutes as an estimate for the population mean, then the sample mean is being used as a POINT ESTIMATE. Past history and similar studies indicate that the population standard deviation is 46 minutes. • The value of Z is decided by the level of confidence desired. A confidence level of 95% has been selected.
  • 46. 153 +/- 1.96 (46 / √85) gives 143.22 ≤ µ ≤ 162.78 • The confidence interval is constructed from the point estimate, 153 minutes, and the margin of error of this estimate, +/- 9.78 minutes. • The resulting confidence interval is 143.22 ≤ µ ≤ 162.78. • The cellular telephone company business analyst is 95% confident that the average length of a call for the population is between 143.22 and 162.78 minutes.
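The interval above can be recomputed with a few lines of code. This is a sketch of the σ-known formula X-bar +/- Z [σ / √n]; the function name and return layout are illustrative choices, not part of the slides.

```python
from statistics import NormalDist

def z_confidence_interval(x_bar, sigma, n, confidence_level=0.95):
    """Return (lower limit, upper limit, margin of error) for the mean, sigma known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    margin = z * sigma / n ** 0.5
    return x_bar - margin, x_bar + margin, margin

# Cellular telephone example: 85 bills, sample mean 153 minutes, sigma = 46 minutes.
low, high, margin = z_confidence_interval(x_bar=153, sigma=46, n=85)
print(f"margin of error: {margin:.2f}")
print(f"{low:.2f} <= mu <= {high:.2f}")
```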
  • 47. Interpreting a Confidence Interval • For the previous 95% confidence interval, the following conclusions are valid: • I am 95% confident that the average length of a call for the population µ, lies between 143.22 and 162.78 minutes. • If I repeatedly obtained samples of size 85, then 95% of the resulting confidence intervals would contain µ and 5% would not. QUESTION: Does this confidence interval [143.22 to 162.78] contain µ? ANSWER: I don’t know. All I can say is that this procedure leads to an interval containing µ 95% of the time. • I am 95% confident that my estimate of µ [namely 153 minutes] is within 9.78 minutes of the actual value of µ. RECALL: 9.78 is the margin of error.
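The repeated-sampling reading of a 95% confidence interval can be illustrated by simulation. The sketch below assumes a hypothetical true mean of 150 minutes and a normally distributed population (neither is stated in the slides), builds many intervals from samples of size 85, and counts how often they contain µ.

```python
import random
from statistics import NormalDist, fmean

def coverage(mu, sigma, n, confidence_level, trials, seed=0):
    """Fraction of simulated confidence intervals that contain the true mean mu."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    margin = z * sigma / n ** 0.5
    hits = 0
    for _ in range(trials):
        x_bar = fmean(rng.gauss(mu, sigma) for _ in range(n))
        if x_bar - margin <= mu <= x_bar + margin:
            hits += 1
    return hits / trials

# mu=150 is a made-up "true" value used only for this illustration.
rate = coverage(mu=150, sigma=46, n=85, confidence_level=0.95, trials=2_000)
print(f"share of intervals containing mu: {rate:.3f}")
```

The printed share should land near 0.95; any single interval either contains µ or does not.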
  • 48. Confidence Interval Estimation of the Mean (σ Unknown) In reality, the actual standard deviation of the population, σ, is usually unknown. Therefore, we use “s” (sample standard deviation) to compute the confidence interval for the population mean, µ. However, by using “s” in place of σ, the standard normal Z distribution no longer applies. Fortunately, the t-distribution will work, provided the population from which we obtain the sample is normally distributed.
  • 49. Be Careful! The following statement is NOT true: “The probability that µ lies between 143.22 and 162.78 is .95.” Once you have inserted your sample results into the confidence interval formula, the word PROBABILITY can no longer be used to describe the resulting confidence interval.
  • 50. Assumptions necessary to use t-distribution • Assumes random variable x is normally distributed • However, if sample size is large enough (> 30), t-distribution can be used when σ is unknown. • But if sample size is small, evaluate the shape of the sample data using a histogram or stem-and-leaf. • As the sample size increases, the t-distribution approaches the Z distribution.
  • 51. Confidence Interval using a t-distribution X-bar +/- t α/2, n-1 [s / √n] where α = 1 - confidence level and n-1 = degrees of freedom
  • 52. Example: • As a consultant I have been employed to estimate the average amount of comp time accumulated per week for managers in the aerospace industry. • I randomly sample 18 managers and measure the amount of extra time they work during a specific week and obtain the following results (in hours). Use a 90% confidence level. • AEROSPACE DATA 6 21 17 20 7 0 8 16 29 3 8 12 11 9 21 25 15 16
  • 53. Solution: To construct a 90% confidence interval to estimate the average amount of extra time per week worked by a manager in the aerospace industry, I assume that comp time is normally distributed in the population. The sample size is 18, so df = 17. A 90% level of confidence results in an α / 2 = .05 area in each tail. The table t-value is t .05,17 = 1.740.
  • 54. With a sample mean of 13.56 hours, and a sample standard deviation of 7.8 hours, the confidence interval is computed: X-bar +/- t α/2, n-1 [s / √n] = 13.56 +/- 1.740 (7.8 / √18) = 13.56 +/- 3.20, giving 10.36 ≤ µ ≤ 16.76
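The aerospace figures can be recomputed directly from the data. In this sketch the t value 1.740 is taken from the slides as a table lookup, since the Python standard library has no t-distribution; everything else uses the `statistics` module. Working from unrounded values, the upper limit comes out a few thousandths below the slide's 16.76.

```python
import statistics

comp_time_hours = [6, 21, 17, 20, 7, 0, 8, 16, 29, 3, 8, 12, 11, 9, 21, 25, 15, 16]
T_05_17 = 1.740  # t value for alpha/2 = .05 and df = 17, read from a t-table

n = len(comp_time_hours)
x_bar = statistics.mean(comp_time_hours)
s = statistics.stdev(comp_time_hours)   # sample standard deviation (n - 1 divisor)
margin = T_05_17 * s / n ** 0.5
low, high = x_bar - margin, x_bar + margin

print(f"n = {n}, mean = {x_bar:.2f}, s = {s:.2f}")
print(f"margin of error = {margin:.2f}")
print(f"{low:.2f} <= mu <= {high:.2f}")
```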
  • 55. Interpretation: The point estimate for this problem is 13.56 hours, with an error of +/- 3.20 hours. I am 90% confident that the average amount of comp time accumulated by a manager per week in this industry is between 10.36 and 16.76 hours.
  • 56. Recommendations: From these figures, the aerospace industry could attempt to build a reward system for such extra work or evaluate the regular 40-hour week to determine how to use the normal work hours more effectively and thus reduce comp time.
  • 57. Solve: I own a large equipment rental company and I want to make a quick estimate of the average number of days a piece of ditch digging equipment is rented out per rental. The company has records of all rentals, but the amount of time required to conduct an audit of all accounts would be prohibitive. I decide to take a random sample of rental invoices. Fourteen different rentals of ditch diggers are selected randomly from the files. Use the following data to construct a 99% confidence interval to estimate the average number of days that a ditch digger is rented and assume that the number of days per rental is normally distributed in the population.
  • 58. Ditch Digger Data: 3 1 3 2 5 1 2 1 4 2 1 3 1 1
  • 60. Estimating the Population Proportion For most businesses, estimating market share (their proportion of the market) is important because many company decisions evolve from market share information: • What proportion of my customers pay late? • What proportion don’t pay at all? • What proportion of the produced goods are defective? • What proportion of the population has cats / dogs / horses / kids, exercises, or reads?
  • 61. Confidence Interval Estimate for the Proportion • ps +/- Z √[ps(1-ps) / n] • ps - Z √[ps(1-ps) / n] ≤ p ≤ ps + Z √[ps(1-ps) / n] • ps = sample proportion = X / n = number of successes ÷ sample size. This is the POINT ESTIMATE. • p = population proportion • Z = critical value from the standardized normal distribution • n = sample size
  • 62. ps +/- Z √[ps(1-ps) / n] NOTE: This formula can be applied only when np and n(1-p) are at least 5.
  • 63. Example: A study of 87 randomly selected companies with a telemarketing operation revealed that 39% of the sampled companies had used telemarketing to assist them in order processing. Using this information, how could a researcher estimate the population proportion of telemarketing companies that use their telemarketing operation to assist them in order processing?
  • 64. Solution: • The sample proportion ps = .39. • This is the point estimate of the population proportion, p. • The Z value for 95% confidence is 1.96. • The value of (1-ps) = 1 - .39 = .61.
  • 65. ps +/- Z √[ps(1-ps) / n]; ps - Z √[ps(1-ps) / n] ≤ p ≤ ps + Z √[ps(1-ps) / n] • The confidence interval estimate is: .39 – 1.96 √[(.39)(.61) / 87] ≤ p ≤ .39 + 1.96 √[(.39)(.61) / 87]; .39 - .10 ≤ p ≤ .39 + .10; .29 ≤ p ≤ .49
  • 66. Interpretation: We are 95% confident that the population proportion of telemarketing firms that use their operation to assist order processing is somewhere between .29 and .49. There is a point estimate of .39 with a margin of error of +/- .10.
  • 67. Solve: A clothing company produces men’s jeans. The jeans are made and sold with either a regular cut or a boot cut. In an effort to estimate the proportion of their men’s jeans market in Oklahoma City that is for boot-cut jeans, the analyst takes a random sample of 212 jeans sales from the company’s two Oklahoma City retail outlets. Only 34 of the sales were for boot-cut jeans. Construct a 90% confidence interval to estimate the proportion of the population in Oklahoma City who prefer boot-cut jeans.
  • 68. Solution: ps = 34/212 = .16. A point estimate for boot-cut jeans is .16 or 16%. The Z value for 90% level of confidence is 1.645. The confidence interval estimate is: ps - Z √[ps(1-ps) / n] ≤ p ≤ ps + Z √[ps(1-ps) / n]; .16 – 1.645 √[(.16)(.84) / 212] ≤ p ≤ .16 + 1.645 √[(.16)(.84) / 212]; .16 - .04 ≤ p ≤ .16 + .04; .12 ≤ p ≤ .20. We are 90% confident that the proportion of boot-cut jeans is between 12% and 20%.
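Both proportion examples (telemarketing and boot-cut jeans) follow the same recipe, which can be sketched as one function. The function name and the choice to raise an error when the "at least 5" condition fails are assumptions made for illustration; the condition is checked with the sample proportion because p itself is unknown.

```python
from statistics import NormalDist

def proportion_confidence_interval(p_s, n, confidence_level):
    """Normal-approximation interval for a population proportion from sample proportion p_s."""
    if n * p_s < 5 or n * (1 - p_s) < 5:
        raise ValueError("n*p and n*(1-p) should both be at least 5")
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    margin = z * (p_s * (1 - p_s) / n) ** 0.5
    return p_s - margin, p_s + margin

# Telemarketing example: sample proportion .39, n = 87, 95% confidence.
tele_low, tele_high = proportion_confidence_interval(0.39, 87, 0.95)
# Jeans example: 34 boot-cut sales out of 212, 90% confidence.
jeans_low, jeans_high = proportion_confidence_interval(34 / 212, 212, 0.90)

print(f"telemarketing: {tele_low:.2f} <= p <= {tele_high:.2f}")
print(f"boot-cut jeans: {jeans_low:.2f} <= p <= {jeans_high:.2f}")
```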
  • 69. Estimating Sample Size The amount of sampling error you are willing to accept and the level of confidence desired determine the size of your sample.
  • 70. Sample size when Estimating µ: n = Z² σ² / e², obtained by solving e = Z (σ / √n) for n
  • 71. To determine sample size: • Know the desired confidence level, which determines the value of Z (the critical value from the standardized normal distribution). Determining the confidence level is subjective. • Know the acceptable sampling error, e, the amount of error that can be tolerated. • Know the standard deviation, σ. If unknown, estimate by: • past data • educated guess • estimate σ: [σ = range/4]. This estimate is derived from the empirical rule stating that approximately 95% of the values in a normal distribution are within +/- 2σ of the mean, giving a range within which most of the values are located.
  • 72. Example: Suppose the marketing manager wishes to estimate the population mean annual usage of home heating oil to within +/- 50 gallons of the true value, and he wants to be 95% confident of correctly estimating the true mean. On the basis of a study taken the previous year, he believes that the standard deviation can be estimated as 325 gallons. Find the sample size needed.
  • 73. Solution: • With e = 50, σ = 325, and 95% confidence (Z = 1.96) • n = Z² σ² / e² = (1.96)² (325)² / (50)² • n = 162.31 • Therefore, n = 163. As a general rule for determining sample size, always round up to the next integer value in order to slightly over-satisfy the criteria desired.
  • 74. Solve: Suppose you want to estimate the average age of all Boeing 727 airplanes now in active domestic U.S. service. You want to be 95% confident, and you want your estimate to be within 2 years of the actual figure. The 727 was first placed in service about 30 years ago, but you believe that no active 727s in the U.S. domestic fleet are more than 25 years old. How large a sample should you take?
  • 75. Solution: With e = 2 years and the Z value for 95% = 1.96, σ is unknown and must be estimated by using σ ≈ range ÷ 4. As the range of ages is 0 to 25 years, σ = 25 ÷ 4 = 6.25.
  • 76. n = Z² σ² / e² = (1.96)² (6.25)² / (2)² = 37.52 airplanes. Because you cannot sample 37.52 units, the required sample size is 38. If you randomly sample 38 planes, you can estimate the average age of active 727s within 2 years and be 95% confident of the results.
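  • 77. Solve: Determine the sample size necessary to estimate µ when values range from 80 to 500, error is to be within 10, and the confidence level is 90%. n = Z² σ² / e². Answer: with σ ≈ (500 - 80) ÷ 4 = 105 and Z = 1.645, n = (1.645)² (105)² / (10)² = 298.3, so the required sample size is 299.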
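The sample-size rule n = Z² σ² / e², with rounding up, can be sketched as a small helper and applied to the three problems above. The function name is hypothetical, and σ for the last two calls is estimated as range ÷ 4, as the slides suggest.

```python
import math
from statistics import NormalDist

def sample_size_for_mean(sigma, error, confidence_level):
    """Smallest n for which Z * sigma / sqrt(n) does not exceed the acceptable error."""
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    return math.ceil((z * sigma / error) ** 2)   # always round up

heating_oil = sample_size_for_mean(sigma=325, error=50, confidence_level=0.95)
boeing_727 = sample_size_for_mean(sigma=25 / 4, error=2, confidence_level=0.95)
range_80_500 = sample_size_for_mean(sigma=(500 - 80) / 4, error=10, confidence_level=0.90)

print(heating_oil, boeing_727, range_80_500)
```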
  • 78. Determining sample size for proportion: n = Z² p(1-p) / e² • p = population proportion (if unknown, analysts use .5 as an estimate of p in the formula) • e = error of estimation, equal to (ps – p), the difference between the sample proportion and the parameter to be estimated, p. Represents the amount of error one is willing to tolerate.
  • 79. Solve: The Packer, a produce industry trade publication, wants to survey Americans and ask whether they are eating more fresh fruits and vegetables than they did 1 year ago. The organization wants to be 90% confident in its results and maintain an error within .05. How large a sample should it take?
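One way to work this exercise is sketched below, assuming p = .5 because no prior estimate of the proportion is given (the slides name .5 as the usual default). The function name is illustrative.

```python
import math
from statistics import NormalDist

def sample_size_for_proportion(error, confidence_level, p=0.5):
    """n = Z^2 * p * (1 - p) / e^2, rounded up to the next integer."""
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    return math.ceil(z ** 2 * p * (1 - p) / error ** 2)

n_packer = sample_size_for_proportion(error=0.05, confidence_level=0.90)
print(f"required sample size: {n_packer}")
```

Using p = .5 gives the largest possible value of p(1-p), so the resulting n is a conservative choice.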