The document discusses standard error and standard deviation. It defines parameters as describing populations using Greek letters, while statistics describe samples using non-Greek letters. Standard error estimates the standard deviation of a statistic and is used to calculate confidence intervals and margin of error. It relies on sample statistics rather than population parameters. The central limit theorem states that for a large enough sample size from any population, the sampling distribution of the sample mean will be approximately normally distributed.
2. Parameters and Statistics
• A parameter is a number that describes the population. A parameter
always exists, but in practice we rarely know its value because of the
difficulty of conducting a census. Parameters always use Greek letters to
describe them. For instance, μ represents the mean of a
population and σ represents the standard deviation of the population. If
we are talking about a percentage parameter, we use the Greek letter
ρ (rho).
• Example: If we wanted to compare the IQs of all American and Asian
males, it would be impossible to test every one of them.
But it is important to realize that µ American male and µ Asian male exist.
• Example: If we were interested in whether a greater percentage of
women than men eat broccoli, we would want to know whether
ρ women > ρ men.
4. Sampling Distribution
The sampling distribution of a statistic is the distribution of values taken by the
statistic in all possible samples of the same size from the population. When we
sample, we sample with replacement, meaning that the same value can be drawn
more than once. A sampling distribution plays the role of a sample space: it
describes everything that can happen when we sample.
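The definition above can be made concrete with a very small population, where every possible sample can be listed. The sketch below is only an illustration: the population values and the sample size are made up, and the names population, samples and sample_means are not from the slides.

```python
from itertools import product
from statistics import mean, pstdev

# Hypothetical tiny population (values chosen only for illustration).
population = [2, 4, 6, 8]
n = 2  # sample size

# Sampling with replacement: every ordered n-tuple of population
# values is one possible sample.
samples = list(product(population, repeat=n))

# The sampling distribution of the mean: one sample mean per possible sample.
sample_means = [mean(s) for s in samples]

print(len(samples))                    # how many samples are possible
print(mean(sample_means))              # centre of the sampling distribution
print(mean(population))                # population mean, for comparison
print(pstdev(sample_means))            # spread of the sampling distribution
print(pstdev(population) / n ** 0.5)   # sigma / sqrt(n), for comparison
```

Because all possible samples are enumerated, the list sample_means is the complete sampling distribution of the mean for this toy population, not an approximation to it.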
5. Standard Error
• Estimate of the standard deviation of a statistic
• Used to compute other measures like confidence intervals and margin of error
• Parameter associated with population (uses Greek symbols)
• Statistic associated with sample of population (always uses non-Greek symbols)
Notation
Population Parameter                        Sample Statistic
N: # of observations in population          n: # of observations in sample
P: proportion of successes in population    p: proportion of successes in sample
μ: population mean                          x̄: sample estimate of population mean
σ: population standard deviation            s: sample estimate of σ
σp: standard deviation of proportion        SEp: standard error of proportion
σx̄: standard deviation of x̄                 SEx̄: standard error of x̄
6. Standard Deviation of Sample Estimates
• Use sample statistics to estimate population parameters
• The variability of a statistic is measured by its standard deviation
• Formulas below are valid when the population size is at least 10 times larger
than the sample size
Statistic                                   Standard Deviation
Sample mean (x̄)                             (found under single-sample mean in table)
Sample proportion (p)                       (found under single-sample proportion in table)
Difference between means (x̄1 − x̄2)          (found under two-sample means in table)
Difference between proportions (p1 − p2)    (found under two-sample proportions in table)
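The slide points to a formula table that is not reproduced here. For reference, the usual textbook forms of these four standard deviations are sketched below, written in the notation defined earlier (P, σ, n, with subscripts 1 and 2 for the two samples). It is an assumption that the table referred to uses these same forms; they hold for independent samples under the conditions stated in the bullets above.

```latex
\begin{align*}
\sigma_{\bar{x}} &= \frac{\sigma}{\sqrt{n}} \\
\sigma_{p} &= \sqrt{\frac{P(1-P)}{n}} \\
\sigma_{\bar{x}_1 - \bar{x}_2} &= \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \\
\sigma_{p_1 - p_2} &= \sqrt{\frac{P_1(1-P_1)}{n_1} + \frac{P_2(1-P_2)}{n_2}}
\end{align*}
```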
7. Standard Error of Sample Estimates
• When the population parameters are unknown, you cannot compute the standard deviation of
the statistic, so you must compute the standard error instead (this is usually the case)
• Standard Error (SE) relies on sample statistics and provides an estimate of the
standard deviation of the statistic
• Use the table below only if the sample is a simple random sample and the population
size is at least 10 times larger than the sample size
• NOTE: the equations are identical to those for the standard deviation, except that the standard error uses sample statistics (p, s)
where the standard deviation uses population parameters (P, σ)
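Statistic                                   Standard Error
Sample mean (x̄)                             SEx̄ = s / sqrt(n)
Sample proportion (p)                       SEp = sqrt[ p(1 − p) / n ]
Difference between means (x̄1 − x̄2)          SE = sqrt[ s1²/n1 + s2²/n2 ]
Difference between proportions (p1 − p2)    SE = sqrt[ p1(1 − p1)/n1 + p2(1 − p2)/n2 ]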
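As a rough illustration of the note above, the four standard errors can be computed directly from sample statistics. This is a minimal sketch using the usual textbook formulas (the standard deviation formulas with s and p in place of σ and P). The function names and the example numbers are hypothetical, and the 1.96 multiplier is the normal critical value commonly used for an approximate 95% margin of error.

```python
from math import sqrt

def se_mean(s, n):
    """Standard error of a sample mean: s / sqrt(n)."""
    return s / sqrt(n)

def se_proportion(p, n):
    """Standard error of a sample proportion: sqrt(p(1 - p) / n)."""
    return sqrt(p * (1 - p) / n)

def se_diff_means(s1, n1, s2, n2):
    """Standard error of the difference between two independent sample means."""
    return sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)

def se_diff_proportions(p1, n1, p2, n2):
    """Standard error of the difference between two independent sample proportions."""
    return sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

# Hypothetical sample: n = 100 observations with sample standard deviation s = 15.
se = se_mean(s=15.0, n=100)
margin_of_error = 1.96 * se  # approximate 95% margin of error (assumed critical value)

print(se)
print(margin_of_error)
```

An approximate 95% confidence interval for the mean would then be the sample mean plus or minus margin_of_error, which is how the standard error feeds into the measures listed on the Standard Error slide.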
10. What the CLT says in Words:
a) You start with some population with some mean μ and standard deviation σ. You may know the mean and
standard deviation, but most likely you do not. The distribution may be normal, but it does not have to
be.
Example: A pizza shop sells slices of pizza for $1.75 and sodas for $0.75. People come in for lunch and pay
various amounts. Some people just buy a soda. Some buy only a slice, others buy a slice and a soda. Some get
2 slices – others buy lunch for a friend and thus spend more. We have no idea what the distribution of prices
looks like.
b) Decide on a sample size and call it n. Start taking samples. Find the mean of each sample.
Example: Suppose n = 10. Take a random sample of 10 people at the pizza place, calculate their bills, and find
the average and standard deviation.
c) Now take a lot of samples of size n and find the average of the averages you just found.
Example: Let’s take 500 of these samples of 10 and find the average of the average bill.
d) The CLT says three things:
1) the mean of the sample means will be the same as the mean of the population, μ (what we want to find).
2) the standard deviation of the sample means will be the population standard deviation divided by the square root of n, that is, σ/√n.
3) the histogram of the sample means will appear approximately normal (bell shaped), provided n is large enough. The larger the
sample size (n), the smaller the standard deviation of the sample means will be and the narrower the graph will be.
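Steps a) through d) can be tried out with a short simulation. The sketch below is only an illustration under stated assumptions: the mix of bills in the population is invented (the slides say the real distribution is unknown), and only the $1.75 and $0.75 prices, n = 10 and the 500 samples come from the example.

```python
import random
from math import sqrt
from statistics import mean, pstdev, stdev

random.seed(0)  # fixed seed so the run is repeatable

# a) Hypothetical population of lunch bills (the mix is an assumption):
#    soda only, slice only, slice + soda, two slices, lunch for two.
population = ([0.75] * 20 + [1.75] * 30 + [2.50] * 35
              + [3.50] * 10 + [5.00] * 5)
mu = mean(population)
sigma = pstdev(population)

# b) Choose a sample size.
n = 10

# c) Take many samples of size n (with replacement) and record each mean.
num_samples = 500
sample_means = [mean(random.choices(population, k=n))
                for _ in range(num_samples)]

# d) Compare the simulation with the three claims.
print("population mean:          ", mu)
print("mean of sample means:     ", mean(sample_means))
print("sigma / sqrt(n):          ", sigma / sqrt(n))
print("std dev of sample means:  ", stdev(sample_means))
```

With a simulation the agreement is approximate, not exact. Plotting a histogram of sample_means, and repeating with a larger n, would show the bell shape and the narrowing described in point 3.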