SlideShare uma empresa Scribd logo
1 de 42
Lecture Series on
  Biostatistics




      Inferential Statistics-
           Estimation
                           By
           Dr. Bijaya Bhusan Nanda,
       M. Sc (Gold Medalist) Ph. D. (Stat.)
    Topper Orissa Statistics & Economics Services, 1988
            bijayabnanda@yahoo.com
CONTENTS
 Introduction
 Confidential interval for a population mean
 The t distribution
 Confidence interval for the difference
  between two population mean
 Confidence interval for a population
  proportion
 Confidence interval for the difference
  between two population proportion
Introduction
Statistical Inference
 It is the procedure by which we reach a conclusion
  about a population on the basis of information
  contained in the sample drawn from that population.
 Two broad areas of statistical inference
    Estimation
    Hypothesis testing.
 Estimation- It is the process of calculating some
  statistics based upon the sample data drawn from a
  certain population. The statistics is used as an
  approximation to the population parameter.
Types of Estimate
 For each of parameters, we can have
   Point Estimate
   Interval estimate
 A point estimate is a single numerical value used to
 estimate the corresponding population parameter.
An interval estimate consists of two numerical values
 defining a range of values that, with a specified degree
 of confidence, includes the parameter being estimated.
Choosing an Appropriate Estimation
 A single computed value is an estimate.
 An estimator usually presented as a formula. For
  example
   x=
        ∑ xi
         n   is the estimator of population mean
 The single numerical value that results from
  evaluating this formula is called an estimate of the
  parameter µ.
 E(T) is obtained by taking the average value of T
  computed from all possible sample i.e. E(T)= µT
Criteria of a good estimator:
There can be more than one estimator for the parameter.
 For example population mean can be estimated by the
 sample median or sample mean.
But a good estimator should fulfill certain criteria. One such
 criterion of a good estimator is unbiasedness.
An estimator T of the parameter θ is said to be an unbiased
 estimator of θ if E(T)= θ
 Sample mean is an unbiased estimator of population mean.
 Sample proportion is an unbiased estimate of population
 proportion.
 The difference between two sample means is an unbiased
 estimate of difference between the population mean.
The difference between two sample proportions is an
 unbiased estimate of difference between the population
 proportion.
Sampled Population and Target Population
 Sampled population is the population from which one
  actually draws a sample.
 The target population is the population about which
  one wishes to make inference.
 Statistical inference procedure allows one to make
  inference about sampled population (provided proper
  sampling methods have been employed)
 Only when sampled population and the target
  population are the same, it is possible to reach
  statistical inference about the target population.
Random and Non random Samples
 The strict validity of the statistical procedures
  discussed depends on the assumptions of random
  sample.
 In real world applications it is impossible or
  impractical to use truly random samples.
 If the researchers had to depend on randomly selected
  materials, very little research of this type would be
  conducted.
 Therefore, non statistical considerations must play a
  past in the generalizations process.
 Researchers may contend that samples actually used
  are equivalent to simple random samples, since there is
  no reasons to believe that material actually used is not
  representative of the population about which inferences
  are desired.
 In many health research projects, samples of
  convenience, rather than random samples are
  employed.
 Generalization must be made on non statistical
  consideration.
 Consequences of such generalization, however may
  range from misleading to disastrous.
 In some situation it is possible to introduce
  randomization into experiment even though available
  subjects are not randomly selected from well defined
  population.
 Example: Allocating subjects to treatments in a random
  manner.
Confidential interval for a population mean
 Say x the sample mean from a random sample of size n
      is
  drawn from a normally distributed population.
 x is an unbiased estimator of the population mean
      ()
     E x =µ
 But because of sampling fluctuation x can’t be expected
  to be equal to µ.
 Therefore, we should have an interval estimate µ.
 For the interval estimate we must have knowledge of
  sampling distribution of x .
 Sampling distribution of x for a normal population is as
  follows

         x ≈ N ( µ , σ )i.e.N ( µ ,σ            )
                                            n
                     x   x
 Following the normal probability distribution the
 interval estimate for µ is as follows
 95% confidence interval for µ = µ ± 2σ          x
 since we don’t know µ with the help of sample mean   x   we
 have the 95% confidence interval for µ as
              x ± 2σx = x ± 2 σ             n
  If we don’t know σ the 95% confidence interval for
  µ = x ± 2 s / n where s= the standard deviation based
   on the sample.
Interval estimate Component
 In general an interval estimate may be expressed as
  follows
  estimator ± (reliability coefficient)× (standard error)
 In particular when sampling is from a normal
  distribution with known variance, an interval
  estimator for µ may be expressed as
                  x±z                 σ
                           (1 − α / 2 )   x
  where z(1-α /2) is the value of z to the left of which lies
  1- α /2 and to the right of which lies α /2 of the area
  under its curve.
Interpreting Confidence Interval:
 Probabilistic Interpretation: In repeated sampling,
  from a normally distributed population with a known
  standard deviation, 100(1- α ) percent of all intervals of
  the form   x±z               σ
                     (1 − α / 2 ) x will in the long run
  include the population mean µ .
 Practical Interpretation: When sampling is from a
  normally distributed population with known standard
  deviation, we are 100(1- α ) % confident that the single
  computed interval,
  population mean µ .
                         x±z                σ
                                    (1 − α / 2 ) x
                                                   , contains the

 Precision: The quantity obtained by multiplying the
  reliability factor by the standard error of the mean is
  called the precision of the estimate. This quantity is also
  called the margin of error.
Example
       A physical therapist wished to estimate, with 99 percent
  confidence , the mean maximum strength of a particular
  muscle in a certain group of individuals. He is willing to
  assume that strength scores are approximately normally
  distributed with a variance of 144. A sample of 15 subjects
  who participated in the experiment yielded a mean of 84.3.
Solution:
  The z value corresponding to a confidence coefficient of 0.99
  is found in Table D to be 2.58. This is our reliability
  coefficient. The standard error is σx = 12          = 3.0984
    .Our 99%                                      15
                                    84.3 ± 2.58(3.0984)
  Confidence interval for µ, then
                                    84.3 ± 8.0
                                    76.3,92.3
We say we are 99% confidence that the
 population mean is between 76.3 and 92.3
 since, in repeated sampling, 99% of all
 intervals that could be constructed in the
 manner just described would include the
 population mean.
The t distribution
 Estimation of confidence interval using standard normal z
  distribution applies when population variance is known.
  Usually, the knowledge of population variance is not
  known.
 This condition presents a problem to construct confidence
  intervals.
                             x −µ
  Although the statistic z = σ       is normally distributed
                                n
  or approximately normally distributed when n is large,
  regardless of the functional form of the population, we
  can’t make use of this fact because σ is not known.
 In this case we may use of
                               s=   ∑ ( xi − x )
                                                   2
                                                       ( n − 1)
  as an estimator of σ. When sample size is large, say,
  greater than 30, s is a good approximate of σ is
  substantial and we may use normal distribution theory
  to construct confidence interval.
 When the sample size is small, ≤30, we make use of
  Student’s t distribution.

                  x−µ
               t=
The quantity      s       follows this distribution.
                    n
Properties of the t distribution:
    It has a mean of 0.
    Symmetrical about the mean.
   In general, it has a variance greater than 1, but the
    variance approaches 1 as the sample size becomes
    large. For df >2, the variance for t distribution is
    df /(df-2).
    The variable t ranges from -∞ to + ∞ .
   The t distribution is really a family of distributions,
    since there is a different distribution for each sample
    value of n-1, the divisor used in computing s2. We recall
    that n-1 is referred to as degrees of freedom.
   Compared to normal distribution is less peaked at
    center and has higher tails t distribution approaches
    the normal distribution an n-1 approaches infinity.
   The t distribution approaches the normal distribution
    as n-1 approaches infinity.
The t distribution for different degrees of freedom
                           Degrees of freedom=30
                           Degrees of freedom=5
                           Degrees of freedom=2




Comparison of normal distribution and t distribution

                            Normal Distribution


                            t distribution
Confidence Intervals Using t:
 estimator ± (reliability coefficient) (standard error)
                                                     error
The source of reliability co-efficient is ‘t’ distribution
  rather than standard normal z distribution.
 When sampling is from a normal distribution whose
  standard deviation , σ , is unknown the 100(1-α )%
  confidence interval for the population mean µ , is
  given by
                                         s
                    x ± t (1 − α / 2 )
                                          n
   A moderate departures of the sampled population
    from the normally can be tolerated in using the ‘t’
    distribution. An assumption of mound shaped
    population distribution be tenable.
Example:
  Maureen McCauley conducted a study to evaluate the effect
  of on the job body mechanics instruction on the work
  performance of newly employed young workers (A-1). She
  used two randomly selected groups of subjects, an
  experimental group and a control group. The experimental
  group received one hour of back school training provided on
  occupational therapist. The control group did not receive this
  training. A criterion referenced Body Mechanics Evaluation
  Checklist was used to evaluate each workers lifting, lowering,
  pulling, and transferring of objects in the work environment.
  A correctly performed task received a score of 1. The 15
  control subjects made a mean score of 11.53 on the evaluation
  with a standard deviation of 3.681. We assume that these 15
  controls behave as a random sample from a population of
  similar subjects. We wish to use these sample data to estimate
  the mean score fro the population.
Solution:
  We may assume sample mean, 11.53, as a point estimate
  of population mean but since the population standard
  deviation is unknown, we must assume the population of
  values to be at least approximately normally distributed
  before constructing a confidence interval for µ .
  Let us assume an assumption is reasonable and that a
  95% confidence interval is desired. Our estimator x is
   and our standard error is

             s   n = 3.681         = 0.9564
                              15
  We need to find reliability coefficient, the value of t
  associated with a confidence coefficient of 0.95 and n-
  1=14 degrees of freedom.
Confidence interval is equally divided in to two tails
i.e.0.025 .In Table E the tabulated value of t is 2.1448, which
is our reliability coefficient.
Then our 95% confidence interval is as follows
                  11.53 ± 2.1448(0.9504)
                  11.53 ± 2.04
                  9.49,13.57
Deciding Between z and t:
               To make an appropriate choice between z and
we must consider whether the sampled population is normally
distributed, and whether the population variance is known.
Flowchart for use in deciding between z and t when
              making inferences about population means
                                      Normal
                Y                    Population                  N


               Large                                          Large
      Y       Sample         N                      Y        Sample         N




     Populn            Y    Populn     N     Y     Populn    N       Y     Populn   N
Y               N
      var                    var                    var                     var
    known?
                           known?                 known?                  known?


z                t     z               t      z              z        *             *

                z                          Central limit theorem applies

               * = use nonparametric Procedure
Confidence interval for the difference between two
                  population means
Normal Population
 Say µ -µ = the difference between two normal population
       1  2
  means.

   x −x
     1      2 = the difference between two independent
  sample means drawn from the two populations respectively.
            E ( x 1 − x 2) = µ1 − µ 2
                             σ1 σ 2
                              2
            V ( x 1 − x 2) = + 2
                             n1 n 2
  Where σ12 & σ22 are the variance of the population 1&2 and n1,
  n2 are the size of the SRS drawn from population 1&2
  respectively.
 When the population variances are known, the 100(1-
  α)% confidence interval for µ1& µ2 is given by

                                        σ           σ
                                            2           2

          (x − x ) ± z                          +
                                            1           2
             1     2     (1 − α / 2 )

                                        n   1       n   2



 When the confidence interval includes zero, we say
  that the population means may be equal when it
  doesn’t includes zero we say that the interval
  provides evidence that the two population means are
  not equal.
Example:
  A research team is interested in the difference between
  serum uric acid levels in patients with and without Down’s
  syndrome. In a large hospital for the treatment of the
  mentally retarded, a sample of 12 individuals with Down’s
  syndrome yielded a mean of 4.5mg/100ml. In a general
  hospital a sample of 15 normal individuals of the same age
  and sex were found to have a mean value of 3.4. If it is
  reasonable to assume that the two populations of values are
  normally distributed with variances equal to 1 and 1.5, find
  the 95% confidence interval for µ1- µ2.
Solution:
  For a point estimate of µ1- µ2, we use
               x 1 − x 2 = 4.5 − 3.4 = 1.1
The reliability coefficient corresponding to 0.95 is found in
Table D to be 1.96. The standard error
                 σ
                     2
                      σ     1 1 .5
                              2
σ ( x 1 − x 2) =    +1
                         =   +2
                                   = 0.4282
                 n1   n2   12 15
The 95 percent confidence interval, then, is
                  1.1 ±1.96(0.4282)
                  1.1 ± 0.84
                  0.26,1.94
The difference µ1- µ2, is some where between 0.26 and 1.94,
because, in repeated sampling, 95% of the intervals
constructed in this manner would include the difference
between the true means. Since the interval does not include
zero, we conclude that the two population means are not
equal.
Sampling from Nonnormal Population:
  If sample sizes n1 and n2 are large, we may construct the
  confidence interval the same way as we do incase of
  normal population in accordance with the Central Limit
  Theorem.
Example:
  Motivated by an awareness of the existence of a body of
  controversial literature suggesting that stress, anxiety, and
  depression are harmful to the immune system, Gorman et al.
  (A-5) conducted a study in which the subjects were homosexual
  men and some of whom were HIV positive and some of whom
  were HIV negative. Data were collected on a wide variety of
  medical, immunological, psychiatric, and neurological measures,
  one of which was the number of CD4+ cells in the blood. The
  mean number of CD4+ cells for the 112 men with HIV infection
  was 401.8 with a standard deviation of 226.4. For the 75 men
  without HIV infection the mean and standard deviation were
  828.2 and 274.9, respectively. We wish to construct a 99%
  confidence interval for the difference between population
  means.
Solution:
  Here we are using z statistic as the reliability factor in the
  construction of our confidence interval. Since the population
  standard deviation are not given, we will use the sample standard
  deviations to estimate them. The point estimate for the difference
  between population means is the difference between sample
  means, 882.2-401.8=426.4. In Table D, we find the reliability
  factor to be 2.58. The estimated standard error is

                            2               2
                   274.9            226.4
   sx − x =                     +               = 38.279
                      75            112
      1     2
Our 99% confidence interval for the difference between
 population means is
                 426.4 ± 2.58(38.279)
                 327.6,525.2
We are 99% confident that the mean number of CD4+
 cells in HIV +ve males differs from the mean for HIV
 –ve males by some where between 327.6 and 525.2.
Confidence interval using ‘t’ distribution:
 Say the variances of the two sampled population are not
   known.
 The two sampled population are normally distributed.
 The difference between the two population means can be
   estimated through confidence interval using ‘t’ distribution
   as the source of reliability factor.
Situation-I-Equal population variances:
   The 100(1-α)% confidence interval for µ1- µ2 is given by
                                                2           2
                                            s           s
            ( x 1 − x 2) ± t (1 − α / 2 )       p
                                                    +       p

                                            n1          n2
If the two sample sizes are unequal , the weighted average
   takes advantages of the additional information provided by
   the larger sample. The pooled estimate is given by
( n1 −1) s + ( n 2 −1) s
                                    2                       2

           s
               2
               p   =                1                       2

                                n1 + n 2 − 2
The standard error of the estimate , then, is given by

                                            2           2
                                        s           s
               Sx    1   − x2   =           p
                                                +       p

                                        n1          n2
The ‘t’ follows a Student’s t distribution with n1+n2-2 d.f.
Example:
       The purpose of a study by Stone et al.(A-6) was to
determine the effects of long-term exercise intervention on
corporate executives enrolled in a supervised fitness program.
Data were collected on 13 subjects (the exercise group) who
  voluntarily entered a supervised exercise program and remained
  active for an average of 13 years and 17 subjects (the secondary
  group) who elected not to join the fitness program. Among
  the data collected on the subjects was maximum number of
  sit-ups completed in 30 seconds. The exercise group had a
  mean and standard deviation for this variable of 21.0 and
  4.9 respectively. The mean and standard deviation for the
  secondary group were 12.1 and 5.6 respectively. We assume
  that the two population of overall muscle condition
  measures are approximately normally distributed and that
  the two population variances are equal. We wish to
  construct a 95% confidence interval for the difference
  between the means of the populations represented by these
  two samples.
Solution:
       The 95% confident that the difference between
  population means is somewhere between 4.9, 12.9 .
Situation-II-Unequal population variances:
   An approximate 100(1-α)% confidence interval for µ1- µ2 is
   given by                              2
                                       s1 s 2
                                                2
                 x 1 − x 2 ± t ′(1 − α / 2 )        +
                                               n1       n2
The solution proposed by Cochran consists of computing the
  reliability factor t ′  =
                            w1t1 + w2 t 2
                        (1−α 2)
                                   w1 + w2
Where w1=s12/n1, w2=s22/n2, t1=t1-α/2 for n1-1 degrees of freedom,
  and t2=t1- α/2 for n2-1 degrees of freedom.
Example:
  in the study by Stone et al.(A-6)described in previous
  example , the investigator also reported the following
  information on a measure of overall muscle condition
  scores made by the subjects:
Sample      n    Mean    S.D
    Exercise    13   4.5     0.3
    group
    Sedentary   17   3.7     1.0
    group


We assume that the two populations of overall muscle
condition scores are approximately normally distributed. We
are unwilling to assume, however, that the two population
variances are equal. We wish to construct a 95% confidence
interval for the difference between the mean overall muscle
condition scores of the two populations represented by the
samples.
Solution:
The 95% confidence interval for the difference between the
two population means when they are not equal is 0.25, 1.34
Flowchart in deciding whether the reliability factor should be z, t, or t’
when making inferences about the difference between two population
means                        Normal
                                Y                    Population                       N

             Y
                        Large                N                                         Large
                       Sample                                                Y                               N
                                                                                      Sample



          Populn                         Populn                          Populn                          Populn
    Y                 N             Y                  N            Y                 N             Y                 N
           var                             var                             var                             var
         known?                          known?                          known?                          known?



    =?                =?            =?                =?            =?                =?            =?                =?




Y        N       Y         N    Y        N       Y         N    Y        N       Y         N    Y        N        Y        N


z        z       t         t’   z        z       t         t’   z        z        z        t’   *        *        *        *
                 or        or       * = Nonparametric test to be applied
                  z         z
Confidence interval for a population proportion
    Population proportion in many situations is a matter of
    great interest to the health professional.
    What proportion of people suffers from different diseases,
    disability etc.? what proportion of patience response to a
    particular treatment? These are certain important questions
    for the health purpose.
    We estimate the population proportion as in the case of
    population mean.
    A random sample is drawn from the population of interest
    and the sample proportion p provides an unbiased estimate
                                  ˆ
    of the population proportion P.
   A confidence interval is obtained by the general formula
    estimator ± (reliability coefficient) × (standard error)
    when both np and n(1-p) are greater than 5, the sampling
    distribution of p is quite close to the normal distribution.
                     ˆ
   When above condition is mate100(1-α) % confidence
    interval for p is given by

                     p ± z (1 − α / 2 )
                     ˆ                    p(1 − p )
                                          ˆ     ˆ
Example:
                                                      n
  Mathers et al.(A-12) found that in a sample of 591 patients
  admitted to a psychiatric hospital, 204 admitted to using
  cannabis at least once in their lifetime. We wish to
  construct a 95% confidence interval for the proportion of
  lifetime cannabis users in the sampled population of
  psychiatric hospital admission.
Solution:
The 95% confident that the population proportion p is
  between 0.3069, 0.3835
Confidence interval for the difference between two
             population proportion
   An unbiased point estimator of the difference between two
   population proportions (p1-p2) is provided by the difference
                                 ˆ ˆ
   between sample proportion p1 − p 2 .
  When n1 and n2(sample size) are large and the population
   proportions are not too close to 0 and 1, normal distribution
   theory may be applied to obtain confidence interval.
  A 100(1-α)% confidence interval for p1-p2 is given by


                                    p1 (1 − p1)
                                    ˆ       ˆ         p 2 (1 − p 2 )
                                                      ˆ        ˆ
       p 2 − p 2 ± z (1 − α / 2 )
       ˆ     ˆ                                    +
                                        n1                 n2
Example:
  Borst et al. (A-16) investigated the relation of ego
  development, age, gender and diagnosis to suicidality among
  adolescent psychiatric inpatients. There sample consisted of 96
  boys and 123 girls between the age of 12 and 16 selected from
  admissions to a child and adolescent unit of a private
  psychiatric hospital. Suicide attempts were reported by 18 of
  the boys and 60 of the girls. Let us assume the girls and the
  boys behave like a simple random sample from a population
  of similar girls and that the boys likewise may be considered a
  SRS from a population of similar boys. For these two
  population, we wish to construct a 99% confidence interval for
  the difference between between the proportions of suicide
  attempters.
Solution:
  The 99% confident that for the sampled populations , the
  proportion of suicide attempts among girls exceeds the
  proportion of suicide attempts among boys by somewhere
  between 0.1450, 0.4556
Example:
  Goldberg et al. (A-24) conducted a study to determine if an
  acute dose of dextroamphetamine might have positive
  effects on affect and cognition in schizophrenic patients
  maintained on a regimen of haloperidol. Among the
  variables measured was the change in patients’ tension-
  anxiety states. For n2=4 patients who responded to
  amphetamine, the standard deviation was 5.8. Let us assume
  that these patients constitute independent SRS from
  populations of similar patients. Let us also assume that
  change scores in tension-anxiety state is a normally
  distributed variable in both populations. We wish to
  construct a 95% confidence interval for the ratio of the
  variances of these two populations.
Solution: σ1   2

  0.2081< σ2 <14.0554
               2

Mais conteúdo relacionado

Mais procurados

Central limit theorem
Central limit theoremCentral limit theorem
Central limit theoremVijeesh Soman
 
Sampling and statistical inference
Sampling and statistical inferenceSampling and statistical inference
Sampling and statistical inferenceBhavik A Shah
 
Lecture 5: Interval Estimation
Lecture 5: Interval Estimation Lecture 5: Interval Estimation
Lecture 5: Interval Estimation Marina Santini
 
Inferential statistics
Inferential statisticsInferential statistics
Inferential statisticsMaria Theresa
 
Confidence Intervals: Basic concepts and overview
Confidence Intervals: Basic concepts and overviewConfidence Intervals: Basic concepts and overview
Confidence Intervals: Basic concepts and overviewRizwan S A
 
Sample size determination
Sample size determinationSample size determination
Sample size determinationGopal Kumar
 
Statistical inference: Estimation
Statistical inference: EstimationStatistical inference: Estimation
Statistical inference: EstimationParag Shah
 
6.5 central limit
6.5 central limit6.5 central limit
6.5 central limitleblance
 
Estimation and hypothesis
Estimation and hypothesisEstimation and hypothesis
Estimation and hypothesisJunaid Ijaz
 
Point and Interval Estimation
Point and Interval EstimationPoint and Interval Estimation
Point and Interval EstimationShubham Mehta
 
Normal distribution
Normal distributionNormal distribution
Normal distributionSonamWadhwa3
 
Hypothesis testing ppt final
Hypothesis testing ppt finalHypothesis testing ppt final
Hypothesis testing ppt finalpiyushdhaker
 

Mais procurados (20)

Unit 3
Unit 3Unit 3
Unit 3
 
Central limit theorem
Central limit theoremCentral limit theorem
Central limit theorem
 
Sampling Distribution
Sampling DistributionSampling Distribution
Sampling Distribution
 
Confidence Intervals
Confidence IntervalsConfidence Intervals
Confidence Intervals
 
STATISTIC ESTIMATION
STATISTIC ESTIMATIONSTATISTIC ESTIMATION
STATISTIC ESTIMATION
 
Test of significance
Test of significanceTest of significance
Test of significance
 
Hypo
HypoHypo
Hypo
 
Sampling and statistical inference
Sampling and statistical inferenceSampling and statistical inference
Sampling and statistical inference
 
Sampling Distributions
Sampling DistributionsSampling Distributions
Sampling Distributions
 
Lecture 5: Interval Estimation
Lecture 5: Interval Estimation Lecture 5: Interval Estimation
Lecture 5: Interval Estimation
 
Inferential statistics
Inferential statisticsInferential statistics
Inferential statistics
 
Confidence Intervals: Basic concepts and overview
Confidence Intervals: Basic concepts and overviewConfidence Intervals: Basic concepts and overview
Confidence Intervals: Basic concepts and overview
 
Sample size determination
Sample size determinationSample size determination
Sample size determination
 
Statistical inference: Estimation
Statistical inference: EstimationStatistical inference: Estimation
Statistical inference: Estimation
 
6.5 central limit
6.5 central limit6.5 central limit
6.5 central limit
 
Estimation and hypothesis
Estimation and hypothesisEstimation and hypothesis
Estimation and hypothesis
 
Confidence interval
Confidence intervalConfidence interval
Confidence interval
 
Point and Interval Estimation
Point and Interval EstimationPoint and Interval Estimation
Point and Interval Estimation
 
Normal distribution
Normal distributionNormal distribution
Normal distribution
 
Hypothesis testing ppt final
Hypothesis testing ppt finalHypothesis testing ppt final
Hypothesis testing ppt final
 

Semelhante a Inferential statistics-estimation

Statistik 1 7 estimasi & ci
Statistik 1 7 estimasi & ciStatistik 1 7 estimasi & ci
Statistik 1 7 estimasi & ciSelvin Hadi
 
Point Estimate, Confidence Interval, Hypotesis tests
Point Estimate, Confidence Interval, Hypotesis testsPoint Estimate, Confidence Interval, Hypotesis tests
Point Estimate, Confidence Interval, Hypotesis testsUniversity of Salerno
 
Statistics Applied to Biomedical Sciences
Statistics Applied to Biomedical SciencesStatistics Applied to Biomedical Sciences
Statistics Applied to Biomedical SciencesLuca Massarelli
 
Estimating population values ppt @ bec doms
Estimating population values ppt @ bec domsEstimating population values ppt @ bec doms
Estimating population values ppt @ bec domsBabasab Patil
 
Stat 4 the normal distribution & steps of testing hypothesis
Stat 4 the normal distribution & steps of testing hypothesisStat 4 the normal distribution & steps of testing hypothesis
Stat 4 the normal distribution & steps of testing hypothesisForensic Pathology
 
Stat 4 the normal distribution & steps of testing hypothesis
Stat 4 the normal distribution & steps of testing hypothesisStat 4 the normal distribution & steps of testing hypothesis
Stat 4 the normal distribution & steps of testing hypothesisForensic Pathology
 
Stat 4 the normal distribution & steps of testing hypothesis
Stat 4 the normal distribution & steps of testing hypothesisStat 4 the normal distribution & steps of testing hypothesis
Stat 4 the normal distribution & steps of testing hypothesisForensic Pathology
 
3. Statistical inference_anesthesia.pptx
3.  Statistical inference_anesthesia.pptx3.  Statistical inference_anesthesia.pptx
3. Statistical inference_anesthesia.pptxAbebe334138
 
Review & Hypothesis Testing
Review & Hypothesis TestingReview & Hypothesis Testing
Review & Hypothesis TestingSr Edith Bogue
 
Estimation&amp;ci (assignebt )
Estimation&amp;ci (assignebt )Estimation&amp;ci (assignebt )
Estimation&amp;ci (assignebt )Mmedsc Hahm
 
Lesson04_Static11
Lesson04_Static11Lesson04_Static11
Lesson04_Static11thangv
 
Lesson04_new
Lesson04_newLesson04_new
Lesson04_newshengvn
 
Lect w2 measures_of_location_and_spread
Lect w2 measures_of_location_and_spreadLect w2 measures_of_location_and_spread
Lect w2 measures_of_location_and_spreadRione Drevale
 
Standard deviation
Standard deviationStandard deviation
Standard deviationMai Ngoc Duc
 

Semelhante a Inferential statistics-estimation (20)

Statistik 1 7 estimasi & ci
Statistik 1 7 estimasi & ciStatistik 1 7 estimasi & ci
Statistik 1 7 estimasi & ci
 
Point Estimate, Confidence Interval, Hypotesis tests
Point Estimate, Confidence Interval, Hypotesis testsPoint Estimate, Confidence Interval, Hypotesis tests
Point Estimate, Confidence Interval, Hypotesis tests
 
Statistics Applied to Biomedical Sciences
Statistics Applied to Biomedical SciencesStatistics Applied to Biomedical Sciences
Statistics Applied to Biomedical Sciences
 
10주차
10주차10주차
10주차
 
estimation
estimationestimation
estimation
 
Estimation
EstimationEstimation
Estimation
 
Estimating a Population Proportion
Estimating a Population Proportion  Estimating a Population Proportion
Estimating a Population Proportion
 
Estimating population values ppt @ bec doms
Estimating population values ppt @ bec domsEstimating population values ppt @ bec doms
Estimating population values ppt @ bec doms
 
Talk 3
Talk 3Talk 3
Talk 3
 
Stat 4 the normal distribution & steps of testing hypothesis
Stat 4 the normal distribution & steps of testing hypothesisStat 4 the normal distribution & steps of testing hypothesis
Stat 4 the normal distribution & steps of testing hypothesis
 
Stat 4 the normal distribution & steps of testing hypothesis
Stat 4 the normal distribution & steps of testing hypothesisStat 4 the normal distribution & steps of testing hypothesis
Stat 4 the normal distribution & steps of testing hypothesis
 
Stat 4 the normal distribution & steps of testing hypothesis
Stat 4 the normal distribution & steps of testing hypothesisStat 4 the normal distribution & steps of testing hypothesis
Stat 4 the normal distribution & steps of testing hypothesis
 
3. Statistical inference_anesthesia.pptx
3.  Statistical inference_anesthesia.pptx3.  Statistical inference_anesthesia.pptx
3. Statistical inference_anesthesia.pptx
 
Review & Hypothesis Testing
Review & Hypothesis TestingReview & Hypothesis Testing
Review & Hypothesis Testing
 
Estimation&amp;ci (assignebt )
Estimation&amp;ci (assignebt )Estimation&amp;ci (assignebt )
Estimation&amp;ci (assignebt )
 
Lesson04_Static11
Lesson04_Static11Lesson04_Static11
Lesson04_Static11
 
Lesson04_new
Lesson04_newLesson04_new
Lesson04_new
 
Estimating a Population Mean
Estimating a Population Mean  Estimating a Population Mean
Estimating a Population Mean
 
Lect w2 measures_of_location_and_spread
Lect w2 measures_of_location_and_spreadLect w2 measures_of_location_and_spread
Lect w2 measures_of_location_and_spread
 
Standard deviation
Standard deviationStandard deviation
Standard deviation
 

Mais de Southern Range, Berhampur, Odisha

Understanding Stress, Stressors, Signs and Symptoms of Stress
Understanding Stress, Stressors, Signs and Symptoms of StressUnderstanding Stress, Stressors, Signs and Symptoms of Stress
Understanding Stress, Stressors, Signs and Symptoms of StressSouthern Range, Berhampur, Odisha
 

Mais de Southern Range, Berhampur, Odisha (20)

Growth and Instability in Food Grain production in Oisha
Growth and Instability in Food Grain production  in OishaGrowth and Instability in Food Grain production  in Oisha
Growth and Instability in Food Grain production in Oisha
 
Growth and Instability of OIlseeds Production in Odisha
Growth and Instability of OIlseeds Production in OdishaGrowth and Instability of OIlseeds Production in Odisha
Growth and Instability of OIlseeds Production in Odisha
 
Statistical thinking and development planning
Statistical thinking and development planningStatistical thinking and development planning
Statistical thinking and development planning
 
CAN PLEASURE BE A JOURNEY RATHER THAN A STOP?
CAN PLEASURE BE A JOURNEY RATHER THAN A STOP?CAN PLEASURE BE A JOURNEY RATHER THAN A STOP?
CAN PLEASURE BE A JOURNEY RATHER THAN A STOP?
 
Monitoring and Evaluation
Monitoring and EvaluationMonitoring and Evaluation
Monitoring and Evaluation
 
Anger Management
Anger ManagementAnger Management
Anger Management
 
A Simple Promise for Happiness
A Simple Promise for HappinessA Simple Promise for Happiness
A Simple Promise for Happiness
 
Measures of mortality
Measures of mortalityMeasures of mortality
Measures of mortality
 
Understanding data through presentation_contd
Understanding data through presentation_contdUnderstanding data through presentation_contd
Understanding data through presentation_contd
 
Nonparametric and Distribution- Free Statistics _contd
Nonparametric and Distribution- Free Statistics _contdNonparametric and Distribution- Free Statistics _contd
Nonparametric and Distribution- Free Statistics _contd
 
Nonparametric and Distribution- Free Statistics
Nonparametric and Distribution- Free Statistics Nonparametric and Distribution- Free Statistics
Nonparametric and Distribution- Free Statistics
 
Simple and Powerful promise for Peace and Happiness
Simple and Powerful promise for Peace and HappinessSimple and Powerful promise for Peace and Happiness
Simple and Powerful promise for Peace and Happiness
 
Understanding Stress, Stressors, Signs and Symptoms of Stress
Understanding Stress, Stressors, Signs and Symptoms of StressUnderstanding Stress, Stressors, Signs and Symptoms of Stress
Understanding Stress, Stressors, Signs and Symptoms of Stress
 
Chi sqr
Chi sqrChi sqr
Chi sqr
 
Simple linear regressionn and Correlation
Simple linear regressionn and CorrelationSimple linear regressionn and Correlation
Simple linear regressionn and Correlation
 
Hypothesis Testing
Hypothesis TestingHypothesis Testing
Hypothesis Testing
 
Sampling methods in medical research
Sampling methods in medical researchSampling methods in medical research
Sampling methods in medical research
 
Probability concept and Probability distribution_Contd
Probability concept and Probability distribution_ContdProbability concept and Probability distribution_Contd
Probability concept and Probability distribution_Contd
 
Probability concept and Probability distribution
Probability concept and Probability distributionProbability concept and Probability distribution
Probability concept and Probability distribution
 
Measures of dispersion
Measures  of  dispersionMeasures  of  dispersion
Measures of dispersion
 

Último

CARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxCARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxGaneshChakor2
 
Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactdawncurless
 
Grant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingGrant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingTechSoup
 
Sanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfSanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfsanyamsingh5019
 
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991RKavithamani
 
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdfBASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdfSoniaTolstoy
 
Separation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and ActinidesSeparation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and ActinidesFatimaKhan178732
 
Z Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphZ Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphThiyagu K
 
A Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy ReformA Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy ReformChameera Dedduwage
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxheathfieldcps1
 
Beyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactBeyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactPECB
 
Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104misteraugie
 
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdfssuser54595a
 
Hybridoma Technology ( Production , Purification , and Application )
Hybridoma Technology  ( Production , Purification , and Application  ) Hybridoma Technology  ( Production , Purification , and Application  )
Hybridoma Technology ( Production , Purification , and Application ) Sakshi Ghasle
 
Introduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher EducationIntroduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher Educationpboyjonauth
 
Interactive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationInteractive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationnomboosow
 
Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)eniolaolutunde
 

Último (20)

CARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxCARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptx
 
Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impact
 
Grant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingGrant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy Consulting
 
Sanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfSanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdf
 
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
 
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdfBASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
 
Separation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and ActinidesSeparation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and Actinides
 
Z Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphZ Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot Graph
 
Staff of Color (SOC) Retention Efforts DDSD
Staff of Color (SOC) Retention Efforts DDSDStaff of Color (SOC) Retention Efforts DDSD
Staff of Color (SOC) Retention Efforts DDSD
 
A Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy ReformA Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy Reform
 
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdfTataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptx
 
Beyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactBeyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global Impact
 
Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104
 
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
 
Hybridoma Technology ( Production , Purification , and Application )
Hybridoma Technology  ( Production , Purification , and Application  ) Hybridoma Technology  ( Production , Purification , and Application  )
Hybridoma Technology ( Production , Purification , and Application )
 
Introduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher EducationIntroduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher Education
 
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptxINDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
 
Interactive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationInteractive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communication
 
Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)
 

Inferential statistics-estimation

  • 1. Lecture Series on Biostatistics Inferential Statistics- Estimation By Dr. Bijaya Bhusan Nanda, M. Sc (Gold Medalist) Ph. D. (Stat.) Topper Orissa Statistics & Economics Services, 1988 bijayabnanda@yahoo.com
  • 2. CONTENTS  Introduction  Confidential interval for a population mean  The t distribution  Confidence interval for the difference between two population mean  Confidence interval for a population proportion  Confidence interval for the difference between two population proportion
  • 3. Introduction Statistical Inference  It is the procedure by which we reach a conclusion about a population on the basis of information contained in the sample drawn from that population.  Two broad areas of statistical inference  Estimation  Hypothesis testing.  Estimation- It is the process of calculating some statistics based upon the sample data drawn from a certain population. The statistics is used as an approximation to the population parameter.
  • 4. Types of Estimate For each of parameters, we can have  Point Estimate  Interval estimate  A point estimate is a single numerical value used to estimate the corresponding population parameter. An interval estimate consists of two numerical values defining a range of values that, with a specified degree of confidence, includes the parameter being estimated.
  • 5. Choosing an Appropriate Estimation  A single computed value is an estimate.  An estimator usually presented as a formula. For example x= ∑ xi n is the estimator of population mean  The single numerical value that results from evaluating this formula is called an estimate of the parameter µ.  E(T) is obtained by taking the average value of T computed from all possible sample i.e. E(T)= µT
  • 6. Criteria of a good estimator: There can be more than one estimator for the parameter. For example population mean can be estimated by the sample median or sample mean. But a good estimator should fulfill certain criteria. One such criterion of a good estimator is unbiasedness. An estimator T of the parameter θ is said to be an unbiased estimator of θ if E(T)= θ  Sample mean is an unbiased estimator of population mean.  Sample proportion is an unbiased estimate of population proportion.  The difference between two sample means is an unbiased estimate of difference between the population mean. The difference between two sample proportions is an unbiased estimate of difference between the population proportion.
  • 7. Sampled Population and Target Population  Sampled population is the population from which one actually draws a sample.  The target population is the population about which one wishes to make inference.  Statistical inference procedure allows one to make inference about sampled population (provided proper sampling methods have been employed)  Only when sampled population and the target population are the same, it is possible to reach statistical inference about the target population. Random and Non random Samples  The strict validity of the statistical procedures discussed depends on the assumptions of random sample.
  • 8.  In real world applications it is impossible or impractical to use truly random samples.  If the researchers had to depend on randomly selected materials, very little research of this type would be conducted.  Therefore, non statistical considerations must play a past in the generalizations process.  Researchers may contend that samples actually used are equivalent to simple random samples, since there is no reasons to believe that material actually used is not representative of the population about which inferences are desired.  In many health research projects, samples of convenience, rather than random samples are employed.
  • 9.  Generalization must be made on non statistical consideration.  Consequences of such generalization, however may range from misleading to disastrous.  In some situation it is possible to introduce randomization into experiment even though available subjects are not randomly selected from well defined population.  Example: Allocating subjects to treatments in a random manner.
  • 10. Confidential interval for a population mean  Say x the sample mean from a random sample of size n is drawn from a normally distributed population.  x is an unbiased estimator of the population mean  () E x =µ  But because of sampling fluctuation x can’t be expected to be equal to µ.  Therefore, we should have an interval estimate µ.  For the interval estimate we must have knowledge of sampling distribution of x .  Sampling distribution of x for a normal population is as follows x ≈ N ( µ , σ )i.e.N ( µ ,σ ) n x x
  • 11.  Following the normal probability distribution the interval estimate for µ is as follows 95% confidence interval for µ = µ ± 2σ x since we don’t know µ with the help of sample mean x we have the 95% confidence interval for µ as x ± 2σx = x ± 2 σ n If we don’t know σ the 95% confidence interval for µ = x ± 2 s / n where s= the standard deviation based on the sample.
  • 12. Interval estimate Component  In general an interval estimate may be expressed as follows estimator ± (reliability coefficient)× (standard error)  In particular when sampling is from a normal distribution with known variance, an interval estimator for µ may be expressed as x±z σ (1 − α / 2 ) x where z(1-α /2) is the value of z to the left of which lies 1- α /2 and to the right of which lies α /2 of the area under its curve.
  • 13. Interpreting Confidence Interval:  Probabilistic Interpretation: In repeated sampling, from a normally distributed population with a known standard deviation, 100(1- α ) percent of all intervals of the form x±z σ (1 − α / 2 ) x will in the long run include the population mean µ .  Practical Interpretation: When sampling is from a normally distributed population with known standard deviation, we are 100(1- α ) % confident that the single computed interval, population mean µ . x±z σ (1 − α / 2 ) x , contains the  Precision: The quantity obtained by multiplying the reliability factor by the standard error of the mean is called the precision of the estimate. This quantity is also called the margin of error.
  • 14. Example A physical therapist wished to estimate, with 99 percent confidence , the mean maximum strength of a particular muscle in a certain group of individuals. He is willing to assume that strength scores are approximately normally distributed with a variance of 144. A sample of 15 subjects who participated in the experiment yielded a mean of 84.3. Solution: The z value corresponding to a confidence coefficient of 0.99 is found in Table D to be 2.58. This is our reliability coefficient. The standard error is σx = 12 = 3.0984 .Our 99% 15 84.3 ± 2.58(3.0984) Confidence interval for µ, then 84.3 ± 8.0 76.3,92.3
  • 15. We say we are 99% confidence that the population mean is between 76.3 and 92.3 since, in repeated sampling, 99% of all intervals that could be constructed in the manner just described would include the population mean.
  • 16. The t distribution  Estimation of confidence interval using standard normal z distribution applies when population variance is known. Usually, the knowledge of population variance is not known.  This condition presents a problem to construct confidence intervals. x −µ Although the statistic z = σ is normally distributed n or approximately normally distributed when n is large, regardless of the functional form of the population, we can’t make use of this fact because σ is not known.
  • 17.  In this case we may use of s= ∑ ( xi − x ) 2 ( n − 1) as an estimator of σ. When sample size is large, say, greater than 30, s is a good approximate of σ is substantial and we may use normal distribution theory to construct confidence interval.  When the sample size is small, ≤30, we make use of Student’s t distribution. x−µ t= The quantity s follows this distribution. n
  • 18. Properties of the t distribution:  It has a mean of 0.  Symmetrical about the mean.  In general, it has a variance greater than 1, but the variance approaches 1 as the sample size becomes large. For df >2, the variance for t distribution is df /(df-2).  The variable t ranges from -∞ to + ∞ .  The t distribution is really a family of distributions, since there is a different distribution for each sample value of n-1, the divisor used in computing s2. We recall that n-1 is referred to as degrees of freedom.  Compared to normal distribution is less peaked at center and has higher tails t distribution approaches the normal distribution an n-1 approaches infinity.  The t distribution approaches the normal distribution as n-1 approaches infinity.
  • 19. The t distribution for different degrees of freedom Degrees of freedom=30 Degrees of freedom=5 Degrees of freedom=2 Comparison of normal distribution and t distribution Normal Distribution t distribution
  • 20. Confidence Intervals Using t: estimator ± (reliability coefficient) (standard error) error The source of reliability co-efficient is ‘t’ distribution rather than standard normal z distribution.  When sampling is from a normal distribution whose standard deviation , σ , is unknown the 100(1-α )% confidence interval for the population mean µ , is given by s x ± t (1 − α / 2 ) n  A moderate departures of the sampled population from the normally can be tolerated in using the ‘t’ distribution. An assumption of mound shaped population distribution be tenable.
  • 21. Example: Maureen McCauley conducted a study to evaluate the effect of on the job body mechanics instruction on the work performance of newly employed young workers (A-1). She used two randomly selected groups of subjects, an experimental group and a control group. The experimental group received one hour of back school training provided on occupational therapist. The control group did not receive this training. A criterion referenced Body Mechanics Evaluation Checklist was used to evaluate each workers lifting, lowering, pulling, and transferring of objects in the work environment. A correctly performed task received a score of 1. The 15 control subjects made a mean score of 11.53 on the evaluation with a standard deviation of 3.681. We assume that these 15 controls behave as a random sample from a population of similar subjects. We wish to use these sample data to estimate the mean score fro the population.
  • 22. Solution: We may assume sample mean, 11.53, as a point estimate of population mean but since the population standard deviation is unknown, we must assume the population of values to be at least approximately normally distributed before constructing a confidence interval for µ . Let us assume an assumption is reasonable and that a 95% confidence interval is desired. Our estimator x is and our standard error is s n = 3.681 = 0.9564 15 We need to find reliability coefficient, the value of t associated with a confidence coefficient of 0.95 and n- 1=14 degrees of freedom.
  • 23. Confidence interval is equally divided in to two tails i.e.0.025 .In Table E the tabulated value of t is 2.1448, which is our reliability coefficient. Then our 95% confidence interval is as follows 11.53 ± 2.1448(0.9504) 11.53 ± 2.04 9.49,13.57 Deciding Between z and t: To make an appropriate choice between z and we must consider whether the sampled population is normally distributed, and whether the population variance is known.
  • 24. Flowchart for use in deciding between z and t when making inferences about population means Normal Y Population N Large Large Y Sample N Y Sample N Populn Y Populn N Y Populn N Y Populn N Y N var var var var known? known? known? known? z t z t z z * * z Central limit theorem applies * = use nonparametric Procedure
  • 25. Confidence interval for the difference between two population means Normal Population  Say µ -µ = the difference between two normal population 1 2 means.  x −x 1 2 = the difference between two independent sample means drawn from the two populations respectively. E ( x 1 − x 2) = µ1 − µ 2 σ1 σ 2 2 V ( x 1 − x 2) = + 2 n1 n 2 Where σ12 & σ22 are the variance of the population 1&2 and n1, n2 are the size of the SRS drawn from population 1&2 respectively.
  • 26.  When the population variances are known, the 100(1- α)% confidence interval for µ1& µ2 is given by σ σ 2 2 (x − x ) ± z + 1 2 1 2 (1 − α / 2 ) n 1 n 2  When the confidence interval includes zero, we say that the population means may be equal when it doesn’t includes zero we say that the interval provides evidence that the two population means are not equal.
  • 27. Example: A research team is interested in the difference between serum uric acid levels in patients with and without Down’s syndrome. In a large hospital for the treatment of the mentally retarded, a sample of 12 individuals with Down’s syndrome yielded a mean of 4.5mg/100ml. In a general hospital a sample of 15 normal individuals of the same age and sex were found to have a mean value of 3.4. If it is reasonable to assume that the two populations of values are normally distributed with variances equal to 1 and 1.5, find the 95% confidence interval for µ1- µ2. Solution: For a point estimate of µ1- µ2, we use x 1 − x 2 = 4.5 − 3.4 = 1.1
  • 28. The reliability coefficient corresponding to 0.95 is found in Table D to be 1.96. The standard error σ 2 σ 1 1 .5 2 σ ( x 1 − x 2) = +1 = +2 = 0.4282 n1 n2 12 15 The 95 percent confidence interval, then, is 1.1 ±1.96(0.4282) 1.1 ± 0.84 0.26,1.94 The difference µ1- µ2, is some where between 0.26 and 1.94, because, in repeated sampling, 95% of the intervals constructed in this manner would include the difference between the true means. Since the interval does not include zero, we conclude that the two population means are not equal.
  • 29. Sampling from Nonnormal Population: If sample sizes n1 and n2 are large, we may construct the confidence interval the same way as we do incase of normal population in accordance with the Central Limit Theorem. Example: Motivated by an awareness of the existence of a body of controversial literature suggesting that stress, anxiety, and depression are harmful to the immune system, Gorman et al. (A-5) conducted a study in which the subjects were homosexual men and some of whom were HIV positive and some of whom were HIV negative. Data were collected on a wide variety of medical, immunological, psychiatric, and neurological measures, one of which was the number of CD4+ cells in the blood. The mean number of CD4+ cells for the 112 men with HIV infection was 401.8 with a standard deviation of 226.4. For the 75 men without HIV infection the mean and standard deviation were 828.2 and 274.9, respectively. We wish to construct a 99% confidence interval for the difference between population means.
  • 30. Solution: Here we are using z statistic as the reliability factor in the construction of our confidence interval. Since the population standard deviation are not given, we will use the sample standard deviations to estimate them. The point estimate for the difference between population means is the difference between sample means, 882.2-401.8=426.4. In Table D, we find the reliability factor to be 2.58. The estimated standard error is 2 2 274.9 226.4 sx − x = + = 38.279 75 112 1 2
  • 31. Our 99% confidence interval for the difference between population means is 426.4 ± 2.58(38.279) 327.6,525.2 We are 99% confident that the mean number of CD4+ cells in HIV +ve males differs from the mean for HIV –ve males by some where between 327.6 and 525.2.
  • 32. Confidence interval using ‘t’ distribution:  Say the variances of the two sampled population are not known.  The two sampled population are normally distributed.  The difference between the two population means can be estimated through confidence interval using ‘t’ distribution as the source of reliability factor. Situation-I-Equal population variances: The 100(1-α)% confidence interval for µ1- µ2 is given by 2 2 s s ( x 1 − x 2) ± t (1 − α / 2 ) p + p n1 n2 If the two sample sizes are unequal , the weighted average takes advantages of the additional information provided by the larger sample. The pooled estimate is given by
  • 33. ( n1 −1) s + ( n 2 −1) s 2 2 s 2 p = 1 2 n1 + n 2 − 2 The standard error of the estimate , then, is given by 2 2 s s Sx 1 − x2 = p + p n1 n2 The ‘t’ follows a Student’s t distribution with n1+n2-2 d.f. Example: The purpose of a study by Stone et al.(A-6) was to determine the effects of long-term exercise intervention on corporate executives enrolled in a supervised fitness program.
  • 34. Data were collected on 13 subjects (the exercise group) who voluntarily entered a supervised exercise program and remained active for an average of 13 years and 17 subjects (the secondary group) who elected not to join the fitness program. Among the data collected on the subjects was maximum number of sit-ups completed in 30 seconds. The exercise group had a mean and standard deviation for this variable of 21.0 and 4.9 respectively. The mean and standard deviation for the secondary group were 12.1 and 5.6 respectively. We assume that the two population of overall muscle condition measures are approximately normally distributed and that the two population variances are equal. We wish to construct a 95% confidence interval for the difference between the means of the populations represented by these two samples. Solution: The 95% confident that the difference between population means is somewhere between 4.9, 12.9 .
  • 35. Situation-II-Unequal population variances: An approximate 100(1-α)% confidence interval for µ1- µ2 is given by 2 s1 s 2 2 x 1 − x 2 ± t ′(1 − α / 2 ) + n1 n2 The solution proposed by Cochran consists of computing the reliability factor t ′ = w1t1 + w2 t 2 (1−α 2) w1 + w2 Where w1=s12/n1, w2=s22/n2, t1=t1-α/2 for n1-1 degrees of freedom, and t2=t1- α/2 for n2-1 degrees of freedom. Example: in the study by Stone et al.(A-6)described in previous example , the investigator also reported the following information on a measure of overall muscle condition scores made by the subjects:
  • 36. Sample n Mean S.D Exercise 13 4.5 0.3 group Sedentary 17 3.7 1.0 group We assume that the two populations of overall muscle condition scores are approximately normally distributed. We are unwilling to assume, however, that the two population variances are equal. We wish to construct a 95% confidence interval for the difference between the mean overall muscle condition scores of the two populations represented by the samples. Solution: The 95% confidence interval for the difference between the two population means when they are not equal is 0.25, 1.34
  • 37. Flowchart in deciding whether the reliability factor should be z, t, or t’ when making inferences about the difference between two population means Normal Y Population N Y Large N Large Sample Y N Sample Populn Populn Populn Populn Y N Y N Y N Y N var var var var known? known? known? known? =? =? =? =? =? =? =? =? Y N Y N Y N Y N Y N Y N Y N Y N z z t t’ z z t t’ z z z t’ * * * * or or * = Nonparametric test to be applied z z
  • 38. Confidence interval for a population proportion  Population proportion in many situations is a matter of great interest to the health professional.  What proportion of people suffers from different diseases, disability etc.? what proportion of patience response to a particular treatment? These are certain important questions for the health purpose.  We estimate the population proportion as in the case of population mean.  A random sample is drawn from the population of interest and the sample proportion p provides an unbiased estimate ˆ of the population proportion P.  A confidence interval is obtained by the general formula estimator ± (reliability coefficient) × (standard error)  when both np and n(1-p) are greater than 5, the sampling distribution of p is quite close to the normal distribution. ˆ
  • 39. When above condition is mate100(1-α) % confidence interval for p is given by p ± z (1 − α / 2 ) ˆ p(1 − p ) ˆ ˆ Example: n Mathers et al.(A-12) found that in a sample of 591 patients admitted to a psychiatric hospital, 204 admitted to using cannabis at least once in their lifetime. We wish to construct a 95% confidence interval for the proportion of lifetime cannabis users in the sampled population of psychiatric hospital admission. Solution: The 95% confident that the population proportion p is between 0.3069, 0.3835
  • 40. Confidence interval for the difference between two population proportion  An unbiased point estimator of the difference between two population proportions (p1-p2) is provided by the difference ˆ ˆ between sample proportion p1 − p 2 .  When n1 and n2(sample size) are large and the population proportions are not too close to 0 and 1, normal distribution theory may be applied to obtain confidence interval.  A 100(1-α)% confidence interval for p1-p2 is given by p1 (1 − p1) ˆ ˆ p 2 (1 − p 2 ) ˆ ˆ p 2 − p 2 ± z (1 − α / 2 ) ˆ ˆ + n1 n2
  • 41. Example: Borst et al. (A-16) investigated the relation of ego development, age, gender and diagnosis to suicidality among adolescent psychiatric inpatients. There sample consisted of 96 boys and 123 girls between the age of 12 and 16 selected from admissions to a child and adolescent unit of a private psychiatric hospital. Suicide attempts were reported by 18 of the boys and 60 of the girls. Let us assume the girls and the boys behave like a simple random sample from a population of similar girls and that the boys likewise may be considered a SRS from a population of similar boys. For these two population, we wish to construct a 99% confidence interval for the difference between between the proportions of suicide attempters. Solution: The 99% confident that for the sampled populations , the proportion of suicide attempts among girls exceeds the proportion of suicide attempts among boys by somewhere between 0.1450, 0.4556
  • 42. Example: Goldberg et al. (A-24) conducted a study to determine if an acute dose of dextroamphetamine might have positive effects on affect and cognition in schizophrenic patients maintained on a regimen of haloperidol. Among the variables measured was the change in patients’ tension- anxiety states. For n2=4 patients who responded to amphetamine, the standard deviation was 5.8. Let us assume that these patients constitute independent SRS from populations of similar patients. Let us also assume that change scores in tension-anxiety state is a normally distributed variable in both populations. We wish to construct a 95% confidence interval for the ratio of the variances of these two populations. Solution: σ1 2 0.2081< σ2 <14.0554 2