SlideShare uma empresa Scribd logo
1 de 16
Baixar para ler offline
Statistical Techniques




                                      Inferential Statistics




Term paper - Inferential Statistics                               1
Statistical Techniques

Table of Contents
1.0 WHAT IS INFERENTIAL STATISTICS?.................................................................................................3
2.0 BRIEF TIMELINE OF INFERENTIAL STATISTICS .............................................................................4
3.0 TEST OF WHAT? ..........................................................................................................................................6
       3.1 HYPOTHESIS – INTRODUCTION ...............................................................................................6
             3.1.2 ERRORS IN SAMPLING.................................................................................................7
             3.1.3 STUDENT’s T-TEST ........................................................................................................9
             3.1.4 CHI-SQUARE TEST.......................................................................................................10
       3.2 REGRESSION? ...............................................................................................................................12
             3.2.1 REGRESSION MODELS...............................................................................................12
             3.2.2 SCATTER-PLOTS ..........................................................................................................12
             3.2.3 REGRESSION EQUATION ..........................................................................................12
             3.2.4 REGRESSION INTERPRETATION............................................................................15
             3.2.5 R SQUARRED .................................................................................................................15
4.0 BIBLIOGRAPHY.........................................................................................................................................16




Term paper - Inferential Statistics                                                                                                                        2
Statistical Techniques

1.0 WHAT IS INFERENTIAL STATISTICS?

The vital key to the difference between descriptive and inferential statistics are the capitalized words
in the description: CAN DESCRIBE, COULD NOT CONCLUDE, AND REPRESENTATIVE OF.
Descriptive statistics can only describe the actual sample you study. But to extend your conclusions to
a broader population, like all such classes, all workers, all women, you must use inferential statistics,
which means you have to be sure the sample you study is representative of the group you want to
generalize to.
Allow me to exemplify:

          i.     The study at the local mall and cannot be used to claim that what you find is valid for
                 all shoppers and all malls.
         ii.     Another example would be a study conducted on an intermediate college can’t claim
                 that what you find is valid for the colleges of all levels (i.e. General Population).
        iii.     Also visualize a survey conducted at a women's club that includes a majority of a
                 particular single ethnic group cannot claim that what you find is valid for women for all
                 ethnic groups.

       As you can see, descriptive statistics are useful and serviceable if you don't need to extend your
       results to whole segments of the population. But the social sciences tend to esteem studies that
       give us more or less "universal" truths, or at least truths that apply to large segments of the
       population, like all teenagers, all parents, all women, all perpetrators, all victims, or a fairly
       large segment of such groups.

       Leaving aside the theoretical and mechanical soundness of such an investigation for some kind
       of broad conclusion, various statistical approaches are to be utilized if one aspires to
       generalize. And the primary distinction is that of SAMPLING. One must choose a sample that
       is REPRESENTATIVE OF THE GROUP TO WHICH YOU PLAN TO GENERALIZE.

       To round up, Descriptive statistics are for describing data on the group you study, While
       Inferential statistics are for generalizing your findings to a broader population group.




Term paper - Inferential Statistics                                                                         3
Statistical Techniques

2.0 BRIEF TIMELINE OF INFERENTIAL STATISTICS




1733       1733 - In the 1700s, it was Thomas Bayer who gave birth to the concept of inferential
           statistics. The normal distribution was discovered in 1733 by a Huguenot refugee de Moivre
           as an approximation to the binomial distribution when the number of trials is too large.
           Today, not only do scientists but also many professions rely on statistics to understand
           behaviour and ideally make predictions about what circumstances relate to or cause these
           behaviours.
1796       Historical Note: In 1796, Adophe Quetelet investigated the characteristics of French
           conscripts to determine the "average man." Florence Nightingale was so influenced by
           Quetelet's work that she began collecting and analyzing medical records in the military
           hospitals during the Crimean War. Based on her work hospitals began keeping accurate
           records on their patients, to provide better follow-up care.

1894       1894 - At the inception of the social survey, research results were confronted with the
           developments in inferential statistics. In 1894, Booth wrote The Aged Poor in England and
           Wales: Conditions? In this volume Booth claimed that there was no relationship between the
           ratio of welfare (out-of-doors relief) and workhouse relief (in-relief and the incidence of
           poverty by parish (or poor law union).

1896       Dec 1896 - Walker died rather suddenly at the age of 56, just days after giving the address
           opening the first meeting of ASA outside Boston—in Washington, DC in December 1896.
           That meeting led to the founding of the Washington Statistical Society. His achievements in
           developing major federal data systems, in promoting the organizational development of
           statistics, and of bringing statistical ideas to a wide audience, left the field much richer than
           he found it.

1899       1899 - Since inferential social statistics are primarily concerned with correlation and
           regression. To prove this Yule published his paper on poverty in London in 1899, this
           concern has occurred in a context of establishing causality. Often investigators seem to view
           statistical modeling as being equivalent to a regression model. The reader is cautioned that
           my critique of regression analysis is not necessarily equivalent to denying the value of
           empirical research.


Term paper - Inferential Statistics                                                                            4
Statistical Techniques

1925       1925 – First, Sigmund Freud had developed a theory that self-explained the reasons for
           aggression and juvenile criminal behaviours in terms of childhood experiences. Second, in
           1925, RA Fisher published Statistical Methods for Research Workers in which he identified
           an effective experimental paradigm that included control groups and inferential statistics.
           Freud's theory and Fisher's paradigm provided a basis so that mental health professionals
           could initiate studies to identify many mental behaviours

1930       IN 1930, THE YEAR the CH Stoelting Co. of Chicago published what was to be the largest-
           ever catalog of psychological apparatus, there was virtually no use of inferential statistics in
           psychology, in spite of the fact that William Sealey Cosset had long since presented the
           T-test and Sir Ronald Fisher had presented the general logic of null hypothesis testing. Only
           after Fisher's epochal introduction to analysis of variance procedures did psychologists even
           notice the procedure.
1930       1930 - The fiducial argument, which Fisher produced in 1930, generated much controversy
           and did not survive the death of its creator. Fisher created many terms in everyday use, eg
           statistic and sampling distribution and so there are many references to his work on the
           Words pages. Symbols in Statistics are his contributions to notation.
1935       1935 - In the two decades following the publication of Ronald Aylmer Fisher's Design of
           Experiments in 1935, Fisher's link between experimental design and inferential statistics
           became institutionalized in American experimental psychology.
1936       Apr 27, 1936 - . Pearson founded the journal Biometrics and was the editor of Annals of
           Eugenics. Because of his fundamental work in the development of modern statistics, many
           scholars today regard Pearson as the founder of 20th-century statistics. He died in
           Coldharbour, England, on April 27, 1936.
1977       1977 - The youth violence prevention landscape has changed drastically in the last quarter
           century. In 1977, Wright and Dixon published a review of “Juveniles delinquency
           prevention program” reports. The results were disappointing. From approximately 6600
           program abstracts, empirical data were available from only 96 . Of the 96 empirical reports,
           only 9 used random assignment of subjects, inferential statistics, outcomes measure of
           delinquency, and a follow-up period of at least six months. Of those 9, only 3 reported
           positive outcomes, and these three were based on the three smallest sample sizes among the
           9 reports. The authors concluded that the literature was low in both scientific and policy
           utility. By contrast today dozens of summaries of research on prevention practices are
           available.




Term paper - Inferential Statistics                                                                           5
Statistical Techniques

   1981       Jun 3, 1981 - Education Practical Statistics for Educators An introduction to the basic ideas
              of descriptive and inferential statistics as applied to the work of the classroom teachers
              counselors and administrators In the public schools Emphasis Is upon practical applications
              of statistics to problems.

   1986       Dec 17, 1986 - Koop acknowleged that the proof of these smoker’s deaths was "inferential,
              of course," based on analyses of statistics gathered in past studies, including several in
              Japan, Hong Kong, Taiwan, Europe and the United States.

   1995       Jan 1995 - A jury trial on compensatory damages was held in January 1995. Dannemiller
              testified that the selection of the random sample met the standards of inferential statistics,
              that the successful efforts to locate and obtain testimony from the claimants in the random
              sample "were of the highest standards " in his profession, that the procedures followed
              conformed to the standards of inferential statistics.



3.0 TEST OF WHAT?

  Tests of significance are helpful in problems of generalization. A Chi-Square or a T-Test tells you the
  probability that the results you found in the group under study represent the population of the chosen
  group. It can be frequently observed, Chi-Square or a t-test gives you the probability that the results
  found could have occurred by chance when there is really no relationship at all between the variables
  you studied in the population.

  A known method used in inferential statistics is estimation. In estimation, the sample is used to
  estimate a parameter, and a confidence interval about the estimate is constructed. Other examples of
  inferential statistics methods include

             i.    Hypothesis testing
            ii.    Linear regression

  3.1 HYPOTHESIS – INTRODUCTION
  Hyptothesis is a statement about the population parameter or about a population distribution. The testing of
  hypothesisis conducted in two phases. In the first phase, a test is designed where we decide as to when can the
  null hypothesis be rejected. In the second phase, the designed test is used to draw the conclusion. Hypothesis
  testing is to test some hypothesis about parent population from which the sample is drawn.
  DEFINITIONS
  PARAMETER - The statistical constants of the population namely mean (µ) , variance are usually referred to
  as parameters.
  STATISTIC - Statistical measures computed from the sample observations alone namely mean X Variance S2
  have been termed as Statistic.


  Term paper - Inferential Statistics                                                                               6
Statistical Techniques

UNBIASED ESTMATE - A statistic t = t (X1, X2, …..Xn), a function of the sample values X1, X2,…….Xn is
an unbiased estimate of the population parameter 0, if E(t) = 0. In other words, if
E(Statistic) = Paramater, then statistic is said to be an unbiased estimate of the parameter.
SAMPLING DISTRIBUTION OF A STATISTIC - If we draw a sample of size n from a given finite population
of size N, then the total number of possible samples is               /n!(N-n)! = K
STANDARD ERROR - The standard deviation of the sampling distribution of a statistic is known as it standard
error.
NULL HYPOTHESIS - A definite statement about the population parameter which is usually a hypothesis of
no difference is called Null Hypothesis an is usually denoted by Ho
ALTERNATIVE HYPOTHESIS - Any hypothesis which is complementary to the null hypothesis is called an
alternative hypothesis usually denoted by H1. For example, if we want to test the null hypothesis that the
population has a specified mean Mo (say) is Ho : µ - µo then the alternative hypothesis could be
    a) H1 : µ ≠ µ o
    b) H1 : µ > µ o
    c) H1 : µ < µ o

3.1.1 PROCEDURE FOR TESTING OF HYPOTHESIS
Various steps in testing of a statistical hypothesis in a systematic manner :
    1. Null hypothesis : Set up the null hypothesis H0
    2. Alternative Hypothesis : Set up the alternative hypothesis H1. This will be enable us to decide whether
         we have to use a single tailed(right or left) test of two-tailed test.
    3. Level of Significance : To choose the appropriate level of significance (x)
    4. Test Statistic : To compute the test statistic : Z = t-E(t)/S1E(t) , under Ho
    5. Conclusion : We compare the computed value of Z with the significant value Z2, at the given level of
         significance, if │z │ <z2 we say it not significant, it │z│ >z is then we say that it is significant and the
         null hypothesis is rejected at level of significance.



3.1.2 ERRORS IN SAMPLING
The main objective in sampling theory is to draw valid inference about the population parameters on the basis
of the sample results. In practice, we decide to accept or reject the lot after examining a sample from it. As such
we are able to commit the following two types of errors :
TYPE 1 ERROR : Reject Ho when it is true,
TYPE II ERROR : Accept Ho when it is wrong, ie. Accept Ho when H1 is true
If we mention P(Accept Ho when it is wrong) = P(Accept Ho/H1) = β and
P(Reject Ho when it is true) = P(Reject Ho/H1) = x then 2 and β are called the sizes of type 1 error and type II
error, respectively. In practice, type I error amounts to rejecting a lot when it is good and type II error may be




Term paper - Inferential Statistics                                                                                  7
Statistical Techniques

regarded as accepting the lot when it is bad. Thus P(Reject a lot when it is good) = α and P (Accept a lot when
it is bad) = β where α and β are referred to as producer’s risk and consumer’s risk respectively.
CRITICAL REGION
A region in the sample splace S which amounts to rejection of Ho is termed as critical region of rejection.
ONE – TAILED AND TWO-TAILED TESTS
Ho: µ > µ o (Right-tailed), the critical region lies entirely in the right tail
H1: µ < µ o (left-tailed), the critical region lies entirely in the left tail.
A test of statistical hypothesis where the alternative hypothesis is two – tailed tests such as Ho:µ=µ o against the
alternative hypothesis H1:µ=µ o isknown as two tailed test and in such a case the critical region is given by the
portion of the area lying on both tailsof the probability curve of the test statistic.
CRITICAL VALUE OR SIGNIFICANT VALUES
The value of test statistic which separates the critical (or rejection) region and the acceptance region is called
the critical value or significant value. It depends on :
    1) The level of significance used, and
    2) The alternative hypothesis, whether it is two-tailed of single-tailed

The standardized variable corresponding to the statistic t namely Z =

The value of z above under the null hypothesis is known as test statistic.
The critical value of the test statistic at level of significance 2 for a two-tailed test is given by Z, where Z is
determined by the equation : P(1Z1>Z o ) = α i.e., Zα is the value so that the total area of the critical region on
both tails is 2. Since normal probability curve is a symmetrical curve.
In case of a single-tail alternative, the critical value of Zα is determined so that total area to the right of it (for
right-tailed test) is α and for left-tailed test the total area to the left of (-Zα) is α
Thus the significant or critical value of Z for a single-tailed test (left or right) at level of significance α is same
as the critical value of Z for a two-tailed test at level of significance ‘α’. Please find below the critical values of
Z at commonly used levels of significance for both two-tailed and single-tailed tests




  Critical Value Z2                                       LEVEL OF SIGNIFICANCE
                               1%                             5%                            10%
Two tailed test                │Zα│ = 2.58                    │Zα│ = 1.96                   │Zα│ = 1.645
Right tailed test              Zα = 2.33                      Zα = 1.645                    Zα = 1.28
Left tailed test               Zα = 2.33                      Zα = 1.645                    Zα = 1.28


TEST OF SIGNIFICANCE OF A SINGLE MEAN
If X1, X2, …….Xn, in a random sample of size n from a normal population with mean M and variance 2, then
the sample mean is distributed normally with mean M and variance .


Term paper - Inferential Statistics                                                                                       8
Statistical Techniques

Null Hypothesis, Ho - The sample has been drawn from a population with mean M and variance , ie there is no
significance difference between the sample mean(X) and population mean(M), the test statistic (for large


samples), is Z =

If the population Standard Deviation is unknown, then we use its estimate provided by the sample variance
given by                         (for large samples)
TEST OF SIGNIFICANCE FOR DIFFERENCE OF MEANS
The mean of random sample of size n, from a population with Mean M, and Variance and let be the mean of an
independent random sample of size n2 from another population with mean M2 and variance ? then, since
sample size are large.
TEST OF SIGNIFICANCE FOR THE DIFFERENCE OF STANDARD DEVIATION
If S1 and S2 are the standard deviation of two independent samples, then under null hypothesis, Ho :      1= 2   i.e
the sample standard deviations don’t differ significantly.
                         1)           (for large samples)

But in case of large samples, the S.E of the difference of the sample standard deviations is given

by



3.1.3 STUDENT’s T-TEST
The entire large sample theory was based on the application of “normal test”. However if the sample size n is
small, the distribution of the various statistics are far from normally and as such ‘normal test’ cannot be applied
if n is mall. In such cases exact sample tests, pioneered by W.S.Gosst(1908) who wrote under the pen name-of
student, and later on developed and extended by Prof.R.A.Fisher(1926) are used.
Applications Of T-Distribution
     The t-distribution has a wide number of applications in statistics, and some of which are
         1) To test if the sample mean(     ) differs significantly from the hypothetical value µ of the population
             mean.
         2) To test the significance of the difference between two sample means.
         3) To test the significance of an observed sample correlation and sample regression coefficient.
         4) To test the significance of observed partial correlation coefficient.

T-Test For Single Mean

All hypothesis testing is done under the assumption the null hypothesis is true


Population Standard Deviation Known
If the population standard deviation, sigma, is known, then the population mean
has a normal distribution, and you will be using the z-score formula for sample means. The test

Term paper - Inferential Statistics                                                                                    9
Statistical Techniques

statistic is the standard formula you've seen before. The critical value is obtained from the normal table, or
the bottom line from the t-table.

Population Standard Deviation Unknown

If the population standard deviation, sigma, is unknown, then the population mean has a
student's T-distribution, and you will be using the t-score formula for sample means. The
test statistic is very similar to that for the z-score, except that sigma has been replaced by s
and z has been replaced by t.

The critical value is obtained from the t-table. The degree of freedom for this test is n-1.


If you're performing a t-test where you found the statistics on the calculator (as opposed to being given them in
the problem), then use the VARS key to pull up the statistics in the calculation of the test statistic. This will
save you data entry and avoid round off errors.

General Pattern

Notice the general pattern of these test statistics is (observed - expected) / standard deviation.




3.1.4 CHI-SQUARE TEST

A chi-square test (also chi squared test or χ2 test) is any statistical hypothesis test in which the sampling
distribution of the test statistic is a chi-square distribution when the null hypothesis is true, or any in which this
is asymptotically true, meaning that the sampling distribution (if the null hypothesis is true) can be made to
approximate a chi-square distribution as closely as desired by making the sample size large enough.

Chi-Square Test In Contigency Table

CHI-SQUARE distribution is utlised to determine the critical value of the chi-square variate at various level of
significance.
Properties :

(1) The value of chi-square varies from 0 to α.            (2) When each Oi = Ei, the value of chi-square is zero.

(3) Chi-square can never be negative




Term paper - Inferential Statistics                                                                                 10
Statistical Techniques

CONTIGENCY TABLE : The test if independence of attributes when the frequencies are presented in a two way
table according to two attributes classified to various categories known as the contigency table.

Test of hypothesis in a contingency table.

A contingency is a rectangular array having rows and colums ascertaining to the categories of the attributes of
A & B. The null hypothesis : H0 : Two attributes are independent vs H1 : two attributes are dependant on each
other




.

Statistics X2 has (p-1) (q-1) d.f



Under Ho, the indepdendence of attributes, the expected frequency,

Eij = ith row total x jith column
                 N


= Ri x Cj
     n


Decision : The calculated value compared with tabulated value of X2 for (P-1) (Q-1) d.f. & prefixed level of
significance α. Calculation X2 > reject Ho, if Calculation < X2 – tab – accept Ho.

CONTIGENCY TABLE OF ORDER 2X2

DIRECT FORMULAR FOR 2x2 = n(ad-bc)2

                                    (a+b) (c+d) (a+c) (b+d)                X2 has 1 d.f.

                            B1                           B2
A1                          A                            B                         a+b
A2                          C (cell)                     D                         c+d
                            a+c                          B+d                       a+b+c+d = n


Calculation X2>X2α1, reject Ho

Calculation X2<x2α1, accept Ho


Term paper - Inferential Statistics                                                                            11
Statistical Techniques




3.2 REGRESSION?
Regression analysis is a method for determining the relationship between two variables. The regression
statistical skeleton is at the core of observed social and political science research. Regression analysis works as
a statistical substitute for controlled experiments, and can be used to make causal inferences.


3.2.1 REGRESSION MODELS
Researchers render verbal theories, hypothesis, even intuition into models. A model illustrates how and under
what circumstances two (or more) variables are linked. A regression model with a dependent variable and one
independent variable is known as a bi-variate regression model.
A regression model with a dependent variable and two or more independent variables and/or control variables is
known as a multivariate regression model.
Example: The dataset "Televisions, Doctors, and Smokers" contains, among other variables, the number of
smokers per television set and the number of smokers per physician for 50 countries.


3.2.2 SCATTER-PLOTS
The X axis normally depicts the values of the independent variable, while the Y axis represents the value of the
dependent variable.
Scatter-plots allow you to study the flow of the dots, or the relationship between the two variables
Scatter-plots allow political scientists to identify :
    •   Positive or negative relationships
    •   Monotonic or linear relationships


3.2.3 REGRESSION EQUATION
The linear equation is specified as follows:
                          Y = a + bX
        Where Y = dependent variable
                 X = independent variable
                 a = constant (value of Y when X = 0)
                 b = is the slope of the regression line
“a” can be positive or negative. Referred to “a” as the intercept, “a” is the point at which the slope line passes
through the Y axis.
“b” (the slope coefficient) can be positive or negative. A positive coefficient denotes a positive relationship and
a negative coefficient denotes a negative relationship.
The significant interpretation of the slope coefficient depends on the variables involved, how they are coded
and the dimension of the variables. Larger coefficients may indicate a solid relationship, but not necessarily.



Term paper - Inferential Statistics                                                                                  12
Statistical Techniques

The goal of regression analysis is to find an equation which “best fits” the data.


In regression, the equation is found in such a manner such that its graph
is a line that reduces the squared vertical distances amid the data points
and the lines drawn.
“d1” and “d2” illustrate the distances of observed data points from an
approximate regression line.
Regression analysis bring into play a mathematical equation that locates
the single line that reduces the squared distances from the line.
The standard regression equation is the same as the linear equation with
one exception: the error factor.
                         Y = α + βX + ε
Where Y = dependent variable
α = constant term
β = slope or regression coefficient
X = independent variable
ε = error term
This regression process is called ordinary least squares (OLS).
         α (the constant term) interpreted the same as earlier
         β (the regression coefficient) tells how much Y changes if X changes by one unit.
The regression coefficient indicates the inclination and strength of the relationship between the two quantitative
variables. The error (ε) denotes that observed data does not follow a tidy pattern that can be summarized with a
straight line.
A observation's score on Y can be split as the following two parts:
                 α + βX is due to the independent variable
                 ε is due to error
Observed value = Predicted value (α + βX) + error (ε)
The error is the difference between the predicted value of Y and the observed value of Y. This is known as the
residual.




Term paper - Inferential Statistics                                                                            13
Statistical Techniques




Example:
Lets take an example to clarify what we theoretically know:




In above data on the scatterplot:
Y (dependent variable) = telephone lines for 1,000 people
X (independent variable) = Infant mortality
We will utilize regression to look at the relationship connecting communication capacity (measured here as
telephone lines per capita) and infant mortality.
In this example, the intercept and regression coefficient are as follows:
α (or constant) = 121
Means that when X (infant deaths) is 0 deaths, there are 121 phone lines per 1,000 population.
β = -1.25
Means that when X (deaths) increases by 1, there is a predicted or estimated decrease of 1.25 phone lines.



Term paper - Inferential Statistics                                                                          14
Statistical Techniques




3.2.4 REGRESSION INTERPRETATION
These computations can be helpful because they allow us to make useful predictions about the data. For
example, an increase from 1 to 10 mortalities per 1,000 live births is related with a drop of 119.75 – 108.5 =
11.25 telephone lines.
Interpreting the meaning of a coefficient could be a bit fiddly. What does a coefficient of -1.25 mean?
                Well, it means a negative association between infant mortality and phone lines.
                It means for every extra infant death there is a reduction of 1.25 phone lines.
This is useful info, however is there a gauge that tells us how good we do predicting the observed values? Yes,
the measure is known as R-squared.
3.2.5 R SQUARRED
As stated earlier, there are two components of the total deviation from the mean, which is calculated by the
addition of squares (or total variance).The difference between the mean and the predicted value of Y, this is the
explained part of the deviation, or (Regression Sum of Squares).
The second component is the residual sum of squares (Residual Sum of Squares), which measures prediction
errors. The is the unexplained part of the deviation.
                         Total SS = Regression SS + Residual SS
In other words, the total sum of squares is the sum of the regression sum of squares and the residual sum of
squares.
R2 = Regression SS/TSS
The more variance the regression model explains, the higher the R2.




Term paper - Inferential Statistics                                                                              15
Statistical Techniques

                                        4.0 BIBLIOGRAPHY

   1. Inferential statistics Timeline:
      http://www.google.ae/search?q=inferential+statistics&hl=en&tbo=1&rls=com.microsoft:en-
      us:IE-SearchBox&output=search&source=lnt&tbs=tl:1&sa=X&ei=HMsFTq-
      yKInIrQej2LSmDA&ved=0CBEQpwUoAw&biw=1366&bih=596 [Online]. [Accessed: 23th
      June 2011].

   2. Handbook of Injury and Violence Prevention By Lynda S. Doll, E. N. Haas Chapter 9.3 – Brief
       history of youth violence prevention efforts, pg#159




Term paper - Inferential Statistics                                                                 16

Mais conteúdo relacionado

Mais procurados

1.1-1.2 Descriptive and Inferential Statistics
1.1-1.2 Descriptive and Inferential Statistics1.1-1.2 Descriptive and Inferential Statistics
1.1-1.2 Descriptive and Inferential Statistics
mlong24
 
Basic statistics by Neeraj Bhandari ( Surkhet.Nepal )
Basic statistics by Neeraj Bhandari ( Surkhet.Nepal )Basic statistics by Neeraj Bhandari ( Surkhet.Nepal )
Basic statistics by Neeraj Bhandari ( Surkhet.Nepal )
Neeraj Bhandari
 
Chapter 1 introduction to statistics for engineers 1 (1)
Chapter 1 introduction to statistics for engineers 1 (1)Chapter 1 introduction to statistics for engineers 1 (1)
Chapter 1 introduction to statistics for engineers 1 (1)
abfisho
 
probability and statistics Chapter 1 (1)
probability and statistics Chapter 1 (1)probability and statistics Chapter 1 (1)
probability and statistics Chapter 1 (1)
abfisho
 

Mais procurados (20)

Introduction to Statistics
Introduction to StatisticsIntroduction to Statistics
Introduction to Statistics
 
Introduction to Elementary statistics
Introduction to Elementary statisticsIntroduction to Elementary statistics
Introduction to Elementary statistics
 
Statistics
StatisticsStatistics
Statistics
 
Basic Statistics
Basic  StatisticsBasic  Statistics
Basic Statistics
 
Ses 1 basic fundamentals of mathematics and statistics
Ses 1 basic fundamentals of mathematics and statisticsSes 1 basic fundamentals of mathematics and statistics
Ses 1 basic fundamentals of mathematics and statistics
 
Introduction to statistics
Introduction to statisticsIntroduction to statistics
Introduction to statistics
 
Importance of statistics in chemistry
Importance of statistics in chemistryImportance of statistics in chemistry
Importance of statistics in chemistry
 
The use of statistics in outcomes assessment
The use of statistics in outcomes assessmentThe use of statistics in outcomes assessment
The use of statistics in outcomes assessment
 
1.1-1.2 Descriptive and Inferential Statistics
1.1-1.2 Descriptive and Inferential Statistics1.1-1.2 Descriptive and Inferential Statistics
1.1-1.2 Descriptive and Inferential Statistics
 
Basic statistics by Neeraj Bhandari ( Surkhet.Nepal )
Basic statistics by Neeraj Bhandari ( Surkhet.Nepal )Basic statistics by Neeraj Bhandari ( Surkhet.Nepal )
Basic statistics by Neeraj Bhandari ( Surkhet.Nepal )
 
Probability and statistics
Probability and statisticsProbability and statistics
Probability and statistics
 
Chapter 1 introduction to statistics for engineers 1 (1)
Chapter 1 introduction to statistics for engineers 1 (1)Chapter 1 introduction to statistics for engineers 1 (1)
Chapter 1 introduction to statistics for engineers 1 (1)
 
probability and statistics Chapter 1 (1)
probability and statistics Chapter 1 (1)probability and statistics Chapter 1 (1)
probability and statistics Chapter 1 (1)
 
SCOPE, IMPORTANCE & USES OF STATISTICS
SCOPE, IMPORTANCE & USES OF STATISTICS      SCOPE, IMPORTANCE & USES OF STATISTICS
SCOPE, IMPORTANCE & USES OF STATISTICS
 
Social Statistics
Social StatisticsSocial Statistics
Social Statistics
 
Introduction to Statistics
Introduction to StatisticsIntroduction to Statistics
Introduction to Statistics
 
What is Statistics
What is StatisticsWhat is Statistics
What is Statistics
 
Statistics...
Statistics...Statistics...
Statistics...
 
Measures of Descriptive statistics and Inferential statistics
Measures of Descriptive statistics and Inferential statistics Measures of Descriptive statistics and Inferential statistics
Measures of Descriptive statistics and Inferential statistics
 
Introduction concepts of Statistics
Introduction concepts of StatisticsIntroduction concepts of Statistics
Introduction concepts of Statistics
 

Destaque

Inferential statistics powerpoint
Inferential statistics powerpointInferential statistics powerpoint
Inferential statistics powerpoint
kellula
 
Inferential statistics.ppt
Inferential statistics.pptInferential statistics.ppt
Inferential statistics.ppt
Nursing Path
 
Inferential statistics (2)
Inferential statistics (2)Inferential statistics (2)
Inferential statistics (2)
rajnulada
 
descriptive and inferential statistics
descriptive and inferential statisticsdescriptive and inferential statistics
descriptive and inferential statistics
Mona Sajid
 
Data analysis powerpoint
Data analysis powerpointData analysis powerpoint
Data analysis powerpoint
jamiebrandon
 
Descriptive statistics
Descriptive statisticsDescriptive statistics
Descriptive statistics
kemdoby
 
1a difference between inferential and descriptive statistics (explanation)
1a difference between inferential and descriptive statistics (explanation)1a difference between inferential and descriptive statistics (explanation)
1a difference between inferential and descriptive statistics (explanation)
Ken Plummer
 
Descriptive statistics
Descriptive statisticsDescriptive statistics
Descriptive statistics
Aiden Yeh
 

Destaque (20)

Inferential statistics powerpoint
Inferential statistics powerpointInferential statistics powerpoint
Inferential statistics powerpoint
 
Basic Concepts of Inferential statistics
Basic Concepts of Inferential statisticsBasic Concepts of Inferential statistics
Basic Concepts of Inferential statistics
 
Inferential Statistics
Inferential StatisticsInferential Statistics
Inferential Statistics
 
Inferential statistics.ppt
Inferential statistics.pptInferential statistics.ppt
Inferential statistics.ppt
 
Understanding inferential statistics
Understanding inferential statisticsUnderstanding inferential statistics
Understanding inferential statistics
 
Inferential statistics (2)
Inferential statistics (2)Inferential statistics (2)
Inferential statistics (2)
 
descriptive and inferential statistics
descriptive and inferential statisticsdescriptive and inferential statistics
descriptive and inferential statistics
 
Data analysis powerpoint
Data analysis powerpointData analysis powerpoint
Data analysis powerpoint
 
Inferential Statistics
Inferential StatisticsInferential Statistics
Inferential Statistics
 
Descriptive statistics
Descriptive statisticsDescriptive statistics
Descriptive statistics
 
Chapter 4 presentation of data
Chapter 4 presentation of dataChapter 4 presentation of data
Chapter 4 presentation of data
 
Presentation of data
Presentation of dataPresentation of data
Presentation of data
 
Chapter 10-DATA ANALYSIS & PRESENTATION
Chapter 10-DATA ANALYSIS & PRESENTATIONChapter 10-DATA ANALYSIS & PRESENTATION
Chapter 10-DATA ANALYSIS & PRESENTATION
 
1a difference between inferential and descriptive statistics (explanation)
1a difference between inferential and descriptive statistics (explanation)1a difference between inferential and descriptive statistics (explanation)
1a difference between inferential and descriptive statistics (explanation)
 
Basic Descriptive Statistics
Basic Descriptive StatisticsBasic Descriptive Statistics
Basic Descriptive Statistics
 
Descriptive statistics
Descriptive statisticsDescriptive statistics
Descriptive statistics
 
Descriptive Statistics
Descriptive StatisticsDescriptive Statistics
Descriptive Statistics
 
Data Analysis: Descriptive Statistics
Data Analysis: Descriptive StatisticsData Analysis: Descriptive Statistics
Data Analysis: Descriptive Statistics
 
T test
T testT test
T test
 
Descriptive statistics
Descriptive statisticsDescriptive statistics
Descriptive statistics
 

Semelhante a Inferential statistics

ECO-409-class_one.pptxhrhurutgtututu8yhyyyhyt
ECO-409-class_one.pptxhrhurutgtututu8yhyyyhytECO-409-class_one.pptxhrhurutgtututu8yhyyyhyt
ECO-409-class_one.pptxhrhurutgtututu8yhyyyhyt
MisterPhilips
 
Psychometrics ppt
Psychometrics pptPsychometrics ppt
Psychometrics ppt
santhosh357
 
Nash equilibrium and the history of economic theory by roger b ...
Nash equilibrium and the history of economic theory by roger b ...Nash equilibrium and the history of economic theory by roger b ...
Nash equilibrium and the history of economic theory by roger b ...
sugeladi
 
Why have the Loyalists largely been forgotten in history  Do you .docx
Why have the Loyalists largely been forgotten in history  Do you .docxWhy have the Loyalists largely been forgotten in history  Do you .docx
Why have the Loyalists largely been forgotten in history  Do you .docx
alanfhall8953
 
Hans hermann hoppe - a theory of socialism and capitalism
Hans hermann hoppe - a theory of socialism and capitalismHans hermann hoppe - a theory of socialism and capitalism
Hans hermann hoppe - a theory of socialism and capitalism
Daniel Diaz
 
2 hours agoLuke Powell Main Post - Luke PowellCOLLAPSETo.docx
2 hours agoLuke Powell Main Post - Luke PowellCOLLAPSETo.docx2 hours agoLuke Powell Main Post - Luke PowellCOLLAPSETo.docx
2 hours agoLuke Powell Main Post - Luke PowellCOLLAPSETo.docx
vickeryr87
 
Student Name CATEGORY AND CRITERIA(A+)15(A).docx
Student Name CATEGORY AND CRITERIA(A+)15(A).docxStudent Name CATEGORY AND CRITERIA(A+)15(A).docx
Student Name CATEGORY AND CRITERIA(A+)15(A).docx
deanmtaylor1545
 
Student Name CATEGORY AND CRITERIA(A+)15(A).docx
Student Name CATEGORY AND CRITERIA(A+)15(A).docxStudent Name CATEGORY AND CRITERIA(A+)15(A).docx
Student Name CATEGORY AND CRITERIA(A+)15(A).docx
cpatriciarpatricia
 

Semelhante a Inferential statistics (20)

Mmpi scale
Mmpi scaleMmpi scale
Mmpi scale
 
Abbott Q Paper0102
Abbott Q Paper0102Abbott Q Paper0102
Abbott Q Paper0102
 
ECO-409-class_one.pptxhrhurutgtututu8yhyyyhyt
ECO-409-class_one.pptxhrhurutgtututu8yhyyyhytECO-409-class_one.pptxhrhurutgtututu8yhyyyhyt
ECO-409-class_one.pptxhrhurutgtututu8yhyyyhyt
 
Thesis_Spin Doctoring
Thesis_Spin DoctoringThesis_Spin Doctoring
Thesis_Spin Doctoring
 
Revisiting the Matthew effect
Revisiting the Matthew effectRevisiting the Matthew effect
Revisiting the Matthew effect
 
Explanation Essay Examples. Explanation Essay Examples.
Explanation Essay Examples. Explanation Essay Examples.Explanation Essay Examples. Explanation Essay Examples.
Explanation Essay Examples. Explanation Essay Examples.
 
Psychometrics ppt
Psychometrics pptPsychometrics ppt
Psychometrics ppt
 
Status of Statistics
Status of StatisticsStatus of Statistics
Status of Statistics
 
Nash equilibrium and the history of economic theory by roger b ...
Nash equilibrium and the history of economic theory by roger b ...Nash equilibrium and the history of economic theory by roger b ...
Nash equilibrium and the history of economic theory by roger b ...
 
Why have the Loyalists largely been forgotten in history  Do you .docx
Why have the Loyalists largely been forgotten in history  Do you .docxWhy have the Loyalists largely been forgotten in history  Do you .docx
Why have the Loyalists largely been forgotten in history  Do you .docx
 
Hans hermann hoppe - a theory of socialism and capitalism
Hans hermann hoppe - a theory of socialism and capitalismHans hermann hoppe - a theory of socialism and capitalism
Hans hermann hoppe - a theory of socialism and capitalism
 
2 hours agoLuke Powell Main Post - Luke PowellCOLLAPSETo.docx
2 hours agoLuke Powell Main Post - Luke PowellCOLLAPSETo.docx2 hours agoLuke Powell Main Post - Luke PowellCOLLAPSETo.docx
2 hours agoLuke Powell Main Post - Luke PowellCOLLAPSETo.docx
 
Essays On Stereotypes.pdf
Essays On Stereotypes.pdfEssays On Stereotypes.pdf
Essays On Stereotypes.pdf
 
ArticoloInglese
ArticoloIngleseArticoloInglese
ArticoloInglese
 
Student Name CATEGORY AND CRITERIA(A+)15(A).docx
Student Name CATEGORY AND CRITERIA(A+)15(A).docxStudent Name CATEGORY AND CRITERIA(A+)15(A).docx
Student Name CATEGORY AND CRITERIA(A+)15(A).docx
 
Safety Essay.pdf
Safety Essay.pdfSafety Essay.pdf
Safety Essay.pdf
 
Surrogate Science: How Fisher, Neyman-Pearson, and Bayes Were Transformed int...
Surrogate Science: How Fisher, Neyman-Pearson, and Bayes Were Transformed int...Surrogate Science: How Fisher, Neyman-Pearson, and Bayes Were Transformed int...
Surrogate Science: How Fisher, Neyman-Pearson, and Bayes Were Transformed int...
 
Student Name CATEGORY AND CRITERIA(A+)15(A).docx
Student Name CATEGORY AND CRITERIA(A+)15(A).docxStudent Name CATEGORY AND CRITERIA(A+)15(A).docx
Student Name CATEGORY AND CRITERIA(A+)15(A).docx
 
History of Inddustrial Psychology
History of Inddustrial PsychologyHistory of Inddustrial Psychology
History of Inddustrial Psychology
 
Good Cause And Effect Essay. 100 Important Cause and Effect Essay Topics
Good Cause And Effect Essay. 100 Important Cause and Effect Essay TopicsGood Cause And Effect Essay. 100 Important Cause and Effect Essay Topics
Good Cause And Effect Essay. 100 Important Cause and Effect Essay Topics
 

Último

Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
Joaquim Jorge
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
giselly40
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
Earley Information Science
 

Último (20)

🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 

Inferential statistics

  • 1. Statistical Techniques Inferential Statistics Term paper - Inferential Statistics 1
  • 2. Statistical Techniques Table of Contents 1.0 WHAT IS INFERENTIAL STATISTICS?.................................................................................................3 2.0 BRIEF TIMELINE OF INFERENTIAL STATISTICS .............................................................................4 3.0 TEST OF WHAT? ..........................................................................................................................................6 3.1 HYPOTHESIS – INTRODUCTION ...............................................................................................6 3.1.2 ERRORS IN SAMPLING.................................................................................................7 3.1.3 STUDENT’s T-TEST ........................................................................................................9 3.1.4 CHI-SQUARE TEST.......................................................................................................10 3.2 REGRESSION? ...............................................................................................................................12 3.2.1 REGRESSION MODELS...............................................................................................12 3.2.2 SCATTER-PLOTS ..........................................................................................................12 3.2.3 REGRESSION EQUATION ..........................................................................................12 3.2.4 REGRESSION INTERPRETATION............................................................................15 3.2.5 R SQUARRED .................................................................................................................15 4.0 BIBLIOGRAPHY.........................................................................................................................................16 Term paper - Inferential Statistics 2
  • 3. Statistical Techniques 1.0 WHAT IS INFERENTIAL STATISTICS? The vital key to the difference between descriptive and inferential statistics are the capitalized words in the description: CAN DESCRIBE, COULD NOT CONCLUDE, AND REPRESENTATIVE OF. Descriptive statistics can only describe the actual sample you study. But to extend your conclusions to a broader population, like all such classes, all workers, all women, you must use inferential statistics, which means you have to be sure the sample you study is representative of the group you want to generalize to. Allow me to exemplify: i. The study at the local mall and cannot be used to claim that what you find is valid for all shoppers and all malls. ii. Another example would be a study conducted on an intermediate college can’t claim that what you find is valid for the colleges of all levels (i.e. General Population). iii. Also visualize a survey conducted at a women's club that includes a majority of a particular single ethnic group cannot claim that what you find is valid for women for all ethnic groups. As you can see, descriptive statistics are useful and serviceable if you don't need to extend your results to whole segments of the population. But the social sciences tend to esteem studies that give us more or less "universal" truths, or at least truths that apply to large segments of the population, like all teenagers, all parents, all women, all perpetrators, all victims, or a fairly large segment of such groups. Leaving aside the theoretical and mechanical soundness of such an investigation for some kind of broad conclusion, various statistical approaches are to be utilized if one aspires to generalize. And the primary distinction is that of SAMPLING. One must choose a sample that is REPRESENTATIVE OF THE GROUP TO WHICH YOU PLAN TO GENERALIZE. To round up, Descriptive statistics are for describing data on the group you study, While Inferential statistics are for generalizing your findings to a broader population group. Term paper - Inferential Statistics 3
  • 4. Statistical Techniques 2.0 BRIEF TIMELINE OF INFERENTIAL STATISTICS 1733 1733 - In the 1700s, it was Thomas Bayer who gave birth to the concept of inferential statistics. The normal distribution was discovered in 1733 by a Huguenot refugee de Moivre as an approximation to the binomial distribution when the number of trials is too large. Today, not only do scientists but also many professions rely on statistics to understand behaviour and ideally make predictions about what circumstances relate to or cause these behaviours. 1796 Historical Note: In 1796, Adophe Quetelet investigated the characteristics of French conscripts to determine the "average man." Florence Nightingale was so influenced by Quetelet's work that she began collecting and analyzing medical records in the military hospitals during the Crimean War. Based on her work hospitals began keeping accurate records on their patients, to provide better follow-up care. 1894 1894 - At the inception of the social survey, research results were confronted with the developments in inferential statistics. In 1894, Booth wrote The Aged Poor in England and Wales: Conditions? In this volume Booth claimed that there was no relationship between the ratio of welfare (out-of-doors relief) and workhouse relief (in-relief and the incidence of poverty by parish (or poor law union). 1896 Dec 1896 - Walker died rather suddenly at the age of 56, just days after giving the address opening the first meeting of ASA outside Boston—in Washington, DC in December 1896. That meeting led to the founding of the Washington Statistical Society. His achievements in developing major federal data systems, in promoting the organizational development of statistics, and of bringing statistical ideas to a wide audience, left the field much richer than he found it. 1899 1899 - Since inferential social statistics are primarily concerned with correlation and regression. To prove this Yule published his paper on poverty in London in 1899, this concern has occurred in a context of establishing causality. Often investigators seem to view statistical modeling as being equivalent to a regression model. The reader is cautioned that my critique of regression analysis is not necessarily equivalent to denying the value of empirical research. Term paper - Inferential Statistics 4
  • 5. Statistical Techniques 1925 1925 – First, Sigmund Freud had developed a theory that self-explained the reasons for aggression and juvenile criminal behaviours in terms of childhood experiences. Second, in 1925, RA Fisher published Statistical Methods for Research Workers in which he identified an effective experimental paradigm that included control groups and inferential statistics. Freud's theory and Fisher's paradigm provided a basis so that mental health professionals could initiate studies to identify many mental behaviours 1930 IN 1930, THE YEAR the CH Stoelting Co. of Chicago published what was to be the largest- ever catalog of psychological apparatus, there was virtually no use of inferential statistics in psychology, in spite of the fact that William Sealey Cosset had long since presented the T-test and Sir Ronald Fisher had presented the general logic of null hypothesis testing. Only after Fisher's epochal introduction to analysis of variance procedures did psychologists even notice the procedure. 1930 1930 - The fiducial argument, which Fisher produced in 1930, generated much controversy and did not survive the death of its creator. Fisher created many terms in everyday use, eg statistic and sampling distribution and so there are many references to his work on the Words pages. Symbols in Statistics are his contributions to notation. 1935 1935 - In the two decades following the publication of Ronald Aylmer Fisher's Design of Experiments in 1935, Fisher's link between experimental design and inferential statistics became institutionalized in American experimental psychology. 1936 Apr 27, 1936 - . Pearson founded the journal Biometrics and was the editor of Annals of Eugenics. Because of his fundamental work in the development of modern statistics, many scholars today regard Pearson as the founder of 20th-century statistics. He died in Coldharbour, England, on April 27, 1936. 1977 1977 - The youth violence prevention landscape has changed drastically in the last quarter century. In 1977, Wright and Dixon published a review of “Juveniles delinquency prevention program” reports. The results were disappointing. From approximately 6600 program abstracts, empirical data were available from only 96 . Of the 96 empirical reports, only 9 used random assignment of subjects, inferential statistics, outcomes measure of delinquency, and a follow-up period of at least six months. Of those 9, only 3 reported positive outcomes, and these three were based on the three smallest sample sizes among the 9 reports. The authors concluded that the literature was low in both scientific and policy utility. By contrast today dozens of summaries of research on prevention practices are available. Term paper - Inferential Statistics 5
  • 6. Statistical Techniques 1981 Jun 3, 1981 - Education Practical Statistics for Educators An introduction to the basic ideas of descriptive and inferential statistics as applied to the work of the classroom teachers counselors and administrators In the public schools Emphasis Is upon practical applications of statistics to problems. 1986 Dec 17, 1986 - Koop acknowleged that the proof of these smoker’s deaths was "inferential, of course," based on analyses of statistics gathered in past studies, including several in Japan, Hong Kong, Taiwan, Europe and the United States. 1995 Jan 1995 - A jury trial on compensatory damages was held in January 1995. Dannemiller testified that the selection of the random sample met the standards of inferential statistics, that the successful efforts to locate and obtain testimony from the claimants in the random sample "were of the highest standards " in his profession, that the procedures followed conformed to the standards of inferential statistics. 3.0 TEST OF WHAT? Tests of significance are helpful in problems of generalization. A Chi-Square or a T-Test tells you the probability that the results you found in the group under study represent the population of the chosen group. It can be frequently observed, Chi-Square or a t-test gives you the probability that the results found could have occurred by chance when there is really no relationship at all between the variables you studied in the population. A known method used in inferential statistics is estimation. In estimation, the sample is used to estimate a parameter, and a confidence interval about the estimate is constructed. Other examples of inferential statistics methods include i. Hypothesis testing ii. Linear regression 3.1 HYPOTHESIS – INTRODUCTION Hyptothesis is a statement about the population parameter or about a population distribution. The testing of hypothesisis conducted in two phases. In the first phase, a test is designed where we decide as to when can the null hypothesis be rejected. In the second phase, the designed test is used to draw the conclusion. Hypothesis testing is to test some hypothesis about parent population from which the sample is drawn. DEFINITIONS PARAMETER - The statistical constants of the population namely mean (µ) , variance are usually referred to as parameters. STATISTIC - Statistical measures computed from the sample observations alone namely mean X Variance S2 have been termed as Statistic. Term paper - Inferential Statistics 6
  • 7. Statistical Techniques UNBIASED ESTMATE - A statistic t = t (X1, X2, …..Xn), a function of the sample values X1, X2,…….Xn is an unbiased estimate of the population parameter 0, if E(t) = 0. In other words, if E(Statistic) = Paramater, then statistic is said to be an unbiased estimate of the parameter. SAMPLING DISTRIBUTION OF A STATISTIC - If we draw a sample of size n from a given finite population of size N, then the total number of possible samples is /n!(N-n)! = K STANDARD ERROR - The standard deviation of the sampling distribution of a statistic is known as it standard error. NULL HYPOTHESIS - A definite statement about the population parameter which is usually a hypothesis of no difference is called Null Hypothesis an is usually denoted by Ho ALTERNATIVE HYPOTHESIS - Any hypothesis which is complementary to the null hypothesis is called an alternative hypothesis usually denoted by H1. For example, if we want to test the null hypothesis that the population has a specified mean Mo (say) is Ho : µ - µo then the alternative hypothesis could be a) H1 : µ ≠ µ o b) H1 : µ > µ o c) H1 : µ < µ o 3.1.1 PROCEDURE FOR TESTING OF HYPOTHESIS Various steps in testing of a statistical hypothesis in a systematic manner : 1. Null hypothesis : Set up the null hypothesis H0 2. Alternative Hypothesis : Set up the alternative hypothesis H1. This will be enable us to decide whether we have to use a single tailed(right or left) test of two-tailed test. 3. Level of Significance : To choose the appropriate level of significance (x) 4. Test Statistic : To compute the test statistic : Z = t-E(t)/S1E(t) , under Ho 5. Conclusion : We compare the computed value of Z with the significant value Z2, at the given level of significance, if │z │ <z2 we say it not significant, it │z│ >z is then we say that it is significant and the null hypothesis is rejected at level of significance. 3.1.2 ERRORS IN SAMPLING The main objective in sampling theory is to draw valid inference about the population parameters on the basis of the sample results. In practice, we decide to accept or reject the lot after examining a sample from it. As such we are able to commit the following two types of errors : TYPE 1 ERROR : Reject Ho when it is true, TYPE II ERROR : Accept Ho when it is wrong, ie. Accept Ho when H1 is true If we mention P(Accept Ho when it is wrong) = P(Accept Ho/H1) = β and P(Reject Ho when it is true) = P(Reject Ho/H1) = x then 2 and β are called the sizes of type 1 error and type II error, respectively. In practice, type I error amounts to rejecting a lot when it is good and type II error may be Term paper - Inferential Statistics 7
  • 8. Statistical Techniques regarded as accepting the lot when it is bad. Thus P(Reject a lot when it is good) = α and P (Accept a lot when it is bad) = β where α and β are referred to as producer’s risk and consumer’s risk respectively. CRITICAL REGION A region in the sample splace S which amounts to rejection of Ho is termed as critical region of rejection. ONE – TAILED AND TWO-TAILED TESTS Ho: µ > µ o (Right-tailed), the critical region lies entirely in the right tail H1: µ < µ o (left-tailed), the critical region lies entirely in the left tail. A test of statistical hypothesis where the alternative hypothesis is two – tailed tests such as Ho:µ=µ o against the alternative hypothesis H1:µ=µ o isknown as two tailed test and in such a case the critical region is given by the portion of the area lying on both tailsof the probability curve of the test statistic. CRITICAL VALUE OR SIGNIFICANT VALUES The value of test statistic which separates the critical (or rejection) region and the acceptance region is called the critical value or significant value. It depends on : 1) The level of significance used, and 2) The alternative hypothesis, whether it is two-tailed of single-tailed The standardized variable corresponding to the statistic t namely Z = The value of z above under the null hypothesis is known as test statistic. The critical value of the test statistic at level of significance 2 for a two-tailed test is given by Z, where Z is determined by the equation : P(1Z1>Z o ) = α i.e., Zα is the value so that the total area of the critical region on both tails is 2. Since normal probability curve is a symmetrical curve. In case of a single-tail alternative, the critical value of Zα is determined so that total area to the right of it (for right-tailed test) is α and for left-tailed test the total area to the left of (-Zα) is α Thus the significant or critical value of Z for a single-tailed test (left or right) at level of significance α is same as the critical value of Z for a two-tailed test at level of significance ‘α’. Please find below the critical values of Z at commonly used levels of significance for both two-tailed and single-tailed tests Critical Value Z2 LEVEL OF SIGNIFICANCE 1% 5% 10% Two tailed test │Zα│ = 2.58 │Zα│ = 1.96 │Zα│ = 1.645 Right tailed test Zα = 2.33 Zα = 1.645 Zα = 1.28 Left tailed test Zα = 2.33 Zα = 1.645 Zα = 1.28 TEST OF SIGNIFICANCE OF A SINGLE MEAN If X1, X2, …….Xn, in a random sample of size n from a normal population with mean M and variance 2, then the sample mean is distributed normally with mean M and variance . Term paper - Inferential Statistics 8
  • 9. Statistical Techniques Null Hypothesis, Ho - The sample has been drawn from a population with mean M and variance , ie there is no significance difference between the sample mean(X) and population mean(M), the test statistic (for large samples), is Z = If the population Standard Deviation is unknown, then we use its estimate provided by the sample variance given by (for large samples) TEST OF SIGNIFICANCE FOR DIFFERENCE OF MEANS The mean of random sample of size n, from a population with Mean M, and Variance and let be the mean of an independent random sample of size n2 from another population with mean M2 and variance ? then, since sample size are large. TEST OF SIGNIFICANCE FOR THE DIFFERENCE OF STANDARD DEVIATION If S1 and S2 are the standard deviation of two independent samples, then under null hypothesis, Ho : 1= 2 i.e the sample standard deviations don’t differ significantly. 1) (for large samples) But in case of large samples, the S.E of the difference of the sample standard deviations is given by 3.1.3 STUDENT’s T-TEST The entire large sample theory was based on the application of “normal test”. However if the sample size n is small, the distribution of the various statistics are far from normally and as such ‘normal test’ cannot be applied if n is mall. In such cases exact sample tests, pioneered by W.S.Gosst(1908) who wrote under the pen name-of student, and later on developed and extended by Prof.R.A.Fisher(1926) are used. Applications Of T-Distribution The t-distribution has a wide number of applications in statistics, and some of which are 1) To test if the sample mean( ) differs significantly from the hypothetical value µ of the population mean. 2) To test the significance of the difference between two sample means. 3) To test the significance of an observed sample correlation and sample regression coefficient. 4) To test the significance of observed partial correlation coefficient. T-Test For Single Mean All hypothesis testing is done under the assumption the null hypothesis is true Population Standard Deviation Known If the population standard deviation, sigma, is known, then the population mean has a normal distribution, and you will be using the z-score formula for sample means. The test Term paper - Inferential Statistics 9
  • 10. Statistical Techniques statistic is the standard formula you've seen before. The critical value is obtained from the normal table, or the bottom line from the t-table. Population Standard Deviation Unknown If the population standard deviation, sigma, is unknown, then the population mean has a student's T-distribution, and you will be using the t-score formula for sample means. The test statistic is very similar to that for the z-score, except that sigma has been replaced by s and z has been replaced by t. The critical value is obtained from the t-table. The degree of freedom for this test is n-1. If you're performing a t-test where you found the statistics on the calculator (as opposed to being given them in the problem), then use the VARS key to pull up the statistics in the calculation of the test statistic. This will save you data entry and avoid round off errors. General Pattern Notice the general pattern of these test statistics is (observed - expected) / standard deviation. 3.1.4 CHI-SQUARE TEST A chi-square test (also chi squared test or χ2 test) is any statistical hypothesis test in which the sampling distribution of the test statistic is a chi-square distribution when the null hypothesis is true, or any in which this is asymptotically true, meaning that the sampling distribution (if the null hypothesis is true) can be made to approximate a chi-square distribution as closely as desired by making the sample size large enough. Chi-Square Test In Contigency Table CHI-SQUARE distribution is utlised to determine the critical value of the chi-square variate at various level of significance. Properties : (1) The value of chi-square varies from 0 to α. (2) When each Oi = Ei, the value of chi-square is zero. (3) Chi-square can never be negative Term paper - Inferential Statistics 10
  • 11. Statistical Techniques CONTIGENCY TABLE : The test if independence of attributes when the frequencies are presented in a two way table according to two attributes classified to various categories known as the contigency table. Test of hypothesis in a contingency table. A contingency is a rectangular array having rows and colums ascertaining to the categories of the attributes of A & B. The null hypothesis : H0 : Two attributes are independent vs H1 : two attributes are dependant on each other . Statistics X2 has (p-1) (q-1) d.f Under Ho, the indepdendence of attributes, the expected frequency, Eij = ith row total x jith column N = Ri x Cj n Decision : The calculated value compared with tabulated value of X2 for (P-1) (Q-1) d.f. & prefixed level of significance α. Calculation X2 > reject Ho, if Calculation < X2 – tab – accept Ho. CONTIGENCY TABLE OF ORDER 2X2 DIRECT FORMULAR FOR 2x2 = n(ad-bc)2 (a+b) (c+d) (a+c) (b+d) X2 has 1 d.f. B1 B2 A1 A B a+b A2 C (cell) D c+d a+c B+d a+b+c+d = n Calculation X2>X2α1, reject Ho Calculation X2<x2α1, accept Ho Term paper - Inferential Statistics 11
  • 12. Statistical Techniques 3.2 REGRESSION? Regression analysis is a method for determining the relationship between two variables. The regression statistical skeleton is at the core of observed social and political science research. Regression analysis works as a statistical substitute for controlled experiments, and can be used to make causal inferences. 3.2.1 REGRESSION MODELS Researchers render verbal theories, hypothesis, even intuition into models. A model illustrates how and under what circumstances two (or more) variables are linked. A regression model with a dependent variable and one independent variable is known as a bi-variate regression model. A regression model with a dependent variable and two or more independent variables and/or control variables is known as a multivariate regression model. Example: The dataset "Televisions, Doctors, and Smokers" contains, among other variables, the number of smokers per television set and the number of smokers per physician for 50 countries. 3.2.2 SCATTER-PLOTS The X axis normally depicts the values of the independent variable, while the Y axis represents the value of the dependent variable. Scatter-plots allow you to study the flow of the dots, or the relationship between the two variables Scatter-plots allow political scientists to identify : • Positive or negative relationships • Monotonic or linear relationships 3.2.3 REGRESSION EQUATION The linear equation is specified as follows: Y = a + bX Where Y = dependent variable X = independent variable a = constant (value of Y when X = 0) b = is the slope of the regression line “a” can be positive or negative. Referred to “a” as the intercept, “a” is the point at which the slope line passes through the Y axis. “b” (the slope coefficient) can be positive or negative. A positive coefficient denotes a positive relationship and a negative coefficient denotes a negative relationship. The significant interpretation of the slope coefficient depends on the variables involved, how they are coded and the dimension of the variables. Larger coefficients may indicate a solid relationship, but not necessarily. Term paper - Inferential Statistics 12
  • 13. Statistical Techniques The goal of regression analysis is to find an equation which “best fits” the data. In regression, the equation is found in such a manner such that its graph is a line that reduces the squared vertical distances amid the data points and the lines drawn. “d1” and “d2” illustrate the distances of observed data points from an approximate regression line. Regression analysis bring into play a mathematical equation that locates the single line that reduces the squared distances from the line. The standard regression equation is the same as the linear equation with one exception: the error factor. Y = α + βX + ε Where Y = dependent variable α = constant term β = slope or regression coefficient X = independent variable ε = error term This regression process is called ordinary least squares (OLS). α (the constant term) interpreted the same as earlier β (the regression coefficient) tells how much Y changes if X changes by one unit. The regression coefficient indicates the inclination and strength of the relationship between the two quantitative variables. The error (ε) denotes that observed data does not follow a tidy pattern that can be summarized with a straight line. A observation's score on Y can be split as the following two parts: α + βX is due to the independent variable ε is due to error Observed value = Predicted value (α + βX) + error (ε) The error is the difference between the predicted value of Y and the observed value of Y. This is known as the residual. Term paper - Inferential Statistics 13
  • 14. Statistical Techniques Example: Lets take an example to clarify what we theoretically know: In above data on the scatterplot: Y (dependent variable) = telephone lines for 1,000 people X (independent variable) = Infant mortality We will utilize regression to look at the relationship connecting communication capacity (measured here as telephone lines per capita) and infant mortality. In this example, the intercept and regression coefficient are as follows: α (or constant) = 121 Means that when X (infant deaths) is 0 deaths, there are 121 phone lines per 1,000 population. β = -1.25 Means that when X (deaths) increases by 1, there is a predicted or estimated decrease of 1.25 phone lines. Term paper - Inferential Statistics 14
  • 15. Statistical Techniques 3.2.4 REGRESSION INTERPRETATION These computations can be helpful because they allow us to make useful predictions about the data. For example, an increase from 1 to 10 mortalities per 1,000 live births is related with a drop of 119.75 – 108.5 = 11.25 telephone lines. Interpreting the meaning of a coefficient could be a bit fiddly. What does a coefficient of -1.25 mean? Well, it means a negative association between infant mortality and phone lines. It means for every extra infant death there is a reduction of 1.25 phone lines. This is useful info, however is there a gauge that tells us how good we do predicting the observed values? Yes, the measure is known as R-squared. 3.2.5 R SQUARRED As stated earlier, there are two components of the total deviation from the mean, which is calculated by the addition of squares (or total variance).The difference between the mean and the predicted value of Y, this is the explained part of the deviation, or (Regression Sum of Squares). The second component is the residual sum of squares (Residual Sum of Squares), which measures prediction errors. The is the unexplained part of the deviation. Total SS = Regression SS + Residual SS In other words, the total sum of squares is the sum of the regression sum of squares and the residual sum of squares. R2 = Regression SS/TSS The more variance the regression model explains, the higher the R2. Term paper - Inferential Statistics 15
  • 16. Statistical Techniques 4.0 BIBLIOGRAPHY 1. Inferential statistics Timeline: http://www.google.ae/search?q=inferential+statistics&hl=en&tbo=1&rls=com.microsoft:en- us:IE-SearchBox&output=search&source=lnt&tbs=tl:1&sa=X&ei=HMsFTq- yKInIrQej2LSmDA&ved=0CBEQpwUoAw&biw=1366&bih=596 [Online]. [Accessed: 23th June 2011]. 2. Handbook of Injury and Violence Prevention By Lynda S. Doll, E. N. Haas Chapter 9.3 – Brief history of youth violence prevention efforts, pg#159 Term paper - Inferential Statistics 16