SlideShare uma empresa Scribd logo
1 de 35
Statistics for Quantitative Analysis
• Statistics: Set of mathematical tools used to describe
and make judgments about data
• Type of statistics we will talk about in this class has
important assumption associated with it:
Experimental variation in the population from which samples
are drawn has a normal (Gaussian, bell-shaped) distribution.
- Parametric vs. non-parametric statistics
CHM 235 – Dr. Skrabal
Normal distribution
• Infinite members of group:
population
• Characterize population by taking
samples
• The larger the number of samples,
the closer the distribution becomes to
normal
• Equation of normal distribution:
22
2/)(
2
1 σµ
πσ
−−
= x
ey
Normal distribution
• Estimate of mean value
of population = µ
• Estimate of mean value
of samples =
Mean =
x
n
x
x i
i∑
=
Normal distribution
• Degree of scatter (measure of
central tendency) of population is
quantified by calculating the
standard deviation
• Std. dev. of population = σ
• Std. dev. of sample = s
• Characterize sample by calculating
1
)( 2
−
−∑
=
n
xx
s
i
i
sx ±
Standard deviation and the
normal distribution
• Standard deviation
defines the shape of the
normal distribution
(particularly width)
• Larger std. dev. means
more scatter about the
mean, worse precision.
• Smaller std. dev. means
less scatter about the
mean, better precision.
Standard deviation and the
normal distribution
• There is a well-defined relationship
between the std. dev. of a population
and the normal distribution of the
population:
∀ µ ± 1σ encompasses 68.3 % of
measurements
∀ µ ± 2σ encompasses 95.5% of
measurements
∀ µ ± 3σ encompasses 99.7% of
measurements
• (May also consider these
percentages of area under the curve)
Example of mean and standard
deviation calculation
Consider Cu data: 5.23, 5.79, 6.21, 5.88, 6.02 nM
= 5.826nM  5.82 nM
s = 0.368 nM  0.36 nM
Answer: 5.82 ± 0.36 nM or 5.8 ± 0.4 nM
Learn how to use the statistical functions on your
calculator. Do this example by longhand calculation
once, and also by calculator to verify that you’ll get
exactly the same answer. Then use your calculator
for all future calculations.
x
Relative standard deviation (rsd)
or coefficient of variation (CV)
rsd or CV =
From previous example,
rsd = (0.36 nM/5.82 nM) 100 = 6.1% or 6%
100





x
s
Standard error
• Tells us that standard deviation of set of samples should
decrease if we take more measurements
• Standard error =
• Take twice as many measurements, s decreases by
• Take 4x as many measurements, s decreases by
• There are several quantitative ways to determine the sample
size required to achieve a desired precision for various statistical
applications. Can consult statistics textbooks for further
information; e.g. J.H. Zar, Biostatistical Analysis
n
s
sx =
4.12 ≈
24 =
Variance
Used in many other statistical calculations and tests
Variance = s2
From previous example, s = 0.36
s2
= (0.36)2
= 0. 129 (not rounded because it is usually
used in further calculations)
Average deviation
• Another way to express
degree of scatter or
uncertainty in data. Not as
statistically meaningful as
standard deviation, but
useful for small samples.
Using previous data:
n
xx
d i
i∑ −
=
)(
5
8.502.68.588.58.521.68.579.58.523.5 22222 −+−+−+−+−
=d
nMord 2.02.025.0 5→=
nMornMAnswer 2.08.52.08.5: 52 ±±
Relative average deviation (RAD)
Using previous data,
RAD = (0. 25/5.82) 100 = 4.2 or 4%
RAD = (0. 25/5.82) 1000 = 42 ppt
 4.2 x 101
or 4 x 101
ppt (0
/00)
),(1000
)(100
pptthousandperpartsas
x
d
RAD
percentageas
x
d
RAD






=






=
Some useful statistical tests
• To characterize or make judgments about data
• Tests that use the Student’s t distribution
– Confidence intervals
– Comparing a measured result with a “known”
value
– Comparing replicate measurements (comparison
of means of two sets of data)
From D.C. Harris (2003) Quantitative Chemical Analysis, 6th
Ed.
Confidence intervals
• Quantifies how far the true mean (µ) lies from the
measured mean, . Uses the mean and standard
deviation of the sample.
where t is from the t-table and n = number of
measurements.
Degrees of freedom (df) = n - 1 for the CI.
n
ts
x ±=µ
x
Example of calculating a
confidence interval
Consider measurement of dissolved Ti
in a standard seawater (NASS-3):
Data: 1.34, 1.15, 1.28, 1.18, 1.33, 1.65,
1.48 nM
DF = n – 1 = 7 – 1 = 6
= 1.34 nM or 1.3 nM
s = 0.17 or 0.2 nM
95% confidence interval
t(df=6,95%) = 2.447
CI95 = 1.3 ± 0.16 or 1.3 ± 0.2 nM
50% confidence interval
t(df=6,50%) = 0.718
CI50 = 1.3 ± 0.05 nM
x
n
ts
x ±=µ
Interpreting the confidence interval
• For a 95% CI, there is a 95% probability that the true
mean (µ) lies between the range 1.3 ± 0.2 nM, or
between 1.1 and 1.5 nM
• For a 50% CI, there is a 50% probability that the true
mean lies between the range 1.3± 0.05 nM, or
between 1.25 and 1.35 nM
• Note that CI will decrease as n is increased
• Useful for characterizing data that are regularly
obtained; e.g., quality assurance, quality control
Comparing a measured result
with a “known” value
• “Known” value would typically be a certified value
from a standard reference material (SRM)
• Another application of the t statistic
Will compare tcalc to tabulated value of t at appropriate df
and CL.
df = n -1 for this test
n
s
xvalueknown
tcalc
−
=
Comparing a measured result
with a “known” value--example
Dissolved Fe analysis verified using NASS-3 seawater SRM
Certified value = 5.85 nM
Experimental results: 5.76 ± 0.17 nM (n = 10)
(Keep 3 decimal places for comparison to table.)
Compare to ttable; df = 10 - 1 = 9, 95% CL
ttable(df=9,95% CL) = 2.262
If |tcalc| < ttable, results are not significantly different at the 95% CL.
If |tcalc| ≥ ttable, results are significantly different at the 95% CL.
For this example, tcalc < ttest, so experimental results are not significantly
different at the 95% CL
674.110
1.0
7.585.5
7
6
=
−
=
−
= n
s
xvalueknown
tcalc
Comparing replicate measurements or
comparing means of two sets of data
• Yet another application of the t statistic
• Example: Given the same sample analyzed by two
different methods, do the two methods give the “same”
result?
Will compare tcalc to tabulated value of t at appropriate df
and CL.
df = n1 + n2 – 2 for this test
21
2121
nn
nn
s
xx
t
pooled
calc
+
−
=
2
)1()1(
21
2
2
21
2
1
−+
−+−
=
nn
nsns
spooled
Comparing replicate measurements
or comparing means of two sets of
data—example
Method 1: Atomic absorption
spectroscopy
Data: 3.91, 4.02, 3.86, 3.99 mg/g
= 3.945 mg/g
= 0.073 mg/g
= 4
Method 2: Spectrophotometry
Data: 3.52, 3.77, 3.49, 3.59 mg/g
= 3.59 mg/g
= 0.12 mg/g
= 4
1x
Determination of nickel in sewage
sludge
using two different methods
2x
1s 2s
1n 2n
Comparing replicate measurements or
comparing means of two sets of data—
example
0993.0
244
)14()1.0()14()07.0(
2
)1()1( 2
2
2
3
21
2
2
21
2
1
=
−+
−+−
=
−+
−+−
=
nn
nsns
spooled
056.5
44
)4)(4(
0993.0
5.394.3 95
21
2121
=
+
−
=
+
−
=
nn
nn
s
xx
t
pooled
calc
Note: Keep 3 decimal places to compare to ttable.
Compare to ttable at df = 4 + 4 – 2 = 6 and 95% CL.
ttable(df=6,95% CL) = 2.447
If |tcalc| < ttable, results are not significantly different at the 95%. CL.
If |tcalc| ≥ ttable, results are significantly different at the 95% CL.
Since |tcalc| (5.056) ≥ ttable (2.447), results from the two methods are
significantly different at the 95% CL.
Comparing replicate measurements or
comparing means of two sets of data
Wait a minute! There is an important assumption
associated with this t-test:
It is assumed that the standard deviations (i.e., the
precision) of the two sets of data being compared are not
significantly different.
• How do you test to see if the two std. devs. are different?
• How do you compare two sets of data whose std. devs.
are significantly different?
F-test to compare standard
deviations
• Used to determine if std. devs. are significantly
different before application of t-test to compare
replicate measurements or compare means of two
sets of data
• Also used as a simple general test to compare the
precision (as measured by the std. devs.) of two sets
of data
• Uses F distribution
F-test to compare standard
deviations
Will compute Fcalc and compare to Ftable.
DF = n1 - 1 and n2 - 1 for this test.
Choose confidence level (95% is a typical CL).
212
2
2
1
sswhere
s
s
Fcalc >=
From D.C. Harris (2003) Quantitative Chemical Analysis, 6th Ed.
F-test to compare standard deviations
From previous example:
Let s1 = 0.12 and s2 = 0.073
Note: Keep 2 or 3 decimal places to compare with Ftable.
Compare Fcalc to Ftable at df = (n1 -1, n2 -1) = 3,3 and 95% CL.
If Fcalc < Ftable, std. devs. are not significantly different at 95% CL.
If Fcalc ≥ Ftable, std. devs. are significantly different at 95% CL.
Ftable(df=3,3;95% CL) = 9.28
Since Fcalc (2.70) < Ftable (9.28), std. devs. of the two sets of data are
not significantly different at the 95% CL. (Precisions are
similar.)
70.2
)07.0(
)1.0(
2
3
2
2
2
2
2
1
===
s
s
Fcalc
Comparing replicate measurements or
comparing means of two sets of data--
revisited
The use of the t-test for comparing means was justified
for the previous example because we showed that
standard deviations of the two sets of data were not
significantly different.
If the F-test shows that std. devs. of two sets of data
are significantly different and you need to compare
the means, use a different version of the t-test 
Comparing replicate measurements or
comparing means from two sets of data
when std. devs. are significantly
different
2
1
)/(
1
)/(
)//(
//
2
2
2
2
2
1
2
1
2
1
2
2
2
21
2
1
2
2
21
2
1
21
−




















+
+
+
+
=
+
−
=
n
ns
n
ns
nsns
DF
nsns
xx
tcalc
Flowchart for comparing means of two
sets of data or replicate
measurements
Use F-test to see if std.
devs. of the 2 sets of
data are significantly
different or not
Std. devs. are
significantly different
Std. devs. are not
significantly different
Use the 2nd
version
of the t-test (the
beastly version)
Use the 1st
version of the
t-test (see previous, fully
worked-out example)
One last comment on the F-test
Note that the F-test can be used to simply test whether
or not two sets of data have statistically similar
precisions or not.
Can use to answer a question such as: Do method one
and method two provide similar precisions for the
analysis of the same analyte?
Evaluating questionable data
points using the Q-test
• Need a way to test questionable data points (outliers) in an
unbiased way.
• Q-test is a common method to do this.
• Requires 4 or more data points to apply.
Calculate Qcalc and compare to Qtable
Qcalc = gap/range
Gap = (difference between questionable data pt. and its
nearest neighbor)
Range = (largest data point – smallest data point)
Evaluating questionable data
points using the Q-test--example
Consider set of data; Cu values in sewage sample:
9.52, 10.7, 13.1, 9.71, 10.3, 9.99 mg/L
Arrange data in increasing or decreasing order:
9.52, 9.71, 9.99, 10.3, 10.7, 13.1
The questionable data point (outlier) is 13.1
Calculate
Compare Qcalc to Qtable for n observations and desired CL (90% or
95% is typical). It is desirable to keep 2-3 decimal places in Qcalc
so judgment from table can be made.
Qtable (n=6,90% CL) = 0.56
670.0
)52.91.13(
)7.101.13(
=
−
−
==
range
gap
Qcalc
From G.D. Christian (1994) Analytical Chemistry, 5th
Ed.
Evaluating questionable data points
using the Q-test--example
If Qcalc < Qtable, do not reject questionable data point at stated CL.
If Qcalc ≥ Qtable, reject questionable data point at stated CL.
From previous example,
Qcalc (0.670) > Qtable (0.56), so reject data point at 90% CL.
Subsequent calculations (e.g., mean and standard deviation)
should then exclude the rejected point.
Mean and std. dev. of remaining data: 10.04 ± 0.47 mg/L

Mais conteúdo relacionado

Mais procurados

Dalut ppt. of factorial analysis of variance-b
Dalut ppt. of factorial analysis of variance-bDalut ppt. of factorial analysis of variance-b
Dalut ppt. of factorial analysis of variance-bIan Kris Lastimosa
 
Statistics and probability
Statistics and probabilityStatistics and probability
Statistics and probabilityShahwarKhan16
 
Measure of dispersion
Measure of dispersionMeasure of dispersion
Measure of dispersionAnil Pokhrel
 
Solution manual for design and analysis of experiments 9th edition douglas ...
Solution manual for design and analysis of experiments 9th edition   douglas ...Solution manual for design and analysis of experiments 9th edition   douglas ...
Solution manual for design and analysis of experiments 9th edition douglas ...Salehkhanovic
 
Applied statistics and probability for engineers solution montgomery && runger
Applied statistics and probability for engineers solution   montgomery && rungerApplied statistics and probability for engineers solution   montgomery && runger
Applied statistics and probability for engineers solution montgomery && rungerAnkit Katiyar
 
Statistics-Measures of dispersions
Statistics-Measures of dispersionsStatistics-Measures of dispersions
Statistics-Measures of dispersionsCapricorn
 
Penggambaran Data Secara Numerik
Penggambaran Data Secara NumerikPenggambaran Data Secara Numerik
Penggambaran Data Secara Numerikanom1392
 
Measure of dispersion by Neeraj Bhandari ( Surkhet.Nepal )
Measure of dispersion by Neeraj Bhandari ( Surkhet.Nepal )Measure of dispersion by Neeraj Bhandari ( Surkhet.Nepal )
Measure of dispersion by Neeraj Bhandari ( Surkhet.Nepal )Neeraj Bhandari
 
Measures of Dispersion: Standard Deviation and Co- efficient of Variation
Measures of Dispersion: Standard Deviation and Co- efficient of Variation Measures of Dispersion: Standard Deviation and Co- efficient of Variation
Measures of Dispersion: Standard Deviation and Co- efficient of Variation RekhaChoudhary24
 
Reweighting and Boosting to uniforimty in HEP
Reweighting and Boosting to uniforimty in HEPReweighting and Boosting to uniforimty in HEP
Reweighting and Boosting to uniforimty in HEParogozhnikov
 
Randomization Tests
Randomization Tests Randomization Tests
Randomization Tests Ajay Dhamija
 
Applied Statistics and Probability for Engineers 6th Edition Montgomery Solut...
Applied Statistics and Probability for Engineers 6th Edition Montgomery Solut...Applied Statistics and Probability for Engineers 6th Edition Montgomery Solut...
Applied Statistics and Probability for Engineers 6th Edition Montgomery Solut...qyjewyvu
 
The commonly used measures of absolute dispersion are: 1. Range 2. Quartile D...
The commonly used measures of absolute dispersion are: 1. Range 2. Quartile D...The commonly used measures of absolute dispersion are: 1. Range 2. Quartile D...
The commonly used measures of absolute dispersion are: 1. Range 2. Quartile D...Muhammad Amir Sohail
 
MLHEP 2015: Introductory Lecture #4
MLHEP 2015: Introductory Lecture #4MLHEP 2015: Introductory Lecture #4
MLHEP 2015: Introductory Lecture #4arogozhnikov
 

Mais procurados (19)

3.2 Measures of variation
3.2 Measures of variation3.2 Measures of variation
3.2 Measures of variation
 
Dalut ppt. of factorial analysis of variance-b
Dalut ppt. of factorial analysis of variance-bDalut ppt. of factorial analysis of variance-b
Dalut ppt. of factorial analysis of variance-b
 
Statistics and probability
Statistics and probabilityStatistics and probability
Statistics and probability
 
Measure of dispersion
Measure of dispersionMeasure of dispersion
Measure of dispersion
 
Solution manual for design and analysis of experiments 9th edition douglas ...
Solution manual for design and analysis of experiments 9th edition   douglas ...Solution manual for design and analysis of experiments 9th edition   douglas ...
Solution manual for design and analysis of experiments 9th edition douglas ...
 
Sec 3.1 measures of center
Sec 3.1 measures of center  Sec 3.1 measures of center
Sec 3.1 measures of center
 
Applied statistics and probability for engineers solution montgomery && runger
Applied statistics and probability for engineers solution   montgomery && rungerApplied statistics and probability for engineers solution   montgomery && runger
Applied statistics and probability for engineers solution montgomery && runger
 
Statistics-Measures of dispersions
Statistics-Measures of dispersionsStatistics-Measures of dispersions
Statistics-Measures of dispersions
 
Penggambaran Data Secara Numerik
Penggambaran Data Secara NumerikPenggambaran Data Secara Numerik
Penggambaran Data Secara Numerik
 
Measure of dispersion by Neeraj Bhandari ( Surkhet.Nepal )
Measure of dispersion by Neeraj Bhandari ( Surkhet.Nepal )Measure of dispersion by Neeraj Bhandari ( Surkhet.Nepal )
Measure of dispersion by Neeraj Bhandari ( Surkhet.Nepal )
 
3.1 Measures of center
3.1 Measures of center3.1 Measures of center
3.1 Measures of center
 
Measures of Variation
Measures of VariationMeasures of Variation
Measures of Variation
 
Measures of Dispersion: Standard Deviation and Co- efficient of Variation
Measures of Dispersion: Standard Deviation and Co- efficient of Variation Measures of Dispersion: Standard Deviation and Co- efficient of Variation
Measures of Dispersion: Standard Deviation and Co- efficient of Variation
 
Reweighting and Boosting to uniforimty in HEP
Reweighting and Boosting to uniforimty in HEPReweighting and Boosting to uniforimty in HEP
Reweighting and Boosting to uniforimty in HEP
 
Randomization Tests
Randomization Tests Randomization Tests
Randomization Tests
 
Applied Statistics and Probability for Engineers 6th Edition Montgomery Solut...
Applied Statistics and Probability for Engineers 6th Edition Montgomery Solut...Applied Statistics and Probability for Engineers 6th Edition Montgomery Solut...
Applied Statistics and Probability for Engineers 6th Edition Montgomery Solut...
 
The commonly used measures of absolute dispersion are: 1. Range 2. Quartile D...
The commonly used measures of absolute dispersion are: 1. Range 2. Quartile D...The commonly used measures of absolute dispersion are: 1. Range 2. Quartile D...
The commonly used measures of absolute dispersion are: 1. Range 2. Quartile D...
 
Measure of Dispersion in statistics
Measure of Dispersion in statisticsMeasure of Dispersion in statistics
Measure of Dispersion in statistics
 
MLHEP 2015: Introductory Lecture #4
MLHEP 2015: Introductory Lecture #4MLHEP 2015: Introductory Lecture #4
MLHEP 2015: Introductory Lecture #4
 

Semelhante a Statistics

Estimating a Population Standard Deviation or Variance
Estimating a Population Standard Deviation or VarianceEstimating a Population Standard Deviation or Variance
Estimating a Population Standard Deviation or VarianceLong Beach City College
 
Estimating a Population Standard Deviation or Variance
Estimating a Population Standard Deviation or Variance Estimating a Population Standard Deviation or Variance
Estimating a Population Standard Deviation or Variance Long Beach City College
 
Measures of Dispersion
Measures of DispersionMeasures of Dispersion
Measures of DispersionKainatIqbal7
 
3.3 Measures of relative standing and boxplots
3.3 Measures of relative standing and boxplots3.3 Measures of relative standing and boxplots
3.3 Measures of relative standing and boxplotsLong Beach City College
 
GC-S005-DataAnalysis
GC-S005-DataAnalysisGC-S005-DataAnalysis
GC-S005-DataAnalysishenry kang
 
Estimating a Population Standard Deviation or Variance
Estimating a Population Standard Deviation or VarianceEstimating a Population Standard Deviation or Variance
Estimating a Population Standard Deviation or VarianceLong Beach City College
 
Role of estimation in statistics
Role of estimation in statisticsRole of estimation in statistics
Role of estimation in statisticsNadeem Uddin
 
Chapter 7 : Inference for Distributions(The t Distributions, One-Sample t Con...
Chapter 7 : Inference for Distributions(The t Distributions, One-Sample t Con...Chapter 7 : Inference for Distributions(The t Distributions, One-Sample t Con...
Chapter 7 : Inference for Distributions(The t Distributions, One-Sample t Con...nszakir
 
Dr.Dinesh-BIOSTAT-Tests-of-significance-1-min.pdf
Dr.Dinesh-BIOSTAT-Tests-of-significance-1-min.pdfDr.Dinesh-BIOSTAT-Tests-of-significance-1-min.pdf
Dr.Dinesh-BIOSTAT-Tests-of-significance-1-min.pdfHassanMohyUdDin2
 
Chi-square, Yates, Fisher & McNemar
Chi-square, Yates, Fisher & McNemarChi-square, Yates, Fisher & McNemar
Chi-square, Yates, Fisher & McNemarAzmi Mohd Tamil
 
METHOD OF DISPERSION to upload.pptx
METHOD OF DISPERSION to upload.pptxMETHOD OF DISPERSION to upload.pptx
METHOD OF DISPERSION to upload.pptxSreeLatha98
 
Lecture 11 Paired t test.pptx
Lecture 11 Paired t test.pptxLecture 11 Paired t test.pptx
Lecture 11 Paired t test.pptxshakirRahman10
 
Statistical process control (spc)
Statistical process control (spc)Statistical process control (spc)
Statistical process control (spc)Dinah Faye Indino
 

Semelhante a Statistics (20)

Estimating a Population Standard Deviation or Variance
Estimating a Population Standard Deviation or VarianceEstimating a Population Standard Deviation or Variance
Estimating a Population Standard Deviation or Variance
 
Estimating a Population Standard Deviation or Variance
Estimating a Population Standard Deviation or Variance Estimating a Population Standard Deviation or Variance
Estimating a Population Standard Deviation or Variance
 
Measures of Dispersion
Measures of DispersionMeasures of Dispersion
Measures of Dispersion
 
9618821.ppt
9618821.ppt9618821.ppt
9618821.ppt
 
9618821.pdf
9618821.pdf9618821.pdf
9618821.pdf
 
Makalah ukuran penyebaran
Makalah ukuran penyebaranMakalah ukuran penyebaran
Makalah ukuran penyebaran
 
3.3 Measures of relative standing and boxplots
3.3 Measures of relative standing and boxplots3.3 Measures of relative standing and boxplots
3.3 Measures of relative standing and boxplots
 
Two Means, Independent Samples
Two Means, Independent SamplesTwo Means, Independent Samples
Two Means, Independent Samples
 
GC-S005-DataAnalysis
GC-S005-DataAnalysisGC-S005-DataAnalysis
GC-S005-DataAnalysis
 
Measures of Variation
Measures of Variation Measures of Variation
Measures of Variation
 
Estimating a Population Standard Deviation or Variance
Estimating a Population Standard Deviation or VarianceEstimating a Population Standard Deviation or Variance
Estimating a Population Standard Deviation or Variance
 
Statistical test
Statistical testStatistical test
Statistical test
 
Role of estimation in statistics
Role of estimation in statisticsRole of estimation in statistics
Role of estimation in statistics
 
Chapter 7 : Inference for Distributions(The t Distributions, One-Sample t Con...
Chapter 7 : Inference for Distributions(The t Distributions, One-Sample t Con...Chapter 7 : Inference for Distributions(The t Distributions, One-Sample t Con...
Chapter 7 : Inference for Distributions(The t Distributions, One-Sample t Con...
 
Dr.Dinesh-BIOSTAT-Tests-of-significance-1-min.pdf
Dr.Dinesh-BIOSTAT-Tests-of-significance-1-min.pdfDr.Dinesh-BIOSTAT-Tests-of-significance-1-min.pdf
Dr.Dinesh-BIOSTAT-Tests-of-significance-1-min.pdf
 
Chi-square, Yates, Fisher & McNemar
Chi-square, Yates, Fisher & McNemarChi-square, Yates, Fisher & McNemar
Chi-square, Yates, Fisher & McNemar
 
METHOD OF DISPERSION to upload.pptx
METHOD OF DISPERSION to upload.pptxMETHOD OF DISPERSION to upload.pptx
METHOD OF DISPERSION to upload.pptx
 
Measures of Variation.pdf
Measures of Variation.pdfMeasures of Variation.pdf
Measures of Variation.pdf
 
Lecture 11 Paired t test.pptx
Lecture 11 Paired t test.pptxLecture 11 Paired t test.pptx
Lecture 11 Paired t test.pptx
 
Statistical process control (spc)
Statistical process control (spc)Statistical process control (spc)
Statistical process control (spc)
 

Último

Arihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfArihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfchloefrazer622
 
URLs and Routing in the Odoo 17 Website App
URLs and Routing in the Odoo 17 Website AppURLs and Routing in the Odoo 17 Website App
URLs and Routing in the Odoo 17 Website AppCeline George
 
Web & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfWeb & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfJayanti Pande
 
CARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxCARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxGaneshChakor2
 
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Sapana Sha
 
Mastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory InspectionMastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory InspectionSafetyChain Software
 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxiammrhaywood
 
Grant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingGrant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingTechSoup
 
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptxContemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptxRoyAbrique
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdfQucHHunhnh
 
Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)eniolaolutunde
 
Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Celine George
 
Introduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher EducationIntroduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher Educationpboyjonauth
 
Hybridoma Technology ( Production , Purification , and Application )
Hybridoma Technology  ( Production , Purification , and Application  ) Hybridoma Technology  ( Production , Purification , and Application  )
Hybridoma Technology ( Production , Purification , and Application ) Sakshi Ghasle
 
Introduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxIntroduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxpboyjonauth
 
Separation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and ActinidesSeparation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and ActinidesFatimaKhan178732
 
Paris 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityParis 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityGeoBlogs
 
Employee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptxEmployee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptxNirmalaLoungPoorunde1
 
Student login on Anyboli platform.helpin
Student login on Anyboli platform.helpinStudent login on Anyboli platform.helpin
Student login on Anyboli platform.helpinRaunakKeshri1
 

Último (20)

Arihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfArihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdf
 
Mattingly "AI & Prompt Design: The Basics of Prompt Design"
Mattingly "AI & Prompt Design: The Basics of Prompt Design"Mattingly "AI & Prompt Design: The Basics of Prompt Design"
Mattingly "AI & Prompt Design: The Basics of Prompt Design"
 
URLs and Routing in the Odoo 17 Website App
URLs and Routing in the Odoo 17 Website AppURLs and Routing in the Odoo 17 Website App
URLs and Routing in the Odoo 17 Website App
 
Web & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfWeb & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdf
 
CARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxCARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptx
 
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
 
Mastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory InspectionMastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory Inspection
 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
 
Grant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingGrant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy Consulting
 
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptxContemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdf
 
Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)
 
Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17
 
Introduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher EducationIntroduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher Education
 
Hybridoma Technology ( Production , Purification , and Application )
Hybridoma Technology  ( Production , Purification , and Application  ) Hybridoma Technology  ( Production , Purification , and Application  )
Hybridoma Technology ( Production , Purification , and Application )
 
Introduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxIntroduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptx
 
Separation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and ActinidesSeparation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and Actinides
 
Paris 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityParis 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activity
 
Employee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptxEmployee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptx
 
Student login on Anyboli platform.helpin
Student login on Anyboli platform.helpinStudent login on Anyboli platform.helpin
Student login on Anyboli platform.helpin
 

Statistics

  • 1. Statistics for Quantitative Analysis • Statistics: Set of mathematical tools used to describe and make judgments about data • Type of statistics we will talk about in this class has important assumption associated with it: Experimental variation in the population from which samples are drawn has a normal (Gaussian, bell-shaped) distribution. - Parametric vs. non-parametric statistics CHM 235 – Dr. Skrabal
  • 2. Normal distribution • Infinite members of group: population • Characterize population by taking samples • The larger the number of samples, the closer the distribution becomes to normal • Equation of normal distribution: 22 2/)( 2 1 σµ πσ −− = x ey
  • 3. Normal distribution • Estimate of mean value of population = µ • Estimate of mean value of samples = Mean = x n x x i i∑ =
  • 4. Normal distribution • Degree of scatter (measure of central tendency) of population is quantified by calculating the standard deviation • Std. dev. of population = σ • Std. dev. of sample = s • Characterize sample by calculating 1 )( 2 − −∑ = n xx s i i sx ±
  • 5. Standard deviation and the normal distribution • Standard deviation defines the shape of the normal distribution (particularly width) • Larger std. dev. means more scatter about the mean, worse precision. • Smaller std. dev. means less scatter about the mean, better precision.
  • 6. Standard deviation and the normal distribution • There is a well-defined relationship between the std. dev. of a population and the normal distribution of the population: ∀ µ ± 1σ encompasses 68.3 % of measurements ∀ µ ± 2σ encompasses 95.5% of measurements ∀ µ ± 3σ encompasses 99.7% of measurements • (May also consider these percentages of area under the curve)
  • 7. Example of mean and standard deviation calculation Consider Cu data: 5.23, 5.79, 6.21, 5.88, 6.02 nM = 5.826nM  5.82 nM s = 0.368 nM  0.36 nM Answer: 5.82 ± 0.36 nM or 5.8 ± 0.4 nM Learn how to use the statistical functions on your calculator. Do this example by longhand calculation once, and also by calculator to verify that you’ll get exactly the same answer. Then use your calculator for all future calculations. x
  • 8. Relative standard deviation (rsd) or coefficient of variation (CV) rsd or CV = From previous example, rsd = (0.36 nM/5.82 nM) 100 = 6.1% or 6% 100      x s
  • 9. Standard error • Tells us that standard deviation of set of samples should decrease if we take more measurements • Standard error = • Take twice as many measurements, s decreases by • Take 4x as many measurements, s decreases by • There are several quantitative ways to determine the sample size required to achieve a desired precision for various statistical applications. Can consult statistics textbooks for further information; e.g. J.H. Zar, Biostatistical Analysis n s sx = 4.12 ≈ 24 =
  • 10. Variance Used in many other statistical calculations and tests Variance = s2 From previous example, s = 0.36 s2 = (0.36)2 = 0. 129 (not rounded because it is usually used in further calculations)
  • 11. Average deviation • Another way to express degree of scatter or uncertainty in data. Not as statistically meaningful as standard deviation, but useful for small samples. Using previous data: n xx d i i∑ − = )( 5 8.502.68.588.58.521.68.579.58.523.5 22222 −+−+−+−+− =d nMord 2.02.025.0 5→= nMornMAnswer 2.08.52.08.5: 52 ±±
  • 12. Relative average deviation (RAD) Using previous data, RAD = (0. 25/5.82) 100 = 4.2 or 4% RAD = (0. 25/5.82) 1000 = 42 ppt  4.2 x 101 or 4 x 101 ppt (0 /00) ),(1000 )(100 pptthousandperpartsas x d RAD percentageas x d RAD       =       =
  • 13. Some useful statistical tests • To characterize or make judgments about data • Tests that use the Student’s t distribution – Confidence intervals – Comparing a measured result with a “known” value – Comparing replicate measurements (comparison of means of two sets of data)
  • 14. From D.C. Harris (2003) Quantitative Chemical Analysis, 6th Ed.
  • 15. Confidence intervals • Quantifies how far the true mean (µ) lies from the measured mean, . Uses the mean and standard deviation of the sample. where t is from the t-table and n = number of measurements. Degrees of freedom (df) = n - 1 for the CI. n ts x ±=µ x
  • 16. Example of calculating a confidence interval Consider measurement of dissolved Ti in a standard seawater (NASS-3): Data: 1.34, 1.15, 1.28, 1.18, 1.33, 1.65, 1.48 nM DF = n – 1 = 7 – 1 = 6 = 1.34 nM or 1.3 nM s = 0.17 or 0.2 nM 95% confidence interval t(df=6,95%) = 2.447 CI95 = 1.3 ± 0.16 or 1.3 ± 0.2 nM 50% confidence interval t(df=6,50%) = 0.718 CI50 = 1.3 ± 0.05 nM x n ts x ±=µ
  • 17. Interpreting the confidence interval • For a 95% CI, there is a 95% probability that the true mean (µ) lies between the range 1.3 ± 0.2 nM, or between 1.1 and 1.5 nM • For a 50% CI, there is a 50% probability that the true mean lies between the range 1.3± 0.05 nM, or between 1.25 and 1.35 nM • Note that CI will decrease as n is increased • Useful for characterizing data that are regularly obtained; e.g., quality assurance, quality control
  • 18. Comparing a measured result with a “known” value • “Known” value would typically be a certified value from a standard reference material (SRM) • Another application of the t statistic Will compare tcalc to tabulated value of t at appropriate df and CL. df = n -1 for this test n s xvalueknown tcalc − =
  • 19. Comparing a measured result with a “known” value--example Dissolved Fe analysis verified using NASS-3 seawater SRM Certified value = 5.85 nM Experimental results: 5.76 ± 0.17 nM (n = 10) (Keep 3 decimal places for comparison to table.) Compare to ttable; df = 10 - 1 = 9, 95% CL ttable(df=9,95% CL) = 2.262 If |tcalc| < ttable, results are not significantly different at the 95% CL. If |tcalc| ≥ ttable, results are significantly different at the 95% CL. For this example, tcalc < ttest, so experimental results are not significantly different at the 95% CL 674.110 1.0 7.585.5 7 6 = − = − = n s xvalueknown tcalc
  • 20. Comparing replicate measurements or comparing means of two sets of data • Yet another application of the t statistic • Example: Given the same sample analyzed by two different methods, do the two methods give the “same” result? Will compare tcalc to tabulated value of t at appropriate df and CL. df = n1 + n2 – 2 for this test 21 2121 nn nn s xx t pooled calc + − = 2 )1()1( 21 2 2 21 2 1 −+ −+− = nn nsns spooled
  • 21. Comparing replicate measurements or comparing means of two sets of data—example Method 1: Atomic absorption spectroscopy Data: 3.91, 4.02, 3.86, 3.99 mg/g = 3.945 mg/g = 0.073 mg/g = 4 Method 2: Spectrophotometry Data: 3.52, 3.77, 3.49, 3.59 mg/g = 3.59 mg/g = 0.12 mg/g = 4 1x Determination of nickel in sewage sludge using two different methods 2x 1s 2s 1n 2n
  • 22. Comparing replicate measurements or comparing means of two sets of data— example 0993.0 244 )14()1.0()14()07.0( 2 )1()1( 2 2 2 3 21 2 2 21 2 1 = −+ −+− = −+ −+− = nn nsns spooled 056.5 44 )4)(4( 0993.0 5.394.3 95 21 2121 = + − = + − = nn nn s xx t pooled calc Note: Keep 3 decimal places to compare to ttable. Compare to ttable at df = 4 + 4 – 2 = 6 and 95% CL. ttable(df=6,95% CL) = 2.447 If |tcalc| < ttable, results are not significantly different at the 95%. CL. If |tcalc| ≥ ttable, results are significantly different at the 95% CL. Since |tcalc| (5.056) ≥ ttable (2.447), results from the two methods are significantly different at the 95% CL.
  • 23. Comparing replicate measurements or comparing means of two sets of data Wait a minute! There is an important assumption associated with this t-test: It is assumed that the standard deviations (i.e., the precision) of the two sets of data being compared are not significantly different. • How do you test to see if the two std. devs. are different? • How do you compare two sets of data whose std. devs. are significantly different?
  • 24. F-test to compare standard deviations • Used to determine if std. devs. are significantly different before application of t-test to compare replicate measurements or compare means of two sets of data • Also used as a simple general test to compare the precision (as measured by the std. devs.) of two sets of data • Uses F distribution
  • 25. F-test to compare standard deviations Will compute Fcalc and compare to Ftable. DF = n1 - 1 and n2 - 1 for this test. Choose confidence level (95% is a typical CL). 212 2 2 1 sswhere s s Fcalc >=
  • 26. From D.C. Harris (2003) Quantitative Chemical Analysis, 6th Ed.
  • 27. F-test to compare standard deviations From previous example: Let s1 = 0.12 and s2 = 0.073 Note: Keep 2 or 3 decimal places to compare with Ftable. Compare Fcalc to Ftable at df = (n1 -1, n2 -1) = 3,3 and 95% CL. If Fcalc < Ftable, std. devs. are not significantly different at 95% CL. If Fcalc ≥ Ftable, std. devs. are significantly different at 95% CL. Ftable(df=3,3;95% CL) = 9.28 Since Fcalc (2.70) < Ftable (9.28), std. devs. of the two sets of data are not significantly different at the 95% CL. (Precisions are similar.) 70.2 )07.0( )1.0( 2 3 2 2 2 2 2 1 === s s Fcalc
  • 28. Comparing replicate measurements or comparing means of two sets of data-- revisited The use of the t-test for comparing means was justified for the previous example because we showed that standard deviations of the two sets of data were not significantly different. If the F-test shows that std. devs. of two sets of data are significantly different and you need to compare the means, use a different version of the t-test 
  • 29. Comparing replicate measurements or comparing means from two sets of data when std. devs. are significantly different 2 1 )/( 1 )/( )//( // 2 2 2 2 2 1 2 1 2 1 2 2 2 21 2 1 2 2 21 2 1 21 −                     + + + + = + − = n ns n ns nsns DF nsns xx tcalc
  • 30. Flowchart for comparing means of two sets of data or replicate measurements Use F-test to see if std. devs. of the 2 sets of data are significantly different or not Std. devs. are significantly different Std. devs. are not significantly different Use the 2nd version of the t-test (the beastly version) Use the 1st version of the t-test (see previous, fully worked-out example)
  • 31. One last comment on the F-test Note that the F-test can be used to simply test whether or not two sets of data have statistically similar precisions or not. Can use to answer a question such as: Do method one and method two provide similar precisions for the analysis of the same analyte?
  • 32. Evaluating questionable data points using the Q-test • Need a way to test questionable data points (outliers) in an unbiased way. • Q-test is a common method to do this. • Requires 4 or more data points to apply. Calculate Qcalc and compare to Qtable Qcalc = gap/range Gap = (difference between questionable data pt. and its nearest neighbor) Range = (largest data point – smallest data point)
  • 33. Evaluating questionable data points using the Q-test--example Consider set of data; Cu values in sewage sample: 9.52, 10.7, 13.1, 9.71, 10.3, 9.99 mg/L Arrange data in increasing or decreasing order: 9.52, 9.71, 9.99, 10.3, 10.7, 13.1 The questionable data point (outlier) is 13.1 Calculate Compare Qcalc to Qtable for n observations and desired CL (90% or 95% is typical). It is desirable to keep 2-3 decimal places in Qcalc so judgment from table can be made. Qtable (n=6,90% CL) = 0.56 670.0 )52.91.13( )7.101.13( = − − == range gap Qcalc
  • 34. From G.D. Christian (1994) Analytical Chemistry, 5th Ed.
  • 35. Evaluating questionable data points using the Q-test--example If Qcalc < Qtable, do not reject questionable data point at stated CL. If Qcalc ≥ Qtable, reject questionable data point at stated CL. From previous example, Qcalc (0.670) > Qtable (0.56), so reject data point at 90% CL. Subsequent calculations (e.g., mean and standard deviation) should then exclude the rejected point. Mean and std. dev. of remaining data: 10.04 ± 0.47 mg/L