Advanced Engineering Statistics-IE 5318- Fall 2008-Project1



Report:

Project Proposal:

The main objective of this project is to develop a model relating the concentration of sodium sulfite (Na2SO3) to the concentration of iron (Fe) present in waste water. We collected one waste water sample per day for 23 days, so the number of observations is n = 23. We performed lab analysis (ICP) to determine Fe and sodium sulfite from plant production data.

Fe (ppm)       Na2SO3 (%w)

   1.8            17.2
   1.0            16.2
   0.8            20.7
   1.1            23.3
   1.6            18.4
   0.8            20.9
   1.1            22.8
   1.6            17.7
   2.6            21.2
   1.2            20.8
   1.4            22.2
   1.4            12.7
   1.4            18.1
   2.0            16.3
   2.8            16.3
   4.1            19.2
   3.6            20.1
   2.0            21.7
   1.9            22.3
   6.0            19.3
   8.9            18.7
  11.2            14.9
   0.9            15.1

Calculations of yhat and Residuals (e)
Simple Linear Regression:

In simple linear regression we take Na2SO3 on the Y-axis as the response (dependent variable) and Fe on the X-axis as the predictor (independent variable).
Using SAS 9.1 we generated the scatter plot and the data set of fitted values and residuals.




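Calculation of MSE, b1, b0:

$\sum x_i = 61.2$, $\sum x_i^2 = 317.82$, $S_x = 2.6541$

$\sum y_i = 436.1$, $\sum y_i^2 = 8446.39$, $S_y = 2.8409$

$\bar{X} = 2.661$, $\bar{Y} = 18.961$

$b_1 = -0.2544$

$b_0 = \bar{Y} - b_1 \bar{X} = 19.63779$

$MSE = \hat{\sigma}^2 = \dfrac{\sum_{i=1}^{n} y_i^2 - b_0 \sum_{i=1}^{n} y_i - b_1 \sum_{i=1}^{n} x_i y_i}{n-2} = 7.97739$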
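
As a check on the hand calculation, the sums, the least-squares estimates and the MSE can be recomputed from the 23 observations in the data table. This is a minimal sketch in plain Python, not the SAS program used for the report; the variable names are illustrative, and the cross-product sum is computed here because it is not listed above.

```python
# Fe (ppm) and Na2SO3 (%w) for the 23 daily samples, in table order.
fe = [1.8, 1.0, 0.8, 1.1, 1.6, 0.8, 1.1, 1.6, 2.6, 1.2, 1.4, 1.4,
      1.4, 2.0, 2.8, 4.1, 3.6, 2.0, 1.9, 6.0, 8.9, 11.2, 0.9]
na2so3 = [17.2, 16.2, 20.7, 23.3, 18.4, 20.9, 22.8, 17.7, 21.2, 20.8, 22.2, 12.7,
          18.1, 16.3, 16.3, 19.2, 20.1, 21.7, 22.3, 19.3, 18.7, 14.9, 15.1]

n = len(fe)
sum_x, sum_y = sum(fe), sum(na2so3)
sum_x2 = sum(x * x for x in fe)
sum_y2 = sum(y * y for y in na2so3)
sum_xy = sum(x * y for x, y in zip(fe, na2so3))

x_bar, y_bar = sum_x / n, sum_y / n
sxx = sum_x2 - sum_x ** 2 / n      # corrected sum of squares of x
sxy = sum_xy - sum_x * sum_y / n   # corrected sum of cross products

b1 = sxy / sxx                     # least-squares slope
b0 = y_bar - b1 * x_bar            # least-squares intercept
mse = (sum_y2 - b0 * sum_y - b1 * sum_xy) / (n - 2)

print(f"b0 = {b0:.5f}, b1 = {b1:.4f}, MSE = {mse:.5f}")
```

The MSE line uses the same computational formula as above, so it can be compared term by term with the hand calculation.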
The Model of SLR:

The simple linear regression model is:

    Yi = β0 + β1Xi + εi

where

• β0 is the y-intercept
• β1 is the slope
• The random error term εi (equation error + measurement error) has mean E{εi} = 0
• The error variance is constant: V{εi} = σ2
• The errors are uncorrelated: Cov(εi, εj) = 0 for all i, j, i ≠ j
• i = 1, …, n


Regression Line Fit:

The regression line fit can be explained as follows:

We describe the linear relationship between Na2SO3 (response) and Fe (predictor) by estimating the unknown parameters β0 and β1 with the values b0 and b1 respectively.

From our class notes the estimated regression function is expressed as:

$\hat{Y}_i = b_0 + b_1 X_i$

Substituting the values of b0 and b1 obtained from SAS into the above equation, we get

$\hat{Y}_i = 19.63779 - 0.2544 X_i$

In our case the fitted line has a negative slope: the estimated mean of Na2SO3 decreases by 0.2544 %w for each 1 ppm increase in Fe.
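
The fitted values and residuals follow directly from this equation. The sketch below assumes the rounded estimates b0 = 19.63779 and b1 = -0.2544 and the data in table order; the list names are illustrative and are not taken from the SAS data set.

```python
b0, b1 = 19.63779, -0.2544  # estimates reported in the fitted equation

fe = [1.8, 1.0, 0.8, 1.1, 1.6, 0.8, 1.1, 1.6, 2.6, 1.2, 1.4, 1.4,
      1.4, 2.0, 2.8, 4.1, 3.6, 2.0, 1.9, 6.0, 8.9, 11.2, 0.9]
na2so3 = [17.2, 16.2, 20.7, 23.3, 18.4, 20.9, 22.8, 17.7, 21.2, 20.8, 22.2, 12.7,
          18.1, 16.3, 16.3, 19.2, 20.1, 21.7, 22.3, 19.3, 18.7, 14.9, 15.1]

# yhat_i = b0 + b1 * x_i and e_i = y_i - yhat_i for every observation.
fitted = [b0 + b1 * x for x in fe]
residuals = [y - y_hat for y, y_hat in zip(na2so3, fitted)]

# Least-squares residuals sum to (approximately) zero; their squares sum to SSE.
sse = sum(e * e for e in residuals)

for x, y, y_hat, e in list(zip(fe, na2so3, fitted, residuals))[:3]:
    print(f"x = {x:4.1f}  y = {y:5.1f}  yhat = {y_hat:8.4f}  e = {e:8.4f}")
print(f"sum of residuals = {sum(residuals):.4f}, SSE = {sse:.4f}")
```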


Inferences on our Parameters:

For a 95% confidence interval for β1 in our case, the formula is

$C.I = b_1 \pm t\left(1 - \dfrac{\alpha}{2};\, n-2\right) s\{b_1\}$



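From our SAS output we get

To find $\hat{\sigma}^2$:

MSE = (RMSE)^2

     = (2.82443)^2

     = 7.97739

$\sum_{i=1}^{n} (x_i - \bar{X})^2 = \sum_{i=1}^{n} x_i^2 - \dfrac{\left(\sum_{i=1}^{n} x_i\right)^2}{n} = S_x^2 (n-1) = 317.82 - \dfrac{(61.2)^2}{23} = 154.9748$

$s\{b_1\} = \sqrt{\dfrac{\hat{\sigma}^2}{\sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2 / n}} = \sqrt{\dfrac{7.97739}{154.9748}} = 0.22688$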




At α = 0.05

⇒ C.I = -0.2544 ± t(0.975; 21) * 0.22688 ⇒ C.I = -0.2544 ± (2.080) * 0.22688

⇒ C.I = (-0.7263, 0.2175)

Therefore we are 95% confident that the mean percentage of Na2SO3 changes by between -0.7263 and 0.2175 %w for each 1 ppm increase in Fe. Because this interval contains 0, the data are consistent with no linear effect of Fe.
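
The interval arithmetic can be sketched in a few lines. The sums, MSE and b1 are the values reported in this report, and t(0.975; 21) = 2.080 is the table value quoted above; the variable names are illustrative.

```python
import math

n, mse = 23, 7.97739
sum_x, sum_x2 = 61.2, 317.82
b1 = -0.2544
t_crit = 2.080  # t(0.975; 21) from the t table

sxx = sum_x2 - sum_x ** 2 / n   # sum of squared deviations of x
s_b1 = math.sqrt(mse / sxx)     # standard error of the slope

lower = b1 - t_crit * s_b1
upper = b1 + t_crit * s_b1
print(f"s{{b1}} = {s_b1:.5f}, 95% CI for beta1 = ({lower:.4f}, {upper:.4f})")
```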

Model Fit: Hypothesis test for slope:

The t-test helps us check whether there is a linear relationship between Fe and Na2SO3. The t* value is obtained from the SAS output and compared with the cut-off value obtained from the t-distribution table.

T-test for β1

Test: H0: β1 = 0                            α = 0.05

          H1: β1 ≠ 0

Our decision rule is as follows:

If $|t^*| > t\left(1 - \dfrac{\alpha}{2};\, n-2\right)$ → reject H0; else, fail to reject H0

$t^* = \dfrac{b_1}{s\{b_1\}} = \dfrac{-0.2544}{0.22688} = -1.1213$

$t\left(1 - \dfrac{\alpha}{2};\, n-2\right) = t(0.975; 21) = 2.080$

Since |t*| = 1.1213 < 2.080, as per our decision rule we fail to reject H0.

At the 5% significance level the data therefore do not show a significant linear relationship between Na2SO3 and Fe. This agrees with the p-value of 0.2784 in the ANOVA table.
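
The decision rule can be written out as a short check. This sketch assumes the reported values b1 = -0.2544 and s{b1} = 0.22688 and the table value t(0.975; 21) = 2.080; the names are illustrative.

```python
b1 = -0.2544
s_b1 = 0.22688
t_crit = 2.080  # t(0.975; 21) from the t table

t_star = b1 / s_b1                  # test statistic for H0: beta1 = 0
reject_h0 = abs(t_star) > t_crit    # two-sided decision rule

print(f"t* = {t_star:.4f}, reject H0: {reject_h0}")
```

In simple linear regression t* squared equals the ANOVA F* statistic, which gives a quick cross-check between the two tests.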

Confidence Interval for Y-intercept:

⇒ 100(1-α)% CI for β0.

Two sided: b0 ± t(1-α/2; n-2) s{b0}, where

$s\{b_0\} = \sqrt{MSE \left( \dfrac{1}{n} + \dfrac{\bar{X}^2}{\sum_{i=1}^{n} (x_i - \bar{X})^2} \right)} = 0.84339$

Applying the two-sided interval: 19.63779 ± t(0.975; 21) * 0.84339 = (17.8835, 21.3920)

Therefore we are 95% confident that the mean percentage of Na2SO3 when Fe = 0 ppm lies between 17.8835 and 21.3920 %w. Since no sample had Fe below 0.8 ppm, this is an extrapolation beyond the observed range.
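
A minimal sketch of the same calculation, assuming the sums and MSE reported earlier and the table value t(0.975; 21) = 2.080 (names are illustrative):

```python
import math

n, mse = 23, 7.97739
sum_x, sum_x2 = 61.2, 317.82
b0 = 19.63779
t_crit = 2.080  # t(0.975; 21) from the t table

x_bar = sum_x / n
sxx = sum_x2 - sum_x ** 2 / n
s_b0 = math.sqrt(mse * (1 / n + x_bar ** 2 / sxx))  # standard error of the intercept

lower = b0 - t_crit * s_b0
upper = b0 + t_crit * s_b0
print(f"s{{b0}} = {s_b0:.5f}, 95% CI for beta0 = ({lower:.4f}, {upper:.4f})")
```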

Analysis of Variance:

From our SAS OUTPUT,

Regression Sum of Squares (SSR) = 10.02968

Error Sum of Squares (SSE) = 167.5251

Total Sum of Squares (SSTO) =177.55478

Regression Mean square (MSR) = 10.02968

Error Mean Square (MSE) = 7.97739

Coefficient of Determination (R2) = 0.0565

Coefficient of Determination:

R2 = SSR/SSTO = 1-(SSE/SSTO) = 0.0565

0 ≤ R2 ≤ 1. It measures the proportion of the total variation in Na2SO3 that is explained by the regression on Fe, which here is only about 5.65%.

Coefficient of Correlation

$r = \pm\sqrt{R^2}$

     Since the slope (b1) in our model is negative, we take the negative root.

$r = -\sqrt{R^2} = -\sqrt{0.0565}$

r = -0.2377
⇒ There is only a weak negative linear relation between the Na2SO3 and the Fe.
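
Taking the sign of r from the slope can be sketched as follows, assuming the reported R2 = 0.0565 and b1 = -0.2544:

```python
import math

r_squared = 0.0565
b1 = -0.2544

# r has the magnitude sqrt(R^2) and the same sign as the fitted slope.
r = math.copysign(math.sqrt(r_squared), b1)
print(f"r = {r:.4f}")
```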

ANOVA Table:



Source         DF     SS            MS = SS/DF    F*       p-value

Regression      1     10.02968      10.02968      1.26     0.2784
Error          21     167.5251       7.97739
Total          22     177.55478
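
The table entries are tied together by simple arithmetic. The sketch below starts from the SSR and SSE values in the SAS output and rebuilds the remaining columns; the p-value is left out because it needs the F distribution, which the Python standard library does not provide.

```python
n = 23
ssr, sse = 10.02968, 167.5251   # from the SAS output

ssto = ssr + sse                # total sum of squares
msr = ssr / 1                   # regression mean square, 1 degree of freedom
mse = sse / (n - 2)             # error mean square, n - 2 degrees of freedom
f_star = msr / mse              # F statistic for H0: beta1 = 0
r_squared = ssr / ssto          # coefficient of determination

print(f"SSTO = {ssto:.5f}, MSE = {mse:.5f}, F* = {f_star:.2f}, R2 = {r_squared:.4f}")
```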
Confidence Interval for Mean Response:

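1. 95% Confidence Interval for xh = 10

$\hat{Y}_h = b_0 + b_1 x_h = 19.63779 - 0.2544(10) = 17.0938$

$s\{\hat{Y}_h\} = \sqrt{MSE \left( \dfrac{1}{n} + \dfrac{(x_h - \bar{X})^2}{\sum_{i=1}^{n} (x_i - \bar{X})^2} \right)} = 1.76638$

$C.I = \hat{Y}_h \pm t\left(1 - \dfrac{\alpha}{2};\, n-2\right) s\{\hat{Y}_h\}$

= 17.0938 ± t(0.975; 21) * 1.76638 = 17.0938 ± 2.080 * 1.76638 ⇒ C.I for xh = 10 = (13.4197, 20.7679)

⇒ We are thus 95% confident that the mean of the probability distribution of Na2SO3 lies between 13.4197 and 20.7679 %w when xh = 10 ppm of Fe.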
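
A sketch of the mean-response interval at xh = 10, assuming the reported sums, MSE and estimates and the table value t(0.975; 21) = 2.080. The standard error is recomputed from the sums, so the last digits can differ slightly from the SAS value 1.76638.

```python
import math

n, mse = 23, 7.97739
sum_x, sum_x2 = 61.2, 317.82
b0, b1 = 19.63779, -0.2544
t_crit = 2.080  # t(0.975; 21) from the t table
x_h = 10.0      # Fe level of interest, ppm

x_bar = sum_x / n
sxx = sum_x2 - sum_x ** 2 / n

y_h = b0 + b1 * x_h                                        # estimated mean response
s_mean = math.sqrt(mse * (1 / n + (x_h - x_bar) ** 2 / sxx))

lower = y_h - t_crit * s_mean
upper = y_h + t_crit * s_mean
print(f"Yhat_h = {y_h:.4f}, s = {s_mean:.4f}, CI = ({lower:.4f}, {upper:.4f})")
```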

2. 95% Prediction Interval for xh = 10

With $\hat{Y}_h = b_0 + b_1 x_h = 17.0938$:

$s\{pred\} = \sqrt{MSE + s^2\{\hat{Y}_h\}} = 3.33128$

$P.I = \hat{Y}_h \pm t\left(1 - \dfrac{\alpha}{2};\, n-2\right) s\{pred\}$

= 17.0938 ± t(0.975; 21) * 3.33128 = 17.0938 ± 6.9291 ⇒ P.I for xh = 10 = (10.1647, 24.0229)

⇒ We predict with 95% confidence that the actual %w of Na2SO3 obtained in a new sample with xh = 10 ppm of Fe lies between 10.1647 and 24.0229.
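
The prediction interval adds the error variance of a single new observation to the variance of the estimated mean. A minimal sketch, assuming the reported MSE, the SAS value of the mean-response standard error (1.76638) and t(0.975; 21) = 2.080:

```python
import math

mse = 7.97739
s_mean = 1.76638            # s{Yhat_h} at xh = 10, from the SAS output
b0, b1 = 19.63779, -0.2544
t_crit = 2.080              # t(0.975; 21) from the t table
x_h = 10.0

y_h = b0 + b1 * x_h
s_pred = math.sqrt(mse + s_mean ** 2)   # standard error for a new observation

lower = y_h - t_crit * s_pred
upper = y_h + t_crit * s_pred
print(f"s{{pred}} = {s_pred:.5f}, PI = ({lower:.4f}, {upper:.4f})")
```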

3. 95% Confidence Band for xh = 10

$\hat{Y}_h \pm W \, s\{\hat{Y}_h\}$, with $\hat{Y}_h = 19.63779 - 0.2544(10) = 17.0938$

                  Where $W^2 = 2F(1-\alpha;\, 2,\, n-2)$

                              = 2 * F(0.95; 2, 21)

                              = 2 * 3.4668

                              = 6.9336

                  so W = 2.6332

                          C.B = 17.0938 ± 2.6332 * 1.76638

                              = 17.0938 ± 4.6512

                              = (12.4426, 21.7450)

                       ⇒ C.B for xh = 10 = (12.4426, 21.7450)

  ⇒ We have 95% confidence that the true regression line lies between the upper and lower limits of the band over the whole range of x. At xh = 10 ppm of Fe the band runs from 12.4426 to 21.7450 %w of Na2SO3.

From the above results we infer that the CB is wider than the CI:

$W = \sqrt{2F(1-\alpha;\, 2,\, n-2)} = 2.633 > t\left(1 - \dfrac{\alpha}{2};\, n-2\right) = 2.080$, which is always true.

The confidence band limits for several xh along the range of x:
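
A minimal sketch of how such limits can be tabulated. It assumes the reported sums, MSE and estimates; the critical value F(0.95; 2, 21) = 3.4668 is taken from a standard F table and is supplied here, and the chosen xh values are illustrative rather than the ones used in the report.

```python
import math

n, mse = 23, 7.97739
sum_x, sum_x2 = 61.2, 317.82
b0, b1 = 19.63779, -0.2544
f_crit = 3.4668  # F(0.95; 2, 21) from an F table (assumed value, not in the report)

x_bar = sum_x / n
sxx = sum_x2 - sum_x ** 2 / n
w = math.sqrt(2 * f_crit)   # Working-Hotelling multiplier


def band(x_h):
    """Return the (lower, upper) 95% confidence band limits at x_h."""
    y_h = b0 + b1 * x_h
    s_mean = math.sqrt(mse * (1 / n + (x_h - x_bar) ** 2 / sxx))
    return y_h - w * s_mean, y_h + w * s_mean


for x_h in (1.0, 2.5, 5.0, 7.5, 10.0):   # illustrative Fe levels, ppm
    lo, hi = band(x_h)
    print(f"xh = {x_h:4.1f}   band = ({lo:.4f}, {hi:.4f})")
```

The band is narrowest near the mean Fe level and widens towards the high-Fe samples, where few observations were taken.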
Using the data from this table we plot the confidence band with the fitted regression line
and data points:




Residual Analysis:

The residual analysis is to verify the following model assumptions

         •   A linear model is reasonable - Mandatory.
         •   The residuals have constant variance - Mandatory.
         •   The residuals are normally distributed - Optional.
         •   The residuals are uncorrelated - Mandatory.
         •   The model is free from outliers - Optional.



Plots:

Linearity Analysis:

The plot of residuals against the predicted values shows whether a linear regression function is appropriate for our data. From our plot we infer that the points are randomly scattered, so linearity is OK, and there is no funnel shape, which indicates that our model has constant variance with no outliers.

Time Series plot:

Time series data often arise when monitoring industrial processes or tracking corporate
business metrics. Time series analysis accounts for the fact that data points taken over time
may have an internal structure (such as autocorrelation, trend or seasonal variation) that
should be accounted for.

Referred from: http://www.itl.nist.gov/div898/handbook/pmc/section4/pmc4.htm



From our SAS output we interpret our time series plot as follows:
The time series plot has no significance in our analysis, since our data is not based on time.

Normality Analysis:

A normality check is done to see whether the residuals are normally distributed, which is one of the desired assumptions of the simple linear regression model.

Inference from the graph:

The graph looks fairly straight. Normality seems to be OK, with a slight S shape indicating shorter tails at both ends.

A normality test makes the graphical analysis more conclusive.

Normality test:

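H0 : Normality is OK

H1 : Normality is violated.

Our decision rule is:

If ρ̂ < c(α, n) → reject H0; else we fail to reject H0, where ρ̂ is the correlation between the ordered residuals and their expected values under normality.

Take α = 0.05; c(0.1, 23) = 0.964; ρ̂ = 0.97415 (from the SAS output, using the corresponding ENRM values)

⇒ Since ρ̂ > c(α, n), we fail to reject H0 ⇒ Normality is OK.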
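
The correlation used in this test can be sketched as follows. The expected value of the k-th smallest residual is approximated here by sqrt(MSE) * z((k - 0.375)/(n + 0.25)); that formula is an assumption about how the ENRM column was produced, and the function and variable names are illustrative.

```python
import math
from statistics import NormalDist

fe = [1.8, 1.0, 0.8, 1.1, 1.6, 0.8, 1.1, 1.6, 2.6, 1.2, 1.4, 1.4,
      1.4, 2.0, 2.8, 4.1, 3.6, 2.0, 1.9, 6.0, 8.9, 11.2, 0.9]
na2so3 = [17.2, 16.2, 20.7, 23.3, 18.4, 20.9, 22.8, 17.7, 21.2, 20.8, 22.2, 12.7,
          18.1, 16.3, 16.3, 19.2, 20.1, 21.7, 22.3, 19.3, 18.7, 14.9, 15.1]
b0, b1 = 19.63779, -0.2544

n = len(fe)
residuals = sorted(y - (b0 + b1 * x) for x, y in zip(fe, na2so3))
root_mse = math.sqrt(sum(e * e for e in residuals) / (n - 2))

# Expected ordered residuals under normality (assumed approximation).
expected = [root_mse * NormalDist().inv_cdf((k - 0.375) / (n + 0.25))
            for k in range(1, n + 1)]


def pearson(u, v):
    """Pearson correlation coefficient of two equal-length sequences."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)


rho = pearson(residuals, expected)
c_crit = 0.964                      # critical value quoted in the report
normality_ok = rho >= c_crit
print(f"rho = {rho:.4f}, normality OK: {normality_ok}")
```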
Modified Levene Test for Variances:

TEST: H0 : The mean absolute deviations of the two groups are equal

       H1 : The mean absolute deviations of the two groups are not equal

P = 0.6576, α = 0.05 ⇒ P > α, so we fail to reject H0: the means are equal.

Equal Variance Test:

TEST: H0 : σd1 = σd2 (variance is constant)

       H1 : σd1 ≠ σd2 (variance is not constant)

P = 0.1990; α = 0.05 ⇒ P > α, so we fail to reject H0: the variance is constant.

From the Modified Levene test we conclude that the error variance is constant, which agrees with what we observed in the plots of residuals vs. Yhat and residuals vs. x. So we need not apply any transformations.
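
The mechanics of the Modified Levene test can be sketched as below. The report does not state how the observations were divided into two groups, so the split at the median Fe value is an assumption, and the resulting statistic is not expected to reproduce the SAS p-values exactly. The critical value t(0.975; 21) = 2.080 is the one used elsewhere in this report.

```python
import math
from statistics import mean, median

fe = [1.8, 1.0, 0.8, 1.1, 1.6, 0.8, 1.1, 1.6, 2.6, 1.2, 1.4, 1.4,
      1.4, 2.0, 2.8, 4.1, 3.6, 2.0, 1.9, 6.0, 8.9, 11.2, 0.9]
na2so3 = [17.2, 16.2, 20.7, 23.3, 18.4, 20.9, 22.8, 17.7, 21.2, 20.8, 22.2, 12.7,
          18.1, 16.3, 16.3, 19.2, 20.1, 21.7, 22.3, 19.3, 18.7, 14.9, 15.1]
b0, b1 = 19.63779, -0.2544
residuals = [y - (b0 + b1 * x) for x, y in zip(fe, na2so3)]

# Assumed grouping: low-Fe samples (x <= median) versus high-Fe samples.
cut = median(fe)
group1 = [e for x, e in zip(fe, residuals) if x <= cut]
group2 = [e for x, e in zip(fe, residuals) if x > cut]


def abs_dev_from_median(group):
    """Absolute deviations of the residuals from their group median."""
    m = median(group)
    return [abs(e - m) for e in group]


d1, d2 = abs_dev_from_median(group1), abs_dev_from_median(group2)
n1, n2 = len(d1), len(d2)
d1_bar, d2_bar = mean(d1), mean(d2)

# Pooled two-sample t statistic on the absolute deviations.
pooled_var = (sum((d - d1_bar) ** 2 for d in d1)
              + sum((d - d2_bar) ** 2 for d in d2)) / (n1 + n2 - 2)
t_levene = (d1_bar - d2_bar) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))

t_crit = 2.080  # t(0.975; n1 + n2 - 2 = 21)
constant_variance_ok = abs(t_levene) <= t_crit
print(f"t = {t_levene:.3f}, constant variance OK: {constant_variance_ok}")
```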
Conclusion:

From all the above tests performed we conclude the following:

At α = 0.05 the data do not show a statistically significant linear relation between the Na2SO3 (response) and Fe (predictor).

The fitted regression line in our model is represented by the equation:

$\hat{Y}_i = 19.63779 - 0.2544 x_i$

The t-test fails to reject H0: β1 = 0 (|t*| = 1.12 < 2.080), in agreement with the ANOVA F-test (F* = 1.26, p-value = 0.2784). The R2 value (R2 = 0.0565) indicates a poor fit: the model explains only about 5.65% of the variation in Na2SO3 when Fe is used as the predictor variable.

We took the significance level as α=0.05 to conduct all our tests and our confidence level is
95% for all conclusions.
We calculated confidence intervals for the slope and the intercept of the regression function, and we also calculated the confidence interval, prediction interval and confidence band for a given value of Fe (xh = 10). From these calculations we found that the prediction interval is wider than the confidence interval at the same confidence level of 95%. We also found that the confidence band is wider than the confidence interval.

We calculated the ANOVA table, with the degrees of freedom and the corresponding SSR, SSE, SSTO, MSR, MSE, F* and p-value, which summarize how much of the variation in Na2SO3 is explained by Fe.

The residual analysis showed that linearity was OK and that there was no funnel shape, which points to a constant error variance. This was further supported by the Modified Levene test, which indicates constant variance with equal means.

We also performed the normality test and found that normality is OK. The plot of the residuals against the normal scores appeared fairly linear, with a slight S shape indicating shorter tails at both ends, and this was supported by the normality test.

Mais conteúdo relacionado

Mais procurados

Mth 4108-1 b (ans)
Mth 4108-1 b (ans)Mth 4108-1 b (ans)
Mth 4108-1 b (ans)outdoorjohn
 
Mth 4108-1 c (ans)
Mth 4108-1 c (ans)Mth 4108-1 c (ans)
Mth 4108-1 c (ans)outdoorjohn
 
Lesson 26: Optimization II: Data Fitting
Lesson 26: Optimization II: Data FittingLesson 26: Optimization II: Data Fitting
Lesson 26: Optimization II: Data FittingMatthew Leingang
 
5 marks scheme for add maths paper 2 trial spm
5 marks scheme for add maths paper 2 trial spm5 marks scheme for add maths paper 2 trial spm
5 marks scheme for add maths paper 2 trial spmzabidah awang
 
Hsn course revision notes
Hsn course revision notesHsn course revision notes
Hsn course revision notesMissParker
 
Lesson 25: Unconstrained Optimization I
Lesson 25: Unconstrained Optimization ILesson 25: Unconstrained Optimization I
Lesson 25: Unconstrained Optimization IMatthew Leingang
 
Alg2 Final Keynote
Alg2 Final KeynoteAlg2 Final Keynote
Alg2 Final KeynoteChris Wilson
 
Elementary differential equations with boundary value problems solutions
Elementary differential equations with boundary value problems solutionsElementary differential equations with boundary value problems solutions
Elementary differential equations with boundary value problems solutionsHon Wa Wong
 

Mais procurados (16)

Mth 4108-1 b (ans)
Mth 4108-1 b (ans)Mth 4108-1 b (ans)
Mth 4108-1 b (ans)
 
Mth 4108-1 c (ans)
Mth 4108-1 c (ans)Mth 4108-1 c (ans)
Mth 4108-1 c (ans)
 
Lesson 26: Optimization II: Data Fitting
Lesson 26: Optimization II: Data FittingLesson 26: Optimization II: Data Fitting
Lesson 26: Optimization II: Data Fitting
 
AMU - Mathematics - 2004
AMU - Mathematics  - 2004AMU - Mathematics  - 2004
AMU - Mathematics - 2004
 
Chapter 09
Chapter 09Chapter 09
Chapter 09
 
L2 number
L2 numberL2 number
L2 number
 
5 marks scheme for add maths paper 2 trial spm
5 marks scheme for add maths paper 2 trial spm5 marks scheme for add maths paper 2 trial spm
5 marks scheme for add maths paper 2 trial spm
 
Deflection in beams 1
Deflection in beams 1Deflection in beams 1
Deflection in beams 1
 
Hsn course revision notes
Hsn course revision notesHsn course revision notes
Hsn course revision notes
 
Lesson 25: Unconstrained Optimization I
Lesson 25: Unconstrained Optimization ILesson 25: Unconstrained Optimization I
Lesson 25: Unconstrained Optimization I
 
Big-M Method Presentation
Big-M Method PresentationBig-M Method Presentation
Big-M Method Presentation
 
Alg2 Final Keynote
Alg2 Final KeynoteAlg2 Final Keynote
Alg2 Final Keynote
 
Elementary differential equations with boundary value problems solutions
Elementary differential equations with boundary value problems solutionsElementary differential equations with boundary value problems solutions
Elementary differential equations with boundary value problems solutions
 
14 dummy
14 dummy14 dummy
14 dummy
 
Day02
Day02 Day02
Day02
 
Sistem bilangan
Sistem bilanganSistem bilangan
Sistem bilangan
 

Semelhante a Statistics Project1

maths jee formulas.pdf
maths jee formulas.pdfmaths jee formulas.pdf
maths jee formulas.pdfGARRYB4
 
Jacobi and gauss-seidel
Jacobi and gauss-seidelJacobi and gauss-seidel
Jacobi and gauss-seidelarunsmm
 
W ee network_theory_10-06-17_ls2-sol
W ee network_theory_10-06-17_ls2-solW ee network_theory_10-06-17_ls2-sol
W ee network_theory_10-06-17_ls2-solAnkit Chaurasia
 
Ch 02 MATLAB Applications in Chemical Engineering_陳奇中教授教學投影片
Ch 02 MATLAB Applications in Chemical Engineering_陳奇中教授教學投影片Ch 02 MATLAB Applications in Chemical Engineering_陳奇中教授教學投影片
Ch 02 MATLAB Applications in Chemical Engineering_陳奇中教授教學投影片Chyi-Tsong Chen
 
StructuralTheoryClass2.ppt
StructuralTheoryClass2.pptStructuralTheoryClass2.ppt
StructuralTheoryClass2.pptChristopherArce4
 
Statistical Tools for the Quality Control Laboratory and Validation Studies
Statistical Tools for the Quality Control Laboratory and Validation StudiesStatistical Tools for the Quality Control Laboratory and Validation Studies
Statistical Tools for the Quality Control Laboratory and Validation StudiesInstitute of Validation Technology
 
Embedding and np-Complete Problems for 3-Equitable Graphs
Embedding and np-Complete Problems for 3-Equitable GraphsEmbedding and np-Complete Problems for 3-Equitable Graphs
Embedding and np-Complete Problems for 3-Equitable GraphsWaqas Tariq
 
Basic concepts of curve fittings
Basic concepts of curve fittingsBasic concepts of curve fittings
Basic concepts of curve fittingsTarun Gehlot
 
5 10c exercise_index the aluminum tilt series_ex on saed do not know cell pars
5 10c exercise_index the aluminum tilt series_ex on saed do not know cell pars5 10c exercise_index the aluminum tilt series_ex on saed do not know cell pars
5 10c exercise_index the aluminum tilt series_ex on saed do not know cell parsJoke Hadermann
 
Problemas (67) del Capítulo III de física II Ley de Gauss
Problemas (67) del Capítulo III de física II   Ley de GaussProblemas (67) del Capítulo III de física II   Ley de Gauss
Problemas (67) del Capítulo III de física II Ley de GaussLUIS POWELL
 
Design of a Lift Mechanism for Disabled People
Design of a Lift Mechanism for Disabled PeopleDesign of a Lift Mechanism for Disabled People
Design of a Lift Mechanism for Disabled PeopleSamet Baykul
 
centroid & moment of inertia
centroid & moment of inertiacentroid & moment of inertia
centroid & moment of inertiasachin chaurasia
 

Semelhante a Statistics Project1 (20)

maths jee formulas.pdf
maths jee formulas.pdfmaths jee formulas.pdf
maths jee formulas.pdf
 
Jacobi and gauss-seidel
Jacobi and gauss-seidelJacobi and gauss-seidel
Jacobi and gauss-seidel
 
W ee network_theory_10-06-17_ls2-sol
W ee network_theory_10-06-17_ls2-solW ee network_theory_10-06-17_ls2-sol
W ee network_theory_10-06-17_ls2-sol
 
BDPA IT Showcase: Production Planning Tools
BDPA IT Showcase: Production Planning ToolsBDPA IT Showcase: Production Planning Tools
BDPA IT Showcase: Production Planning Tools
 
Ch 02 MATLAB Applications in Chemical Engineering_陳奇中教授教學投影片
Ch 02 MATLAB Applications in Chemical Engineering_陳奇中教授教學投影片Ch 02 MATLAB Applications in Chemical Engineering_陳奇中教授教學投影片
Ch 02 MATLAB Applications in Chemical Engineering_陳奇中教授教學投影片
 
StructuralTheoryClass2.ppt
StructuralTheoryClass2.pptStructuralTheoryClass2.ppt
StructuralTheoryClass2.ppt
 
SMT1105-1.pdf
SMT1105-1.pdfSMT1105-1.pdf
SMT1105-1.pdf
 
Ch13s
Ch13sCh13s
Ch13s
 
Statistical controls for qc
Statistical controls for qcStatistical controls for qc
Statistical controls for qc
 
Statistical Tools for the Quality Control Laboratory and Validation Studies
Statistical Tools for the Quality Control Laboratory and Validation StudiesStatistical Tools for the Quality Control Laboratory and Validation Studies
Statistical Tools for the Quality Control Laboratory and Validation Studies
 
Solution 3 i ph o 35
Solution 3 i ph o 35Solution 3 i ph o 35
Solution 3 i ph o 35
 
Embedding and np-Complete Problems for 3-Equitable Graphs
Embedding and np-Complete Problems for 3-Equitable GraphsEmbedding and np-Complete Problems for 3-Equitable Graphs
Embedding and np-Complete Problems for 3-Equitable Graphs
 
Basic concepts of curve fittings
Basic concepts of curve fittingsBasic concepts of curve fittings
Basic concepts of curve fittings
 
Equation of second degree
Equation of second degreeEquation of second degree
Equation of second degree
 
5 10c exercise_index the aluminum tilt series_ex on saed do not know cell pars
5 10c exercise_index the aluminum tilt series_ex on saed do not know cell pars5 10c exercise_index the aluminum tilt series_ex on saed do not know cell pars
5 10c exercise_index the aluminum tilt series_ex on saed do not know cell pars
 
Shell theory
Shell theoryShell theory
Shell theory
 
Ch03 5
Ch03 5Ch03 5
Ch03 5
 
Problemas (67) del Capítulo III de física II Ley de Gauss
Problemas (67) del Capítulo III de física II   Ley de GaussProblemas (67) del Capítulo III de física II   Ley de Gauss
Problemas (67) del Capítulo III de física II Ley de Gauss
 
Design of a Lift Mechanism for Disabled People
Design of a Lift Mechanism for Disabled PeopleDesign of a Lift Mechanism for Disabled People
Design of a Lift Mechanism for Disabled People
 
centroid & moment of inertia
centroid & moment of inertiacentroid & moment of inertia
centroid & moment of inertia
 

Mais de shri1984

Metal Removal Processes
Metal Removal ProcessesMetal Removal Processes
Metal Removal Processesshri1984
 
Turning Programming
Turning ProgrammingTurning Programming
Turning Programmingshri1984
 
Cnc Offsets
Cnc OffsetsCnc Offsets
Cnc Offsetsshri1984
 
Cnc Manual Operations
Cnc Manual OperationsCnc Manual Operations
Cnc Manual Operationsshri1984
 
Cnc Maching Center
Cnc Maching CenterCnc Maching Center
Cnc Maching Centershri1984
 
Cnc Coordinates
Cnc CoordinatesCnc Coordinates
Cnc Coordinatesshri1984
 
Cnc Turning
Cnc TurningCnc Turning
Cnc Turningshri1984
 
Simulation Project
Simulation ProjectSimulation Project
Simulation Projectshri1984
 
Simulation Project 2
Simulation Project 2Simulation Project 2
Simulation Project 2shri1984
 
Probabilistic decision making
Probabilistic decision makingProbabilistic decision making
Probabilistic decision makingshri1984
 
Multi attribute decision making
Multi attribute decision makingMulti attribute decision making
Multi attribute decision makingshri1984
 
Advanced engineering economy
Advanced  engineering economyAdvanced  engineering economy
Advanced engineering economyshri1984
 
Statistics project2
Statistics project2Statistics project2
Statistics project2shri1984
 
Time Study Analysis Metrics
Time Study Analysis MetricsTime Study Analysis Metrics
Time Study Analysis Metricsshri1984
 
Logistics Transportation
Logistics TransportationLogistics Transportation
Logistics Transportationshri1984
 
Shriraam Madanagopal Internship Report
Shriraam Madanagopal Internship ReportShriraam Madanagopal Internship Report
Shriraam Madanagopal Internship Reportshri1984
 
Logistics Distribution Systems Design
Logistics Distribution Systems DesignLogistics Distribution Systems Design
Logistics Distribution Systems Designshri1984
 

Mais de shri1984 (18)

Metal Removal Processes
Metal Removal ProcessesMetal Removal Processes
Metal Removal Processes
 
Turning Programming
Turning ProgrammingTurning Programming
Turning Programming
 
Cnc Offsets
Cnc OffsetsCnc Offsets
Cnc Offsets
 
Cnc Manual Operations
Cnc Manual OperationsCnc Manual Operations
Cnc Manual Operations
 
Cnc Maching Center
Cnc Maching CenterCnc Maching Center
Cnc Maching Center
 
Cnc Coordinates
Cnc CoordinatesCnc Coordinates
Cnc Coordinates
 
Cad Cam
Cad CamCad Cam
Cad Cam
 
Cnc Turning
Cnc TurningCnc Turning
Cnc Turning
 
Simulation Project
Simulation ProjectSimulation Project
Simulation Project
 
Simulation Project 2
Simulation Project 2Simulation Project 2
Simulation Project 2
 
Probabilistic decision making
Probabilistic decision makingProbabilistic decision making
Probabilistic decision making
 
Multi attribute decision making
Multi attribute decision makingMulti attribute decision making
Multi attribute decision making
 
Advanced engineering economy
Advanced  engineering economyAdvanced  engineering economy
Advanced engineering economy
 
Statistics project2
Statistics project2Statistics project2
Statistics project2
 
Time Study Analysis Metrics
Time Study Analysis MetricsTime Study Analysis Metrics
Time Study Analysis Metrics
 
Logistics Transportation
Logistics TransportationLogistics Transportation
Logistics Transportation
 
Shriraam Madanagopal Internship Report
Shriraam Madanagopal Internship ReportShriraam Madanagopal Internship Report
Shriraam Madanagopal Internship Report
 
Logistics Distribution Systems Design
Logistics Distribution Systems DesignLogistics Distribution Systems Design
Logistics Distribution Systems Design
 

Último

Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostZilliz
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DaySri Ambati
 

Statistics Project1

  • 1. Advanced Engineering Statistics - IE 5318 - Fall 2008 - Project 1
    Report: Project Proposal
    The main objective of this project is to develop a model relating the concentration of sodium sulfite (Na2SO3) to the concentration of iron (Fe) present in waste water. We collected one waste water sample per day for 23 days, so the number of observations is n = 23. We performed lab analysis (ICP) to find Fe and sodium sulfite from plant production data.
    Fe (ppm)    Na2SO3 (%w)
     1.8        17.2
     1.0        16.2
     0.8        20.7
     1.1        23.3
     1.6        18.4
     0.8        20.9
     1.1        22.8
     1.6        17.7
     2.6        21.2
     1.2        20.8
     1.4        22.2
     1.4        12.7
     1.4        18.1
     2.0        16.3
     2.8        16.3
     4.1        19.2
     3.6        20.1
     2.0        21.7
     1.9        22.3
     6.0        19.3
     8.9        18.7
    11.2        14.9
     0.9        15.1
  • 2. Calculations of yhat and Residuals (e)
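The yhat and residual columns can be reproduced from the data table with a short script. This is a minimal sketch in plain Python rather than the SAS code used for the report; the helper name `fit_line` is hypothetical, and the data lists are transcribed from the table in the proposal.

```python
# Data transcribed from the project table: Fe in ppm (x), Na2SO3 in %w (y).
fe = [1.8, 1.0, 0.8, 1.1, 1.6, 0.8, 1.1, 1.6, 2.6, 1.2, 1.4, 1.4,
      1.4, 2.0, 2.8, 4.1, 3.6, 2.0, 1.9, 6.0, 8.9, 11.2, 0.9]
na2so3 = [17.2, 16.2, 20.7, 23.3, 18.4, 20.9, 22.8, 17.7, 21.2, 20.8, 22.2, 12.7,
          18.1, 16.3, 16.3, 19.2, 20.1, 21.7, 22.3, 19.3, 18.7, 14.9, 15.1]


def fit_line(x, y):
    """Least-squares estimates (b0, b1) for y = b0 + b1 * x (hypothetical helper)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    return y_bar - b1 * x_bar, b1


b0, b1 = fit_line(fe, na2so3)
y_hat = [b0 + b1 * xi for xi in fe]                     # fitted values
residuals = [yi - yh for yi, yh in zip(na2so3, y_hat)]  # e_i = y_i - yhat_i

# Print the table: x, y, yhat, e
for xi, yi, yh, e in zip(fe, na2so3, y_hat, residuals):
    print(f"{xi:5.1f} {yi:6.1f} {yh:9.4f} {e:9.4f}")
```

A quick sanity check on any such table is that the residuals sum to zero (up to rounding), which holds for every least-squares fit that includes an intercept.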
  • 3. Simple Linear Regression: In simple linear regression we plot Na2SO3 on the Y-axis as the response (dependent) variable against Fe on the X-axis as the predictor (independent) variable.
  • 4. Using SAS 9.1 we generate the above graph and the data set shown below.
    Calculation of MSE, b1, b0:
    $\sum x_i = 61.2$, $\sum x_i^2 = 317.82$, $S_x = 2.65411$, $\bar{X} = 2.661$
    $\sum y_i = 436.1$, $\sum y_i^2 = 8446.39$, $S_y = 2.84089$, $\bar{Y} = 18.961$
    $b_0 = \bar{Y} - b_1 \bar{X} = 19.63779$
    $\hat{\sigma}^2 = MSE = \dfrac{\sum_{i=1}^{n} y_i^2 - b_0 \sum_{i=1}^{n} y_i - b_1 \sum_{i=1}^{n} x_i y_i}{n - 2} = 7.97739$
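The sums, b1, b0 and MSE above can be checked directly from the raw data. The following is a sketch in plain Python (not the SAS program used in the report) that applies the same computational formula for MSE; the variable names are illustrative.

```python
# Data transcribed from the project table: Fe in ppm (x), Na2SO3 in %w (y).
fe = [1.8, 1.0, 0.8, 1.1, 1.6, 0.8, 1.1, 1.6, 2.6, 1.2, 1.4, 1.4,
      1.4, 2.0, 2.8, 4.1, 3.6, 2.0, 1.9, 6.0, 8.9, 11.2, 0.9]
na2so3 = [17.2, 16.2, 20.7, 23.3, 18.4, 20.9, 22.8, 17.7, 21.2, 20.8, 22.2, 12.7,
          18.1, 16.3, 16.3, 19.2, 20.1, 21.7, 22.3, 19.3, 18.7, 14.9, 15.1]

n = len(fe)
sum_x, sum_y = sum(fe), sum(na2so3)
sum_x2 = sum(x * x for x in fe)
sum_y2 = sum(y * y for y in na2so3)
sum_xy = sum(x * y for x, y in zip(fe, na2so3))

# Corrected sums of squares and cross-products
sxx = sum_x2 - sum_x ** 2 / n
sxy = sum_xy - sum_x * sum_y / n

b1 = sxy / sxx                       # slope
b0 = sum_y / n - b1 * sum_x / n      # intercept: Ybar - b1 * Xbar

# Computational formula for SSE, then MSE on n - 2 degrees of freedom
sse = sum_y2 - b0 * sum_y - b1 * sum_xy
mse = sse / (n - 2)

print(f"sum x = {sum_x:.1f}, sum x^2 = {sum_x2:.2f}")
print(f"sum y = {sum_y:.1f}, sum y^2 = {sum_y2:.2f}")
print(f"b1 = {b1:.4f}, b0 = {b0:.5f}, MSE = {mse:.5f}")
```

The computational formula subtracts large, nearly equal numbers, so with poorly scaled data it is safer to compute SSE as the sum of squared residuals instead.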
  • 5. The Model of SLR: The model equation is explained below.
    $Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$, $i = 1, \ldots, n$
      • $\beta_0$ is the y-intercept.
      • $\beta_1$ is the slope.
      • The random error term $\varepsilon_i$ (equation error + measurement error) has mean $E\{\varepsilon_i\} = 0$.
      • The error variance is constant: $V\{\varepsilon_i\} = \sigma^2$.
      • The errors are uncorrelated: $Cov(\varepsilon_i, \varepsilon_j) = 0$ for all $i, j$ with $i \neq j$.
    Regression Line Fit: We describe the linear relation between Na2SO3 (response) and Fe (predictor) by estimating the unknown parameters $\beta_0$ and $\beta_1$ with the values $b_0$ and $b_1$ respectively. From our class notes the estimated regression function is expressed as:
  • 6. $\hat{Y}_i = b_0 + b_1 X_i$
    Substituting the values of $b_0$ and $b_1$ obtained from our SAS output into the above equation, we get
    $\hat{Y}_i = 19.63779 - 0.2544 X_i$
    In our case there is a linear association between $\hat{y}_i$ and $x_i$.
    Inferences on our Parameters: 95% Confidence Interval for $\beta_1$
    The formula for the confidence interval is
    $C.I. = b_1 \pm t(1 - \alpha/2;\ n - 2)\, s\{b_1\}$
    From our SAS output, to find $\hat{\sigma}^2$:
    $MSE = (RMSE)^2 = (2.82443)^2 = 7.97739$
    $\sum_{i=1}^{n} x_i^2 - \dfrac{\left(\sum_{i=1}^{n} x_i\right)^2}{n} = S_x^2 (n - 1) = (2.65411)^2 (22) = 154.9748$
  • 7. $s\{b_1\} = \sqrt{\dfrac{\hat{\sigma}^2}{\sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2 / n}} = \sqrt{\dfrac{7.97739}{154.9748}} = 0.22688$
    At α = 0.05:
    $C.I. = b_1 \pm t(0.975;\ 21)\, s\{b_1\} = -0.2544 \pm (2.080)(0.22688) = (-0.7263,\ 0.2175)$
    Therefore we are 95% confident that the mean percentage of Na2SO3 changes by between -0.7263 and 0.2175 for each unit (1 ppm) increase in Fe. Because this interval contains 0, a slope of zero cannot be ruled out.
    Model Fit: Hypothesis test for slope: The t-test helps us determine whether there is a linear relationship between Fe and Na2SO3. The t* value is obtained from the SAS output and compared with the cutoff value from the t-distribution table.
    T-test for $\beta_1$: H0: $\beta_1 = 0$; H1: $\beta_1 \neq 0$; α = 0.05
    Our decision rule is as follows: if $|t^*| > t(1 - \alpha/2;\ n - 2)$, reject H0; else, fail to reject H0.
    $t^* = \dfrac{b_1}{s\{b_1\}} = \dfrac{-0.2544}{0.22688} = -1.121$
    $t(1 - \alpha/2;\ n - 2) = t(0.975;\ 21) = 2.080$
    $|t^*| = 1.121 < 2.080$, so as per our decision rule we fail to reject H0.
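The slope inference can be retraced from the summary quantities reported in this report. The sketch below is plain Python, not SAS; the critical value t(0.975; 21) = 2.080 is hard-coded from the t-table value quoted in the report rather than computed.

```python
import math

# Summary quantities quoted in the report
n = 23
sum_x, sum_x2 = 61.2, 317.82
b1 = -0.2544          # fitted slope
mse = 7.97739         # error mean square
t_crit = 2.080        # t(0.975; 21), table value quoted in the report

# Corrected sum of squares of x and the standard error of the slope
sxx = sum_x2 - sum_x ** 2 / n
se_b1 = math.sqrt(mse / sxx)

# Two-sided 95% confidence interval for beta1
ci_slope = (b1 - t_crit * se_b1, b1 + t_crit * se_b1)

# Two-sided t-test of H0: beta1 = 0
t_star = b1 / se_b1
reject_h0 = abs(t_star) > t_crit

print(f"s{{b1}} = {se_b1:.5f}")
print(f"95% CI for beta1 = ({ci_slope[0]:.4f}, {ci_slope[1]:.4f})")
print(f"t* = {t_star:.3f}, reject H0: {reject_h0}")
```

The interval and the test always agree: the two-sided test rejects H0 at level α exactly when the 100(1-α)% interval for the slope excludes zero.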
  • 8. At the 5% significance level, with $|t^*| = 1.121 < 2.080$, we cannot conclude that Na2SO3 and Fe have a linear relationship.
    Confidence Interval for Y-intercept: 100(1-α)% CI for $\beta_0$.
    Two sided: $b_0 \pm t(1 - \alpha/2;\ n - 2)\, s\{b_0\}$
    $s^2\{b_0\} = MSE \left[ \dfrac{1}{n} + \dfrac{\bar{X}^2}{\sum (X_i - \bar{X})^2} \right]$, so $s\{b_0\} = 0.84339$
    Applying the two-sided interval: $19.63779 \pm t(0.975;\ 21)(0.84339) = (17.8835,\ 21.3920)$
    Therefore we are 95% confident that the mean percentage of Na2SO3 at Fe = 0 ppm lies between 17.8835 and 21.3920.
    Analysis of Variance: From our SAS output,
      Regression Sum of Squares (SSR) = 10.02968
      Error Sum of Squares (SSE) = 167.5251
      Total Sum of Squares (SSTO) = 177.55478
      Regression Mean Square (MSR) = 10.02968
      Error Mean Square (MSE) = 7.97739
      Coefficient of Determination (R2) = 0.0565
    Coefficient of Determination: $R^2 = SSR/SSTO = 1 - SSE/SSTO = 0.0565$, with $0 \le R^2 \le 1$. It measures the proportion of the total variation in Na2SO3 that is explained by the regression on Fe.
    Coefficient of Correlation: $r = \pm\sqrt{R^2}$. Since the slope ($b_1$) in our model is negative, we take the negative root: $r = -\sqrt{0.0565} = -0.2377$, which is much closer to 0 than to -1.
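The intercept interval and the ANOVA quantities follow from a handful of summary numbers. The sketch below, in plain Python rather than SAS, recomputes them from values quoted in this report; the t critical value is the table value 2.080, hard-coded as an assumption.

```python
import math

# Summary quantities quoted in the report
n = 23
sum_x, sum_x2 = 61.2, 317.82
b0, b1 = 19.63779, -0.2544
ssr, sse = 10.02968, 167.5251
t_crit = 2.080                      # t(0.975; 21), table value

# ANOVA decomposition
ssto = ssr + sse                    # total sum of squares
msr = ssr / 1                       # regression mean square (1 df)
mse = sse / (n - 2)                 # error mean square (n - 2 df)
f_star = msr / mse                  # F statistic for H0: beta1 = 0

# Coefficient of determination and signed correlation
r2 = ssr / ssto
r = math.copysign(math.sqrt(r2), b1)   # r takes the sign of the slope

# Standard error and 95% CI for the intercept
x_bar = sum_x / n
sxx = sum_x2 - sum_x ** 2 / n
se_b0 = math.sqrt(mse * (1 / n + x_bar ** 2 / sxx))
ci_intercept = (b0 - t_crit * se_b0, b0 + t_crit * se_b0)

print(f"SSTO = {ssto:.5f}, MSE = {mse:.5f}, F* = {f_star:.2f}")
print(f"R^2 = {r2:.4f}, r = {r:.4f}")
print(f"s{{b0}} = {se_b0:.5f}, CI = ({ci_intercept[0]:.4f}, {ci_intercept[1]:.4f})")
```

In simple linear regression F* equals the square of the slope t statistic, so the F-test and the two-sided t-test for the slope lead to the same decision.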
  • 9. With $r = -0.2377$, there is only a weak linear relation between Na2SO3 and Fe.
    ANOVA Table:
    Source       DF   SS          MS = SS/DF   F      p-value
    Regression    1    10.02968   10.02968     1.26   0.2784
    Error        21   167.5251     7.97739
    Total        22   177.55478
    Confidence Interval for Mean Response:
    1. 95% Confidence Interval for $x_h = 10$
    $\hat{Y}_h = b_0 + b_1 x_h = 19.63779 - 0.2544(10) = 17.0938$
    $s\{\hat{Y}_h\} = \sqrt{MSE \left[ \dfrac{1}{n} + \dfrac{(x_h - \bar{X})^2}{\sum (X_i - \bar{X})^2} \right]} = 1.76638$
    $C.I. = \hat{Y}_h \pm t(1 - \alpha/2;\ n - 2)\, s\{\hat{Y}_h\} = 17.0938 \pm (2.080)(1.76638)$
    ⇒ C.I. for $x_h = 10$ = (13.4197, 20.7679)
    ⇒ We are thus 95% confident that the mean of the probability distribution of Na2SO3 lies between 13.4197 and 20.7679 %w when $x_h = 10$ ppm of Fe.
    2. 95% Prediction Interval for $x_h = 10$
    $s\{pred\} = \sqrt{MSE + s^2\{\hat{Y}_h\}} = 3.33128$
    $P.I. = \hat{Y}_h \pm t(1 - \alpha/2;\ n - 2)\, s\{pred\} = 17.0938 \pm t(0.975;\ 21)(3.33128)$
  • 10. $P.I. = 17.0938 \pm 6.929$
    ⇒ P.I. for $x_h = 10$ = (10.1647, 24.0229)
    ⇒ We predict with 95% confidence that the Na2SO3 value (%w) of a new observation at $x_h = 10$ ppm of Fe lies between 10.1647 and 24.0229.
    3. 95% Confidence Band for $x_h = 10$
    $C.B. = \hat{Y}_h \pm W\, s\{\hat{Y}_h\}$, where $W^2 = 2F(1 - \alpha;\ 2,\ n - 2) = 2F(0.95;\ 2,\ 21) = 2(3.4668) = 6.9336$, so $W = 2.6332$
    $C.B. = 17.0938 \pm (2.6332)(1.76638) = 17.0938 \pm 4.6512$
    ⇒ C.B. for $x_h = 10$ = (12.4426, 21.7450)
    ⇒ We are 95% confident that the true regression line lies between the upper and lower limits of the confidence band over the whole range of x. At $x_h = 10$ ppm of Fe, the band runs from 12.4426 to 21.7450.
    From the above results we infer that the CB is wider than the CI, because $W = 2.6332 > t(0.975;\ 21) = 2.080$, which is always true.
    The Confidence Band limits for several $x_h$ along the range of x:
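The three intervals at xh = 10 differ only in the multiplier and the standard error used. The sketch below, in plain Python rather than SAS, computes all three from summary values quoted in this report. Two assumptions are made: t(0.975; 21) = 2.080 is hard-coded from the t-table, and the F quantile uses the closed form F(1-α; 2, ν) = (ν/2)(α^(-2/ν) - 1), which is valid only when the numerator degrees of freedom equal 2.

```python
import math

# Summary quantities quoted in the report
n = 23
sum_x, sum_x2 = 61.2, 317.82
b0, b1 = 19.63779, -0.2544
mse = 7.97739
alpha = 0.05
t_crit = 2.080            # t(0.975; 21), table value
x_h = 10.0                # Fe level of interest, in ppm

x_bar = sum_x / n
sxx = sum_x2 - sum_x ** 2 / n
df = n - 2

y_h = b0 + b1 * x_h                                           # estimated mean response
se_mean = math.sqrt(mse * (1 / n + (x_h - x_bar) ** 2 / sxx)) # s{Yhat_h}
se_pred = math.sqrt(mse + se_mean ** 2)                       # s{pred}

# Working-Hotelling multiplier; closed-form F quantile for numerator df = 2 only
f_crit = (df / 2) * (alpha ** (-2 / df) - 1)
w = math.sqrt(2 * f_crit)

ci = (y_h - t_crit * se_mean, y_h + t_crit * se_mean)   # confidence interval
pi = (y_h - t_crit * se_pred, y_h + t_crit * se_pred)   # prediction interval
cb = (y_h - w * se_mean, y_h + w * se_mean)             # confidence band at x_h

for name, (lo, hi) in (("CI", ci), ("PI", pi), ("CB", cb)):
    print(f"{name}: ({lo:.4f}, {hi:.4f})")
```

Because the standard errors here are recomputed with the unrounded mean of x, the last digits can differ slightly from values computed by hand with a rounded mean.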
  • 11. Using the data from this table we plot the confidence band with the fitted regression line and data points:
    Residual Analysis: The residual analysis is used to verify the following model assumptions:
      • A linear model is reasonable - Mandatory.
      • The residuals have constant variance - Mandatory.
      • The residuals are normally distributed - Optional.
      • The residuals are uncorrelated - Mandatory.
      • The model is free from outliers - Optional.
    Plots: Linearity Analysis: The plot of the residuals against the predicted values shows whether a linear regression function is appropriate for our data. From our plot we infer that
  • 12. The points are randomly scattered, so linearity is OK, and there is no funnel shape, which indicates that our model has constant variance with no outliers.
    Time Series plot: Time series data often arise when monitoring industrial processes or tracking corporate business metrics. Time series analysis accounts for the fact that data points taken over time may have an internal structure (such as autocorrelation, trend or seasonal variation) that should be accounted for.
    Referred from: http://www.itl.nist.gov/div898/handbook/pmc/section4/pmc4.htm
    From our SAS output we interpret our time series plot as follows:
  • 13. The time series plot has no significance in our analysis, since our data is not based on time.
    Normality Analysis: The normality check is done to see whether the residuals are normally distributed, which is one of the desired assumptions of the simple linear regression model.
    Inference from the graph: The graph looks fairly straight. Normality seems to be OK, with a slight S shape indicating shorter tails at both ends. We can do a normality test to confirm what the graph suggests.
    Normality test: H0: Normality is OK; H1: Normality is violated.
    Our decision rule is: if $\hat{\rho} < c(\alpha, n)$, reject H0; else fail to reject H0, where $\hat{\rho}$ is the correlation between the residuals and their expected normal scores.
    Take α = 0.05. The tabulated cutoff used is c(0.1, 23) = 0.964, which is a stricter cutoff than the one for α = 0.05, and $\hat{\rho} = 0.97415$ (from the SAS output, using the corresponding ENRM values).
    ⇒ Since $\hat{\rho} > c(\alpha, n)$, we fail to reject H0 ⇒ Normality is OK.
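The correlation used in the normality test can be approximated directly from the data. The sketch below is plain Python, not the SAS procedure used for the report. It assumes Blom-type plotting positions (k - 0.375)/(n + 0.25) for the expected normal scores and breaks ties by sort order, so the result may differ slightly from the SAS value. The cutoff 0.964 is the table value quoted in the report, and the helper `pearson` is hypothetical.

```python
import math
from statistics import NormalDist

# Data transcribed from the project table: Fe in ppm (x), Na2SO3 in %w (y).
fe = [1.8, 1.0, 0.8, 1.1, 1.6, 0.8, 1.1, 1.6, 2.6, 1.2, 1.4, 1.4,
      1.4, 2.0, 2.8, 4.1, 3.6, 2.0, 1.9, 6.0, 8.9, 11.2, 0.9]
na2so3 = [17.2, 16.2, 20.7, 23.3, 18.4, 20.9, 22.8, 17.7, 21.2, 20.8, 22.2, 12.7,
          18.1, 16.3, 16.3, 19.2, 20.1, 21.7, 22.3, 19.3, 18.7, 14.9, 15.1]


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    sab = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    saa = sum((u - ma) ** 2 for u in a)
    sbb = sum((v - mb) ** 2 for v in b)
    return sab / math.sqrt(saa * sbb)


n = len(fe)
x_bar, y_bar = sum(fe) / n, sum(na2so3) / n
b1 = (sum((x - x_bar) * (y - y_bar) for x, y in zip(fe, na2so3))
      / sum((x - x_bar) ** 2 for x in fe))
b0 = y_bar - b1 * x_bar

# Ordered residuals paired with standard normal scores at Blom positions.
# Scaling the scores by sqrt(MSE) would not change the correlation.
ordered_resid = sorted(y - (b0 + b1 * x) for x, y in zip(fe, na2so3))
normal_scores = [NormalDist().inv_cdf((k - 0.375) / (n + 0.25))
                 for k in range(1, n + 1)]

rho_hat = pearson(ordered_resid, normal_scores)
c_crit = 0.964                      # cutoff quoted in the report
normality_ok = rho_hat >= c_crit    # fail to reject H0 when rho_hat >= cutoff

print(f"rho_hat = {rho_hat:.4f}, normality OK: {normality_ok}")
```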
  • 14. Modified Levene Test for Variances:
    Test on means: H0: the means of the absolute deviations $d_1$ and $d_2$ are equal; H1: the means are not equal.
    P = 0.6576, α = 0.05 ⇒ P > α, so we fail to reject H0: the means are equal.
    Equal Variance Test: H0: $\sigma_{d1} = \sigma_{d2}$ (variance is constant); H1: $\sigma_{d1} \neq \sigma_{d2}$ (variance is not constant).
    P = 0.1990, α = 0.05 ⇒ P > α, so we fail to reject H0: the variance is constant.
    From the Modified Levene test we conclude that the error variance is constant, which agrees with what we observed in the plots of residuals vs. yhat and residuals vs. x. So we need not apply any transformations.
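The report does not say how the two groups for the Modified Levene test were formed. The sketch below, in plain Python, assumes a split at the median of Fe, which is a common choice but not necessarily the one used in SAS, so its statistic need not reproduce the p-values above. It compares the mean absolute deviations of the residuals from their group medians with a pooled two-sample t statistic and uses the table value t(0.975; 21) = 2.080 as the cutoff.

```python
import math
from statistics import mean, median

# Data transcribed from the project table: Fe in ppm (x), Na2SO3 in %w (y).
fe = [1.8, 1.0, 0.8, 1.1, 1.6, 0.8, 1.1, 1.6, 2.6, 1.2, 1.4, 1.4,
      1.4, 2.0, 2.8, 4.1, 3.6, 2.0, 1.9, 6.0, 8.9, 11.2, 0.9]
na2so3 = [17.2, 16.2, 20.7, 23.3, 18.4, 20.9, 22.8, 17.7, 21.2, 20.8, 22.2, 12.7,
          18.1, 16.3, 16.3, 19.2, 20.1, 21.7, 22.3, 19.3, 18.7, 14.9, 15.1]

n = len(fe)
x_bar, y_bar = mean(fe), mean(na2so3)
b1 = (sum((x - x_bar) * (y - y_bar) for x, y in zip(fe, na2so3))
      / sum((x - x_bar) ** 2 for x in fe))
b0 = y_bar - b1 * x_bar
residuals = [y - (b0 + b1 * x) for x, y in zip(fe, na2so3)]

# ASSUMPTION: groups are split at the median Fe value (not stated in the report).
cut = median(fe)
group1 = [e for x, e in zip(fe, residuals) if x <= cut]
group2 = [e for x, e in zip(fe, residuals) if x > cut]


def abs_deviations(group):
    """Absolute deviations of residuals from their group median."""
    m = median(group)
    return [abs(e - m) for e in group]


d1, d2 = abs_deviations(group1), abs_deviations(group2)
n1, n2 = len(d1), len(d2)
d1_bar, d2_bar = mean(d1), mean(d2)

# Pooled two-sample t statistic on the absolute deviations
pooled_var = (sum((d - d1_bar) ** 2 for d in d1)
              + sum((d - d2_bar) ** 2 for d in d2)) / (n1 + n2 - 2)
t_levene = (d1_bar - d2_bar) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))

t_crit = 2.080                               # t(0.975; 21), table value
constant_variance = abs(t_levene) <= t_crit  # fail to reject H0

print(f"n1 = {n1}, n2 = {n2}, t = {t_levene:.3f}, "
      f"constant variance: {constant_variance}")
```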
  • 15. Conclusion: From all the above tests we conclude the following:
    The fitted regression line in our model is represented by the equation $\hat{Y}_i = 19.63779 - 0.2544 x_i$.
    The t-test for the slope fails to reject H0: $\beta_1 = 0$ ($|t^*| = 1.121 < 2.080$; F* = 1.26, p-value = 0.2784), so at α = 0.05 the data do not show a statistically significant linear relation between Na2SO3 (response) and Fe (predictor).
    The $R^2$ value (0.0565) indicates that the model explains only about 5.65% of the variation in Na2SO3, so Fe alone is a weak predictor of Na2SO3.
    We took the significance level as α = 0.05 for all our tests, and our confidence level is 95% for all conclusions.
    We calculated the confidence intervals for the slope and intercept of the regression function, and the confidence interval, prediction interval and confidence band for a given value of x ($x_h = 10$). We found that the prediction interval is wider than the confidence interval at the same 95% confidence level, and that the confidence band is also wider than the confidence interval.
    We calculated the ANOVA table with the degrees of freedom and the corresponding SSR, SSE, SSTO, MSR, MSE, F* and p-value, which summarize the relationship between Na2SO3 and Fe.
    From the residual analysis we found that linearity was OK and that the residual plots showed no funnel shape, which points to constant error variance. This was further supported by the Modified Levene test, which indicated constant variance with equal means.
    We also performed the normality test and found that normality is OK. The plot of the residuals against the normal scores appeared fairly linear, with a slight S shape indicating shorter tails at both ends, and the normality test supported this.