1) The document describes an experiment to analyze the effect of a single variable on a thin film process using analysis of variance (ANOVA).
2) It provides instructions on how to design a single-factor experiment, including choosing a factor, determining levels of the factor, controlling other variables, replicating experiments, randomizing the order, and analyzing results.
3) As an example, it describes an experiment investigating the effect of DC bias voltage on silicon dioxide etching, with three voltage levels and four replicates at each level. The data is analyzed using ANOVA calculations including sums of squares.
Lab 4 handout 043012
1. ChE/MatE 166 Advanced Thin Films San Jose State University
LAB 4: Experiment with a Single Factor: Analysis of
Variance (ANOVA)
Learning Objectives
1. Write clear objectives and statement of problem for an experiment.
2. Identify controllable and uncontrollable factors in an experimental set-up.
3. Choose a factor for a single factor ANOVA based on expected outcome and given
time and equipment constraints.
4. Determine the appropriate levels to be researched for a factor based on expected
outcome, equipment control, metrology precision, and time constraints.
5. Design an experiment using proper replication, randomization, and control of
variables.
6. Calculate the variation between levels using a sum of squares method.
7. Evaluate whether the variation between levels is statistically significant using an F-test.
8. Plot data of all the levels to show variation between levels.
9. Organize technical information into a clear and concise formal laboratory report.
Equipment
Each team will be using ANOVA techniques to determine the effect of a single variable
on the process. The group tasks are:
Team 1: Oxidation
Team 2: Wet or Dry Etching of Oxide
Team 3: Wet Etching of Al
Team 4: Deposition of Al
The equipment and metrology tools are the same as those discussed in Labs 2 and 3. See
those lab handouts for more details.
Design of a Single-Factor Experiment
In this lab module, you will design an experiment using a simple, single-factor
approach. One-way ANOVA (analysis of variance) is a method used to compare two or
more data sets to determine whether their means are the same or different. This will allow
you to compare the data collected when you vary your factor to determine whether the factor
has a statistically significant impact.
Note: The design steps and golf analogy used below are taken from D.C. Montgomery,
Design and Analysis of Experiments, John Wiley & Sons, (1997).
1. Statement of Problem
The first step in designing an experiment is to clearly state what the problem is or what
you are trying to investigate. The statement of your problem is your overall goal for
performing the experiment. Some typical goals for experimenting include improving
process yield, reducing variability or obtaining closer precision to targeted goals,
reducing time or cost, or improving the product.
Example: My golf score varies wildly from game to game. I want to determine what
factors affect my golf score and quantify the influence each factor has on the overall
score.
2. Determination of Variables
In order to design a controlled experiment, all the factors that can influence the overall
outcome must be assessed. Variables can be classed into two groups: controllable
variables (ones you can change) and uncontrollable variables (ones that you cannot
alter).
Example: I developed a list of variables that influence my golf score (or may influence
my golf score) and classified them as ones I can control and ones I can't:
weather: uncontrollable
riding a cart or walking: controllable
type of clubs used: controllable
type of ball used: controllable
difficulty of the course: uncontrollable
number of players assigned to play with me: uncontrollable
ability level of other players: uncontrollable
type of shoes: controllable
type of beverage: controllable
time of day: controllable
amount of time I warm-up at the driving range: controllable
3. Choosing a Factor to Investigate
You need to choose variables and controls that are within the limits of your experimental
framework (can give valuable data with the time, supplies, and cost you have allotted).
There are several different kinds of experiments you can design. A common experiment
is one-factor-at-a-time, where the engineer varies one factor and holds all the rest
constant. (The drawback of this approach is that it does not reveal whether
factors are inter-related. In future labs, we will design full factorial experiments to
investigate the influence of multiple factors at once.)
When choosing a factor to investigate, you need to consider a number of things. The first
is that you want to choose a factor that previous experience or engineering knowledge
tells you is significant. You need to determine how the influence of the factor will be
quantified (what metrology and statistics you will use) to ensure that the influence is
measurable.
One-way ANOVA (analysis of variance) will investigate the influence of the factor by
performing the experiment at different levels (such as high, medium, and low) and
statistically comparing the data. The value of each level needs to be controllable. The
levels must be chosen to have a significant, quantifiable, and measurably different effect
on the end result. You need to choose a range of levels that is broad enough to
investigate the full impact of the variable.
Example: I believe that the amount of time I warm up at the driving range will have a
significant impact on my game. My hypothesis is that if I do not warm up at all or only
for a brief time (less than 15 minutes), I will be stiff and my score will be poor.
However, if I warm up too much (over 40 minutes), I will be tired and my game score
will also suffer. To test this hypothesis, I need to choose levels of warming up that are
sufficiently different from one another. The levels I will test are warming up for 0, 10, 30,
and 50 minutes. The levels are different enough to be controllable: I can accurately time 10
minutes versus 30 minutes. Investigating the influence of 10 minutes versus 12 minutes
would not be controllable, because when a warm-up starts and ends is vague
enough that a 2-minute difference is not significant. Note that in this experiment, I am
choosing a broad range of levels to investigate a maximum-minimum type of influence:
not warming up enough and warming up too much.
4. Control of Other Variables
In this experiment, you are only trying to investigate the influence of one factor. The
other controllable variables should be set to values that you have a tight control over and
are within normal range for a typical process.
Example: I want to wear my normal shoes and use my normal ball and clubs. Ideally,
these would all be done at the same time of day and under the same weather conditions.
5. Randomization & Replication
Once you run your experiments, you are going to draw conclusions about the influence the
factors had. (In the golf example, I would compare golf scores to see how the factors
caused the scores to vary.) To do this, you must also account for the fact that the data
come from two (or more) different experiments. In other words, you want to be able to
say the variation is due to those variables, not just to the statistical variation
from run to run. You also need to guarantee that some other, uncontrollable variable
wasn’t influencing your results. To do this, your experiments need to contain both
replication and randomization.
Replication involves repeating the same run more than once, such as playing a golf game
using the same warm-up regimen twice. You can compare the scores of the replicates
to determine the run-to-run variation. The more replicates you have, the more statistical
confidence you will have in your data. However, there is a limit (based on time, money,
and supplies) on how many runs you can do.
Example: I have determined that my resources (time and money) allow me to play eight
rounds of golf. Therefore, I can have 2 replicates at every data point.
It is also important to randomize the order in which you run the experiments. This will
average out the influence of any other factors that may be contributing but that you can't
control.
Example: Some factors that may influence my golf game but I can't control are the
weather and my assigned golf partners. Also, my score may improve just because I am
playing more games. To average out these factors, I perform the eight games with
varying levels in a random order.
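The replication and randomization steps of the golf example can be sketched in a few lines of Python. This is only an illustration: the function name `randomized_run_order` and the seed value are assumptions made here, not part of the lab procedure, and any other source of randomness (a random number table, drawing slips of paper) serves the same purpose.

```python
import random

def randomized_run_order(levels, replicates, seed=None):
    """Return every (level, replicate) run once, in shuffled order."""
    runs = [(level, rep) for level in levels for rep in range(1, replicates + 1)]
    rng = random.Random(seed)  # a fixed seed makes the plan reproducible
    rng.shuffle(runs)
    return runs

# Golf example: four warm-up times (minutes), two replicates each = eight rounds.
plan = randomized_run_order(levels=[0, 10, 30, 50], replicates=2, seed=166)
for game, (warm_up, rep) in enumerate(plan, start=1):
    print(f"Game {game}: warm up {warm_up} min (replicate {rep})")
```

Writing the run order down before the first run keeps the order from being influenced by convenience, which is the point of randomizing.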
6. Monitoring the Experiment & Quantifying Results
For the golf game example, it is very easy to quantify the results. I total up my golf score
for the entire game. When I am collecting my data, I want to be careful to note as many
other factors as possible (weather, partners, physical condition) in case I need to refer to
that information when discussing my data.
7. Analyzing Results
Analysis of variance (ANOVA) is a statistical technique that can be used to determine whether
the differences between levels are significant. An analysis of variance is done by comparing
the variation within a level against the variation across the whole population (all levels).
The sum of squares is used (this is the sum of the squared differences between each
replicate's value and the average value). The square term is used to factor out whether the
sample is above or below the mean. That is, if the square weren’t taken, variations below the
mean could cancel variations above the mean and make it seem as if there were no variation.
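A small numerical illustration of why the deviations are squared is given below. The four replicate values are hypothetical numbers chosen for this sketch, not data from the lab.

```python
# Hypothetical replicate values for one level (illustration only).
values = [9.0, 10.0, 11.0, 10.0]
mean = sum(values) / len(values)

deviations = [v - mean for v in values]
sum_of_deviations = sum(deviations)               # positive and negative terms cancel
sum_of_squares = sum(d ** 2 for d in deviations)  # every term is non-negative

print(f"mean              = {mean}")
print(f"sum of deviations = {sum_of_deviations}")
print(f"sum of squares    = {sum_of_squares}")
```

The raw deviations add up to zero even though the replicates clearly differ, while the sum of squares stays positive and so can serve as a measure of variation.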
ANOVA
Figure 1 shows the plot of three samples of data taken from three populations. It cannot
be determined whether the means are statistically the same or not by plotting the data.
ANOVA needs to be done to ascertain whether the means are the same or not. Figure 2
details the steps needed to determine if the means of the three samples are statistically
different. Table 1 gives the factors calculated in ANOVA. Table 2 shows the raw data
labeled as it is used in an ANOVA calculation.
[Figure 1 image: "Is there any difference in mean?" Samples from Levels (treatments) 1, 2, and 3 plotted on a common axis, with the sample means $\bar{x}_1$, $\bar{x}_2$, and $\bar{x}_3$ marked.]
Figure 1: Data from three samples, where $\bar{x}_1$ = mean of sample 1; $\bar{x}_2$ = mean of sample 2;
$\bar{x}_3$ = mean of sample 3.
[Figure 2 flowchart:]
Population 1, Population 2, Population 3, …
-> Sample 1, Sample 2, Sample 3, each with observations $(x_1, x_2, x_3, \ldots, x_n)$
-> Obtain each mean and standard deviation $(\bar{x}, s)$
-> Perform F test
-> Confidence intervals
-> Interpreting result
Figure 2: Steps needed to determine if the means of the three samples are statistically
different.
Table 1: Summary of ANOVA calculation.
Hypothesis:
  $H_0: \mu_1 = \mu_2 = \mu_3 = \cdots = \mu_a$
  $H_1$: At least one $\mu$ is different from the others
  where $\mu_i$ = the average of population $i$
F test:
  $F_0 = \dfrac{MS_{Treatments}}{MS_E}$ (See Tables 2 and 3 for the detailed calculation.)
  where $MS_{Treatments}$ = the mean square due to differences in the treatment means
  $MS_E$ = the mean square due to random error
Confidence interval on the $i$th treatment mean:
  $\bar{y}_{i\cdot} - t_{\alpha/2,\,N-a}\sqrt{\dfrac{MS_E}{n}} \le \mu_i \le \bar{y}_{i\cdot} + t_{\alpha/2,\,N-a}\sqrt{\dfrac{MS_E}{n}}$
  where $\bar{y}_{i\cdot}$ = the average in treatment $i$
  $N$ = total observations, $a$ = number of treatments (levels)
  $n$ = number of observations in a treatment
Reject $H_0$ if:
  $F_0 > F_{\alpha,\,a-1,\,N-a}$
  where d.f. = $(a-1,\ N-a)$
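Table 2: Raw data of ANOVA.
Treatment (level)   Data                                          Total              Average
1                   $y_{11}$  $y_{12}$  …  $y_{1n}$               $y_{1\cdot}$       $\bar{y}_{1\cdot}$
2                   $y_{21}$  $y_{22}$  …  $y_{2n}$               $y_{2\cdot}$       $\bar{y}_{2\cdot}$
⋮                   ⋮         ⋮         …  ⋮                      ⋮                  ⋮
a                   $y_{a1}$  $y_{a2}$  …  $y_{an}$               $y_{a\cdot}$       $\bar{y}_{a\cdot}$
                                                                  $y_{\cdot\cdot}$   $\bar{y}_{\cdot\cdot}$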
ANOVA requires mathematical manipulation of all the values in Table 2. Table 3 details
the calculations needed in ANOVA.
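Table 3: The analysis of variance table for a single factor.
Source of variation: Between treatments
  Sum of squares: $SS_{Treatments} = n\sum_{i=1}^{a}(\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot})^2$
  Degrees of freedom: $a - 1$
  Mean square: $MS_{Treatments} = \dfrac{SS_{Treatments}}{a-1}$
  $F_0$: $F_0 = \dfrac{MS_{Treatments}}{MS_E}$
Source of variation: Error (within treatments)
  Sum of squares: $SS_E = SS_T - SS_{Treatments}$
  Degrees of freedom: $N - a$
  Mean square: $MS_E = \dfrac{SS_E}{N-a}$
Source of variation: Total
  Sum of squares: $SS_T = \sum_{i=1}^{a}\sum_{j=1}^{n}(y_{ij} - \bar{y}_{\cdot\cdot})^2$
  Degrees of freedom: $N - 1$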
The sum of squares for levels is the sum of squares due to differences in the treatment
means:
$$SS_{Treatments} = n\sum_{i=1}^{a}(\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot})^2$$
where $y_{i\cdot}$ is the sum of all the values in Level (treatment) $i$: $y_{i\cdot} = \sum_{j=1}^{n} y_{ij}$
$\bar{y}_{i\cdot}$ is the average of all the values in Level (treatment) $i$: $\bar{y}_{i\cdot} = \dfrac{y_{i\cdot}}{n}$
$\bar{y}_{\cdot\cdot}$ is the average of all the samples (all levels): $\bar{y}_{\cdot\cdot} = \dfrac{y_{\cdot\cdot}}{N}$
$n$ is the number of replicates in each level
$a$ is the number of levels: $a \cdot n = N$
The sum of squares for the total population is given as:
$$SS_T = \sum_{i=1}^{a}\sum_{j=1}^{n}(y_{ij} - \bar{y}_{\cdot\cdot})^2$$
where $SS_T$ is the sum of squares for the entire sample population
$N$ is the total number of samples
The sum of squares for error is the overall variation due to random error:
$$SS_E = SS_T - SS_{Treatments}$$
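For reference, the subtraction works because the total sum of squares splits into a between-level part and a within-level part. The identity below is the standard partition for a balanced one-way layout, written in the notation of this handout; it is background, not an extra calculation required for the lab.

```latex
\sum_{i=1}^{a}\sum_{j=1}^{n}\left(y_{ij}-\bar{y}_{\cdot\cdot}\right)^{2}
  = n\sum_{i=1}^{a}\left(\bar{y}_{i\cdot}-\bar{y}_{\cdot\cdot}\right)^{2}
  + \sum_{i=1}^{a}\sum_{j=1}^{n}\left(y_{ij}-\bar{y}_{i\cdot}\right)^{2}
\qquad\Longrightarrow\qquad
SS_E = \sum_{i=1}^{a}\sum_{j=1}^{n}\left(y_{ij}-\bar{y}_{i\cdot}\right)^{2}
```

The cross term drops out because the deviations within each level sum to zero, $\sum_{j=1}^{n}(y_{ij}-\bar{y}_{i\cdot}) = 0$. In words, $SS_E$ is the scatter of the replicates around their own level average.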
An F-test is used to compare the mean square of the levels with the mean square of
the random errors (each sum of squares divided by its degrees of freedom):
$$F_0 = \frac{MS_{Treatments}}{MS_E}$$
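The quantities in Table 3 can be computed with a short function. The sketch below assumes a balanced design (the same number of replicates $n$ in every level), as in this handout; the function name `one_way_anova`, the dictionary keys, and the toy data are all choices made for this illustration.

```python
def one_way_anova(groups):
    """Balanced one-way ANOVA table for a list of equally sized groups."""
    a = len(groups)            # number of levels (treatments)
    n = len(groups[0])         # replicates per level
    if any(len(g) != n for g in groups):
        raise ValueError("this sketch assumes equal replicates in every level")
    N = a * n                  # total number of observations

    grand_mean = sum(sum(g) for g in groups) / N
    level_means = [sum(g) / n for g in groups]

    ss_treatments = n * sum((m - grand_mean) ** 2 for m in level_means)
    ss_total = sum((y - grand_mean) ** 2 for g in groups for y in g)
    ss_error = ss_total - ss_treatments

    ms_treatments = ss_treatments / (a - 1)
    ms_error = ss_error / (N - a)
    return {
        "SS_Treatments": ss_treatments, "df_Treatments": a - 1,
        "SS_E": ss_error, "df_E": N - a,
        "SS_T": ss_total, "df_T": N - 1,
        "MS_Treatments": ms_treatments, "MS_E": ms_error,
        "F0": ms_treatments / ms_error,
    }

# Toy data (hypothetical): three levels, three replicates each.
toy = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
table = one_way_anova(toy)
for name, value in table.items():
    print(f"{name:>14}: {value:.4g}")
```

For the toy data the level means are 2, 3, and 6, so most of the variation lies between levels and $F_0$ comes out large relative to 1.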
The criterion is that if the calculated $F_0$ is less than a critical value, then the
levels are not statistically different. A significance level needs to be set; typically
$\alpha = 0.05$ is chosen, which corresponds to a confidence level of $(1-\alpha)$, or 0.95.
If $F_0$ is greater than $F_{critical} = F_{\alpha,\,a-1,\,N-a}$ (rejection region), the levels are
significantly different.
The $F_{critical}$ value can be found using a chart that is specific to the significance level
chosen. (See Lab 2 Handout, Table 3 for $F_{critical}$ for $\alpha = 0.05$.) To use the table you need
the degrees of freedom of $MS_{Treatments}$ and $MS_E$, which are $a-1$ and $N-a$, respectively.
Example
An experiment was run to investigate the influence of DC bias voltage on the amount of
silicon dioxide etched from a wafer in a plasma etch process. Three different levels of DC
bias were studied, and four replicates were run at each level in random order, resulting in
the data in Table 4.
Table 4: Amount of silicon dioxide etched as a function of DC bias voltage in a plasma
etcher.
DC Bias     Amount Etched (Angstroms), by replicate     Total      Average
(Volts)     1         2         3         4
398         283.5     236       231.5     228           979        244.75
485         329       330       336       384.5         1379.5     344.875
571         474       477.5     470       474.5         1896       474
                                                        4254.5     354.54
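The Total and Average columns of Table 4 can be checked with a few lines of Python. The variable names below are chosen for this sketch; the numbers are the replicate values from the table.

```python
# Amount etched (Angstroms) at each DC bias (Volts), from Table 4.
etched = {
    398: [283.5, 236, 231.5, 228],
    485: [329, 330, 336, 384.5],
    571: [474, 477.5, 470, 474.5],
}

totals = {bias: sum(values) for bias, values in etched.items()}
averages = {bias: totals[bias] / len(values) for bias, values in etched.items()}

grand_total = sum(totals.values())
grand_average = grand_total / sum(len(values) for values in etched.values())

for bias in etched:
    print(f"{bias} V: total = {totals[bias]}, average = {averages[bias]}")
print(f"grand total = {grand_total}, grand average = {grand_average:.2f}")
```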
Hypothesis
$H_0: \mu_1 = \mu_2 = \mu_3$
$H_1$: At least one $\mu$ is different from the others
Sum of Squares for Levels
$$SS_{Treatments} = n\sum_{i=1}^{a}(\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot})^2$$
$$= 4\left[(244.75 - 354.54)^2 + (344.87 - 354.54)^2 + (474 - 354.54)^2\right]$$
$$= 105672$$
Degrees of freedom = $a - 1 = 3 - 1 = 2$
Sum of Squares for the Total Population
$$SS_{Total} = SS_T = \sum_{i=1}^{a}\sum_{j=1}^{n}(y_{ij} - \bar{y}_{\cdot\cdot})^2$$
$$= \left[(283.5 - 354.54)^2 + (236 - 354.54)^2 + (231.5 - 354.54)^2 + \cdots + (474.5 - 354.54)^2\right]$$
$$= 109857$$
Degrees of freedom = $N - 1 = 12 - 1 = 11$
Sum of Squares for Error
$$SS_{Error} = SS_{Total} - SS_{Treatments} = 109857 - 105672 = 4185$$
Degrees of freedom = $N - a = 12 - 3 = 9$
F-Statistic
$$MS_{Treatments} = \frac{SS_{Treatments}}{a-1} = \frac{105672}{2} = 52836$$
$$MS_E = \frac{SS_E}{N-a} = \frac{4185}{9} = 465$$
$$F_0 = \frac{MS_{Treatments}}{MS_E} = \frac{52836}{465} = 113.63$$
where $F_{critical} = F_{\alpha,\,a-1,\,N-a} = F_{0.05,\,2,\,9} = 4.26$ (determined from Lab 2, Table 3).
The summary of the ANOVA calculation for the example is given in Table 5.
Table 5: Summary of the ANOVA calculation for the plasma etcher example.
Source of Variation     Sum of Squares     Degrees of Freedom     Mean Square     F0
Treatments (Levels)     105672             2                      52836           113.63
Error                   4185               9                      465
Total                   109857             11
Because $F_0 = 113.63 > 4.26$, we reject the null hypothesis ($H_0$). We can conclude
that the mean oxide thicknesses etched at the different DC bias voltages are different.
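The hand calculation summarized in Table 5 can be reproduced from the raw data of Table 4. The sketch below uses only the formulas from this handout; the variable names are chosen for the illustration, and the critical value 4.26 is the tabulated $F_{0.05,\,2,\,9}$ quoted above rather than something the code computes.

```python
# Replicate data from Table 4 (Angstroms), one list per DC bias level.
data = [
    [283.5, 236, 231.5, 228],    # 398 V
    [329, 330, 336, 384.5],      # 485 V
    [474, 477.5, 470, 474.5],    # 571 V
]
a = len(data)       # number of levels
n = len(data[0])    # replicates per level
N = a * n           # total observations

grand_mean = sum(sum(level) for level in data) / N
level_means = [sum(level) / n for level in data]

ss_treatments = n * sum((m - grand_mean) ** 2 for m in level_means)
ss_total = sum((y - grand_mean) ** 2 for level in data for y in level)
ss_error = ss_total - ss_treatments

ms_treatments = ss_treatments / (a - 1)
ms_error = ss_error / (N - a)
f0 = ms_treatments / ms_error

f_critical = 4.26   # F(0.05; 2, 9) as quoted in this handout
reject_h0 = f0 > f_critical

print(f"SS_Treatments = {ss_treatments:.0f} (df = {a - 1})")
print(f"SS_E          = {ss_error:.0f} (df = {N - a})")
print(f"SS_T          = {ss_total:.0f} (df = {N - 1})")
print(f"F0            = {f0:.2f}, reject H0: {reject_h0}")
```

Because the code carries unrounded averages, intermediate values can differ from the hand calculation in the last decimal place, but the rounded table entries agree.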
Confidence Interval
With 95% confidence, the true mean of each treatment is within:
$$\bar{y}_{i\cdot} - t_{\alpha/2,\,N-a}\sqrt{\frac{MS_E}{n}} \le \mu_i \le \bar{y}_{i\cdot} + t_{\alpha/2,\,N-a}\sqrt{\frac{MS_E}{n}}$$
where $t_{\alpha/2,\,N-a} = t_{0.025,\,9} = 2.262$.
$$244.75 - 2.262\sqrt{\frac{465}{4}} \le \mu_1 \le 244.75 + 2.262\sqrt{\frac{465}{4}}$$
$$220.36 \text{ Angstroms} \le \mu_1 \le 269.14 \text{ Angstroms}$$
$$344.87 - 2.262\sqrt{\frac{465}{4}} \le \mu_2 \le 344.87 + 2.262\sqrt{\frac{465}{4}}$$
$$320.48 \text{ Angstroms} \le \mu_2 \le 369.26 \text{ Angstroms}$$
$$474 - 2.262\sqrt{\frac{465}{4}} \le \mu_3 \le 474 + 2.262\sqrt{\frac{465}{4}}$$
$$449.61 \text{ Angstroms} \le \mu_3 \le 498.39 \text{ Angstroms}$$
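The three intervals share the same half-width, $t_{\alpha/2,\,N-a}\sqrt{MS_E/n}$, which a short script makes explicit. The sketch below takes $MS_E = 465$ and $t_{0.025,\,9} = 2.262$ from this handout as given inputs; the variable names are assumptions of the illustration. It uses the unrounded level average 344.875, so the second interval can differ from the hand calculation by about 0.005 Angstroms.

```python
import math

ms_error = 465.0     # MS_E from Table 5
n = 4                # replicates per level
t_critical = 2.262   # t(0.025; 9) as quoted in this handout

half_width = t_critical * math.sqrt(ms_error / n)

# Level averages from Table 4, keyed by DC bias (Volts).
level_means = {398: 244.75, 485: 344.875, 571: 474.0}
intervals = {
    bias: (mean - half_width, mean + half_width)
    for bias, mean in level_means.items()
}

print(f"half-width = {half_width:.2f} Angstroms")
for bias, (low, high) in intervals.items():
    print(f"{bias} V: {low:.2f} <= mu <= {high:.2f} Angstroms")
```

None of the three intervals overlap, which is consistent with rejecting the null hypothesis.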
Results Using Minitab
From Minitab results (Table 6), we can conclude that we reject the null hypothesis
because the P-value is smaller than 0.05 and $F_0 = 113.63 > 4.26$. In other words, we
accept the alternative hypothesis that the mean oxide thicknesses etched are
different, and the level of DC bias voltage affects the mean of the oxide etched.
Table 6: Results of ANOVA analysis of plasma etcher example using Minitab.
Analysis of Variance
Source DF SS MS F P
Factor 2 105672 52836 113.63 0.000
Error 9 4185 465
Total 11 109857
Individual 95% CIs For Mean
Based on Pooled StDev
Level N Mean StDev ---+---------+---------+---------+---
398 4 244.75 26.04 (--*--)
485 4 344.87 26.60 (--*--)
571 4 474.00 3.08 (--*--)
---+---------+---------+---------+--
240 320 400 480
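The StDev column of Table 6 and the pooled standard deviation behind its confidence intervals can be reproduced with Python's standard `statistics` module. This is a sketch for checking the numbers, not a description of how Minitab computes them internally; the pooled value is calculated here as the square root of the degrees-of-freedom-weighted average of the level variances, which for this design equals $\sqrt{MS_E}$.

```python
import math
import statistics

# Replicate data from Table 4 (Angstroms), keyed by DC bias (Volts).
etched = {
    398: [283.5, 236, 231.5, 228],
    485: [329, 330, 336, 384.5],
    571: [474, 477.5, 470, 474.5],
}

# Sample standard deviation (n - 1 in the denominator) for each level.
level_stdevs = {bias: statistics.stdev(values) for bias, values in etched.items()}

# Pooled variance: within-level sums of squares over the error degrees of freedom.
pooled_ss = sum((len(v) - 1) * statistics.variance(v) for v in etched.values())
pooled_df = sum(len(v) - 1 for v in etched.values())
pooled_stdev = math.sqrt(pooled_ss / pooled_df)

for bias, s in level_stdevs.items():
    print(f"{bias} V: StDev = {s:.2f}")
print(f"pooled StDev = {pooled_stdev:.2f} (error df = {pooled_df})")
```

The much smaller spread at 571 V is visible in the per-level values, while the pooled figure treats all three levels as sharing one error variance.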
Experimental Plan
Complete the one-way ANOVA worksheet to develop a detailed designed experiment to
investigate the influence of one variable on your assigned process.
The designed experiment needs to be carried out and analyzed in two weeks so project
management is a very important part of the design process. It may be necessary to divide
the workload up between team members.
Laboratory Report
Your report should include the following:
-Experiment objective
-A brief discussion about the theory of each process
-One-way ANOVA design, including controlled and uncontrolled variables, levels chosen and why,
and procedures
-A detailed statistical analysis
Average value for each level
ANOVA analysis presented as in Table 3
-Plot of your data showing variation within level and trend between levels
-Discussion of influence of your variable on the process
Is it statistically significant?
Is it what your hypothesis predicted? Why or why not?
-Discussion of possible future experiments to further improve the process
-Summary/ conclusion
Acknowledgments
The lab was created by Prof. G. Young and Prof. S. Gleixner of the Chemical and
Materials Engineering department and Irene Susanti Wibowo as part of her M.S. project
in Industrial Systems and Engineering.
Lab Evaluation
This portion of the lab will not be graded. I would greatly appreciate it if you did take the
time to answer the following questions in regards to your overall satisfaction with the lab.
Answers to these questions will be collected anonymously at the same time you turn in
your lab report. Please take the time to turn in the attached ANONYMOUS survey.
Even if you’ve got no comments please turn in the blank form so I know that you at least
read the survey questions.
ANONYMOUS LAB SURVEY: NO NAME PLEASE
1. What was the most useful learning experience that this lab provided you?
2. Do you feel the one-way ANOVA instructions provided you with new information that will be
helpful in designing future experiments?
3. Do you feel the step-by-step process of designing an experiment provided you with new
information that will be helpful in designing future experiments?
4. What would you suggest to improve how the lab requires you to explore the concepts of
designing a single-factor experiment?