This document discusses analysis of variance (ANOVA) and experimental designs, including the completely randomized design (CRD), the randomized complete block design (RCBD), and the Latin square design (LSD). It gives the procedures for ANOVA calculations for one-way and two-way classifications and outlines the advantages and limitations of the different experimental designs. The key steps in the layout and analysis of a CRD are also demonstrated with an example.
1. BY
VISWANTH REDDY.S
DEPARTMENT OF PHARMACOLOGY
GOKARAJU RANGARAJU COLLEGE OF PHARMACY
2. Analysis of variance (ANOVA)
Experimental designs
CRD
RCBD
LSD
Applications of biostatistics
3. ANOVA is mainly employed to compare the means of three
or more samples, taking into account the variation within each
sample.
This statistical technique was first developed by R. A. Fisher
and was used extensively in agricultural experiments.
The analysis of variance is a method for estimating the
contribution made by each factor to the total
variation. The total variation splits into the following
two components:
1. Variation within the samples
2. Variation between the samples
4. There are two classifications for the analysis of variance.
When data are classified on the basis of one factor, the analysis is known as one-way
ANOVA.
When data are classified on the basis of two factors, it is known as two-way
ANOVA.
The technique of analysing variance is similar for one factor and for two factors.
However, in one-factor analysis the total variance is
divided into two parts only:
1. Variance between samples
2. Variance within the samples
The variance within the samples is the residual variance.
5. In two-factor analysis, the total variance is divided into
three parts, viz.:
Variance due to factor number one
Variance due to factor number two
Residual variance
PROCEDURE FOR CALCULATING THE F-STATISTIC:
The t-test is employed to compare the means of two samples.
The F-test is employed to compare the means of three or more samples. In this
case, the treatments are shown in columns and the replicates in rows.
We have to find out whether the variation between treatments and between
replicates is significant and, if so, at what level of significance. For this purpose
we calculate the F-statistic, which is a ratio of variances (mean squares). The detailed procedure
is as follows:
6.                      TREATMENTS
                    1       2       3       Row total
REPLICATES    1     X11     X21     X31     ∑XR1
              2     X12     X22     X32     ∑XR2
              3     X13     X23     X33     ∑XR3
Column total        ∑XC1    ∑XC2    ∑XC3    ∑X = GRAND TOTAL (G)
∑X² = ∑XC1² + ∑XC2² + ∑XC3² (sum of the squared observations) ---------- A
(∑X)²/nc = (∑XC1)²/nc1 + (∑XC2)²/nc2 + (∑XC3)²/nc3 ---------- B
(∑X)²/nr = (∑XR1)²/nr1 + (∑XR2)²/nr2 + (∑XR3)²/nr3 ---------- C
C.F. = (∑X)²/n = G²/n ---------- D
Now total sum of squares = A-D
Between treatments sum of squares = B-D
Between rows sum of squares = C-D
Residual sum of squares = (A-D)-[(B-D)+(C-D)]
7. SOURCE OF VARIATION   DEGREES OF FREEDOM (d.f.)   SUM OF SQUARES (SS)     MEAN SQUARES (MS)
BETWEEN TREATMENTS       c-1                         B-D                     (B-D)/(c-1)
BETWEEN ROWS             r-1                         C-D                     (C-D)/(r-1)
RESIDUAL                 (c-1)(r-1)                  (A-D)-[(B-D)+(C-D)]     {(A-D)-[(B-D)+(C-D)]}/[(c-1)(r-1)]
TOTAL                    cr-1                        A-D
8.                      TREATMENTS
                    1       2       3
REPLICATES    1     X11     X21     X31
              2     X12     X22     X32
              3     X13     X23     X33
Column total        ∑XC1    ∑XC2    ∑XC3    ∑X = GRAND TOTAL (G)
1. Find the total sum of squares: ∑X² = ∑XC1² + ∑XC2² + ∑XC3² ---------- A
2. Square each column total and divide it by the number
of observations in that column (nc1, nc2, nc3, etc.):
(∑X)²/nc = (∑XC1)²/nc1 + (∑XC2)²/nc2 + (∑XC3)²/nc3 ---------- B
9. 3. Find the grand total:
∑X = ∑XC1 + ∑XC2 + ∑XC3 = GRAND TOTAL (G)
4. Square the grand total and divide it by the total number of observations (n):
Correction factor, C.F. = (∑X)²/n or G²/n ---------- D
5. Calculate the F value:
F = BETWEEN TREATMENTS MEAN SQUARE / RESIDUAL MEAN SQUARE
SOURCE OF VARIATION   DEGREES OF FREEDOM (d.f.)   SUM OF SQUARES (SS)   MEAN SQUARES (MS)   F VALUE
BETWEEN TREATMENTS    c-1                         B-D                   (B-D)/(c-1)         [(B-D)/(c-1)] / [(A-B)/(c(r-1))]
RESIDUAL              c(r-1)                      A-B                   (A-B)/[c(r-1)]
TOTAL                 cr-1                        A-D
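The one-way steps above can be sketched in Python as follows. This is a minimal illustration, not part of the original slides: the function name `one_way_anova` and the data are hypothetical, while A, B and D follow the definitions given in the steps.

```python
def one_way_anova(columns):
    """One-way ANOVA from the quantities A, B and D defined in the steps.

    columns: one list of observations per treatment (column).
    """
    n = sum(len(col) for col in columns)            # total number of observations
    c = len(columns)                                # number of treatments
    grand_total = sum(sum(col) for col in columns)  # G

    A = sum(x * x for col in columns for x in col)        # sum of squared observations
    B = sum(sum(col) ** 2 / len(col) for col in columns)  # squared column totals / column size
    D = grand_total ** 2 / n                              # correction factor

    ms_treat = (B - D) / (c - 1)  # between-treatments mean square
    ms_resid = (A - B) / (n - c)  # residual mean square; n - c = c(r-1) when every column has r values
    return {
        "ss_treat": B - D,
        "ss_resid": A - B,
        "ss_total": A - D,
        "F": ms_treat / ms_resid,
    }


# Hypothetical data: 3 treatments with 3 replicates each
result = one_way_anova([[4, 5, 6], [7, 8, 9], [10, 11, 12]])
print(result)
```

For these made-up numbers the sums of squares are 54 (treatments), 6 (residual) and 60 (total), giving F = 27 on 2 and 6 degrees of freedom.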
10. In one-way classification we studied the influence of one factor. However,
in two-way classification we study the influence of two factors.
In such cases, data are classified on the basis of two criteria. For example, the
yield of different varieties of wheat may be affected by the application of
different fertilizers.
Therefore analysis of variance can be used to test the effects of these two
factors simultaneously.
The calculation in two-factor analysis is more or less the same, with the addition
of the calculation based on rows.
In one-way classification only columns are taken into consideration. However, in
two-way analysis both columns and rows are considered.
11.                     TREATMENTS
                    1       2       3       Row total
REPLICATES    1     X11     X21     X31     ∑XR1
              2     X12     X22     X32     ∑XR2
              3     X13     X23     X33     ∑XR3
Column total        ∑XC1    ∑XC2    ∑XC3    ∑X = GRAND TOTAL (G)
∑X² = ∑XC1² + ∑XC2² + ∑XC3² (sum of the squared observations) ---------- A
(∑X)²/nc = (∑XC1)²/nc1 + (∑XC2)²/nc2 + (∑XC3)²/nc3 ---------- B
(∑X)²/nr = (∑XR1)²/nr1 + (∑XR2)²/nr2 + (∑XR3)²/nr3 ---------- C
C.F. = (∑X)²/n = G²/n ---------- D
Now total sum of squares = A-D
Between treatments sum of squares = B-D
Between rows sum of squares = C-D
Residual sum of squares = (A-D)-[(B-D)+(C-D)]
12. SOURCE OF VARIATION   DEGREES OF FREEDOM (d.f.)   SUM OF SQUARES (SS)     MEAN SQUARES (MS)                     F VALUE
BETWEEN TREATMENTS        c-1                         B-D                     (B-D)/(c-1)                           [(B-D)/(c-1)] / RESIDUAL MS
BETWEEN ROWS              r-1                         C-D                     (C-D)/(r-1)                           [(C-D)/(r-1)] / RESIDUAL MS
RESIDUAL                  (c-1)(r-1)                  (A-D)-[(B-D)+(C-D)]     {(A-D)-[(B-D)+(C-D)]}/[(c-1)(r-1)]
TOTAL                     cr-1                        A-D
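As a rough sketch of the two-way calculation, the quantities A, B, C and D can be computed directly from a table with replicates in rows and treatments in columns. The function name `two_way_anova` and the data are hypothetical, and the sketch assumes one observation per cell.

```python
def two_way_anova(table):
    """Two-way ANOVA (one observation per cell) from A, B, C and D.

    table: list of rows (replicates); each row holds one value per treatment.
    """
    r = len(table)      # number of rows (replicates)
    c = len(table[0])   # number of columns (treatments)
    n = r * c

    row_totals = [sum(row) for row in table]
    col_totals = [sum(row[j] for row in table) for j in range(c)]
    grand_total = sum(row_totals)

    A = sum(x * x for row in table for x in row)  # sum of squared observations
    B = sum(t * t for t in col_totals) / r        # each column holds r observations
    C = sum(t * t for t in row_totals) / c        # each row holds c observations
    D = grand_total ** 2 / n                      # correction factor

    ss_treat, ss_rows, ss_total = B - D, C - D, A - D
    ss_resid = ss_total - (ss_treat + ss_rows)
    ms_resid = ss_resid / ((c - 1) * (r - 1))
    return {
        "ss_treat": ss_treat,
        "ss_rows": ss_rows,
        "ss_resid": ss_resid,
        "F_treat": (ss_treat / (c - 1)) / ms_resid,
        "F_rows": (ss_rows / (r - 1)) / ms_resid,
    }


# Hypothetical 3 x 3 table: rows are replicates, columns are treatments
anova = two_way_anova([[4, 7, 10], [5, 8, 12], [6, 9, 11]])
print(anova)
```

With these made-up values the F ratio is 81 for treatments and 7 for rows, each tested against the residual mean square with (c-1)(r-1) = 4 degrees of freedom.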
13.
14. A statistical design is a plan for the collection and analysis of
data.
It mainly deals with the following parameters.
However, the selection of an efficient design requires careful
planning in advance of both data collection and analysis.
A B D A A B D C
C D B C C D B A
B A D C B A D C
15. Randomization:
To eliminate bias
To ensure independence among observations
Required for valid significance tests and interval estimates
Example without randomization (fertility within each pair of plots: Low, High):
Old New Old New Old New Old New
In each pair of plots, although replicated, the new variety is
consistently assigned to the plot with the higher fertility level.
16. Replication: the repetition of a treatment in an experiment
A B D A
C D B C
B A D C
17. Ex:
A physician wants to know whether a
newly invented drug will be
beneficial in the treatment of a particular disease.
A farmer wants to know whether a new type of
fertilizer will give him better yields. He will frame
his investigation in terms of some suitable
hypothesis.
There are many types of experimental designs;
the most important are as follows:
18. DEPT OF PHARMACOLOGY
Completely randomized design (CRD)
Randomized complete block design (RCBD)
Latin square design (LSD)
19. Completely randomized design (CRD):
The treatments are assigned completely
at random, so that each experimental unit has the
same chance of receiving any one treatment.
This design is suitable only when the experimental material
is homogeneous (ex: laboratory experiments,
greenhouse studies, etc.).
It is not suitable for heterogeneous material (ex: field
experiments).
20. Advantages:
Simple and easy
Provides the maximum number of degrees of freedom for error
Disadvantages:
Only suitable for a small number of treatments and for
homogeneous experimental material.
Low precision if the plots are not uniform
A B D A
C D B C
B A D C
21. Simplest and least restrictive
Every plot is equally likely to be assigned to
any treatment
A B D A
C D B C
B A D C
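22. We have an experiment to test three varieties,
the top lines from Oregon, Washington, and
Idaho, to find which grows best in our area:
t=3, r=4
The 12 plots are numbered 1 to 12. Variety A is assigned to the four plots whose
numbers are drawn at random (here 1, 12, 6 and 5):
1 (A)    2        3        4
5 (A)    6 (A)    7        8
9        10       11       12 (A)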
23. Layout of CRD:
The step-by-step procedure for randomization and layout of a
CRD is given for a field experiment with five treatments and
four replications.
Determine the total number of experimental units (n) as the
product of the number of treatments (t) and the number of replications (r):
n = r × t = 4 × 5 = 20
The entire experimental material is divided into n
experimental units.
Ex: with five treatments and four replications we need 20
experimental units. The 20 units are numbered as follows:
24. 1 2 3 4 5
6 7 8 9 10
11 12 13 14 15
16 17 18 19 20
Assign the treatments to the experimental units using three-digit random
numbers selected from a random number table.
The random numbers are written down in the order drawn and then ranked: the
lowest random number gets rank 1 and the highest gets the largest rank.
These ranks correspond to the unit numbers.
The first set of r units is allotted to treatment T1,
the next set of r units to treatment T2,
the next set of r units to T3, and so on.
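The ranking procedure described above can be sketched in Python. This is an illustration only: the function name `crd_layout` is hypothetical, and Python's `random` module stands in for the random number table, so the layout it produces will not match the one shown in the slides.

```python
import random


def crd_layout(treatments, r, seed=None):
    """Assign each treatment to r experimental units by ranking random numbers.

    Returns a dict mapping unit number (1..n) to a treatment label.
    """
    rng = random.Random(seed)
    n = len(treatments) * r

    # n distinct three-digit numbers (000-999), standing in for a random number table
    numbers = rng.sample(range(1000), n)

    # Rank 1 goes to the smallest number; the rank of each number is a unit number
    ordered = sorted(numbers)
    ranks = [ordered.index(x) + 1 for x in numbers]

    # The first r ranks drawn receive the first treatment, the next r the second, and so on
    layout = {}
    for position, unit in enumerate(ranks):
        layout[unit] = treatments[position // r]
    return layout


# Five treatments, four replications, 20 units (seed chosen arbitrarily)
layout = crd_layout(["T1", "T2", "T3", "T4", "T5"], r=4, seed=1)
for unit in sorted(layout):
    print(unit, layout[unit])
```

Sampling without replacement avoids tied ranks, which a manual procedure would handle by discarding a repeated random number and drawing again.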
26.
Final layout:
1 2 3 4 5
T3 T1 T5 T2 T5
6 7 8 9 10
T4 T1 T3 T4 T4
11 12 13 14 15
T5 T4 T2 T3 T1
16 17 18 19 20
T3 T1 T2 T2 T5
27. Analysis of variance:
There are two sources of variation among these
observations obtained from a CRD trial.
1. Treatment variation
2. Experimental error
The relative size of the two is used to indicate
whether the observed differences among the
treatments are real or due to chance.
28. Calculations:
1. Correction factor (C.F.) = (GT)²/n
2. Total sum of squares (Total SS) = ∑X² - C.F.
3. Treatment sum of squares (TSS) = ∑(T²/r) - C.F., where T is a treatment total and r the number of replications
4. Error sum of squares (ESS) = Total SS - TSS
These results are summarized in the ANOVA table, and the mean squares
and F are calculated.
ANOVA table:
Source of variation   df     SS         MS                  F
Treatments            t-1    TSS        TMS = TSS/(t-1)     TMS/EMS
Error                 n-t    ESS        EMS = ESS/(n-t)
Total                 n-1    Total SS
29.
30. The randomized complete block design (RCBD) is one of the most widely used experimental designs in agricultural
research.
The design is also used extensively in
biology, medicine, the social sciences and business
research.
The experimental material is grouped into homogeneous
subgroups; each subgroup is commonly termed a
block. Since each block contains the entire set of
treatments, a block is equivalent to a replication.
31. Ex: in field experiments, soil fertility is an important
characteristic that influences crop response.
When the treatments are applied at random to relatively
homogeneous units within each block and replicated over all
the blocks, the design is known as a randomized block design (RBD).
The design divides the experimental units into n homogeneous
groups of size t.
These homogeneous groups are called blocks.
The treatments are then randomly assigned to the
experimental units in each block - one treatment to a unit in
each block.
32. Advantages and disadvantages of RCBD:
Advantages of RCBD:
This design has been shown to be more efficient or accurate than CRD for
most types of experimental work. Removing the between-blocks SS from the
residual SS usually results in a decrease in the error mean square.
Flexibility is another advantage of RCBD: a fairly large number of treatments can
be included in this design, subject to the limitation below.
Disadvantages of RCBD:
Not suitable for a very large number of treatments, because if the block size is
large it may be difficult to maintain homogeneity within blocks.
Consequently the error will increase.
33. Layout of RCBD:
Let us consider an experiment to be conducted on 4
blocks of land, each having 5 plots. We take into
consideration five treatments, each replicated 4 times. We
divide the whole experimental area into 4 relatively
homogeneous blocks and each block into five plots or units.
Treatments are allocated at random to the units of a block.
                  PLOTS
             1   2   3   4   5
BLOCKS   1   A   E   B   D   C
         2   E   D   C   B   A
         3   C   B   A   E   D
         4   A   D   E   C   B
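A layout of this kind can be generated with a short Python sketch in which every block receives its own independent random ordering of the full treatment set. The function name `rcbd_layout` is hypothetical, and `random.shuffle` replaces the random number table, so the result will differ from the layout above.

```python
import random


def rcbd_layout(treatments, n_blocks, seed=None):
    """Return one independently shuffled copy of the treatment list per block."""
    rng = random.Random(seed)
    layout = []
    for _ in range(n_blocks):
        block = list(treatments)  # every block contains the complete set of treatments
        rng.shuffle(block)        # randomize the order of plots within this block only
        layout.append(block)
    return layout


# Five treatments in four blocks (seed chosen arbitrarily)
blocks = rcbd_layout(["A", "B", "C", "D", "E"], n_blocks=4, seed=7)
for number, block in enumerate(blocks, start=1):
    print("Block", number, " ".join(block))
```

Randomizing separately within each block is what distinguishes this from a CRD, where all units are randomized together.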
34. The ANOVA table for a randomized block experiment
Source of variation   d.f.          S.S.       M.S.                 F
Treatments            t-1           SST        SST/(t-1)            [SST/(t-1)] / [SSE/((t-1)(r-1))]
Blocks                r-1           SSB        SSB/(r-1)            [SSB/(r-1)] / [SSE/((t-1)(r-1))]
Error                 (t-1)(r-1)    SSE        SSE/[(t-1)(r-1)]
Total                 rt-1          Total SS
35. By comparing the variance ratio of treatments with the
critical value of F, we can find out whether the different treatments
are significantly different.
This conclusion holds irrespective of the differences on
account of blocks.
Ex:
36.
37. A Latin square experiment is assumed to be a three-factor
experiment.
The factors are rows, columns and treatments.
It is assumed that there is no interaction between rows,
columns and treatments.
The degrees of freedom for the interactions are used to estimate
error.
Latin square designs differ from randomized complete block designs in that the
experimental units are grouped in blocks in two different ways,
that is, by rows and by columns.
A requirement of the Latin square is that the number of
treatments, the number of rows (replications) and the number of columns must be
equal; therefore, the total number of experimental units must
be a perfect square. For example, if there are 4 treatments, there must be 4 rows and 4 columns, giving 16 experimental units.
38. Latin Square Designs
Selected Latin Squares
3 x 3:
ABC
BCA
CAB
4 x 4:
ABCD   ABCD   ABCD   ABCD
BADC   BCDA   BDAC   BADC
CDBA   CDAB   CADB   CDAB
DCAB   DABC   DCBA   DCBA
5 x 5:
ABCDE
BAECD
CDAEB
DEBAC
ECDBA
6 x 6:
ABCDEF
BFDCAE
CDEFBA
DAFECB
ECABFD
FEBADC
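The defining property of these squares, each treatment appearing exactly once in every row and every column, can be checked with a small Python helper. The name `is_latin_square` is hypothetical and the sketch is only an illustration.

```python
def is_latin_square(rows):
    """Return True if every row and every column contains each symbol exactly once."""
    n = len(rows)
    symbols = set(rows[0])
    if len(symbols) != n:
        return False  # the first row must hold n distinct treatments
    if any(len(row) != n or set(row) != symbols for row in rows):
        return False  # every row must be a permutation of the same treatments
    # every column must also be a permutation of the treatments
    return all({row[j] for row in rows} == symbols for j in range(n))


square_3 = ["ABC", "BCA", "CAB"]
square_5 = ["ABCDE", "BAECD", "CDAEB", "DEBAC", "ECDBA"]
not_latin = ["ABC", "BCA", "ABC"]  # treatment A repeats in the first column

print(is_latin_square(square_3), is_latin_square(square_5), is_latin_square(not_latin))
```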
39. The layout of an LSD is shown below for an experiment with five treatments
A, B, C, D, E. The 5×5 LSD plan is as follows.
A B C D E
B A E C D
C D A E B
D E B A C
E C D B A
The plan is then randomized with the help of a table of
random numbers. For this, select five three-digit random numbers and rank them
from lowest (rank 1) to highest (rank 5).
Random number   Sequence   Rank
628             1          3
846             2          4
475             3          2
902             4          5
452             5          1
40. Now use the rank to represent the existing row number of the selected plan
and the sequence to represent the row number of the new plan.
Thus the third row of the selected plan (rank = 3) becomes the
first row (sequence = 1), and so on.
C D A E B
D E B A C
B A E C D
E C D B A
A B C D E
The columns are randomized in the same way, using the same
procedure as for the rows. The five random numbers selected are
as follows:
Random number   Sequence   Rank
792             1          4
032             2          1
947             3          5
293             4          3
196             5          2
41. The rank is now used to represent the column number of
the plan obtained above, and the sequence is used to represent
the column number of the final plan.
In this way, the fourth column of the above plan becomes the first
column of the final plan. Likewise, the first column becomes second, the fifth becomes
third, the third becomes fourth and the second becomes fifth. The final plan,
which becomes the layout of the design, is as follows:
Row number   1   2   3   4   5
1            E   C   B   A   D
2            A   D   C   B   E
3            C   B   D   E   A
4            B   E   A   D   C
5            D   A   E   C   B
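The row and column rearrangement can be reproduced with a short Python sketch that uses the same ten random numbers as the example. The helper names `ranks` and `randomize_square` are hypothetical.

```python
def ranks(numbers):
    """Rank distinct numbers from 1 (smallest) to len(numbers) (largest)."""
    ordered = sorted(numbers)
    return [ordered.index(x) + 1 for x in numbers]


def randomize_square(plan, row_numbers, col_numbers):
    """Rearrange the rows, then the columns, of a Latin square plan.

    The rank of the k-th random number is the existing row (or column)
    that moves to position k in the new plan.
    """
    rows = [plan[rank - 1] for rank in ranks(row_numbers)]
    col_ranks = ranks(col_numbers)
    return ["".join(row[rank - 1] for rank in col_ranks) for row in rows]


plan = ["ABCDE", "BAECD", "CDAEB", "DEBAC", "ECDBA"]
# 032 from the table is written as 32, because Python does not allow leading zeros
final_plan = randomize_square(plan, [628, 846, 475, 902, 452], [792, 32, 947, 293, 196])
for row in final_plan:
    print(" ".join(row))
```

The printed rows match the final plan above (E C B A D, A D C B E, C B D E A, B E A D C, D A E C B), and the result is still a Latin square because permuting whole rows and whole columns preserves the property.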
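42. ANALYSIS OF VARIANCE FOR LSD:
Here n is the number of treatments (= rows = columns), so there are n² observations;
R, C and T are the row, column and treatment totals.
C.F. = (GT)²/n²
Total SS = ∑X² - C.F.
Row SS = (1/n) ∑R² - C.F.
Column SS = (1/n) ∑C² - C.F.
Treatment SS = (1/n) ∑T² - C.F.
Error SS = Total SS - Row SS - Column SS - Treatment SS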
43. The ANOVA table for a Latin square experiment
Source   d.f.          SS         M.S.   F
Treat    n-1           TSS        TMS    TMS/EMS
Rows     n-1           RSS        RMS    RMS/EMS
Cols     n-1           CSS        CMS    CMS/EMS
Error    (n-1)(n-2)    ESS        EMS
Total    n²-1          Total SS
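A minimal Python sketch of the Latin square analysis follows. The function name `latin_square_anova`, the 3 x 3 layout and the observations are all hypothetical; n is taken as the number of treatments, so the correction factor divides by the n² observations.

```python
def latin_square_anova(layout, data):
    """Sums of squares and F ratios for an n x n Latin square.

    layout: n strings of treatment labels; data: n lists of observations
    in the same row and column positions.
    """
    n = len(layout)
    grand_total = sum(sum(row) for row in data)
    cf = grand_total ** 2 / n ** 2  # correction factor over n*n observations

    total_ss = sum(x * x for row in data for x in row) - cf
    row_ss = sum(sum(row) ** 2 for row in data) / n - cf
    col_ss = sum(sum(row[j] for row in data) ** 2 for j in range(n)) / n - cf

    # Accumulate treatment totals from the positions given by the layout
    treat_totals = {}
    for i in range(n):
        for j in range(n):
            label = layout[i][j]
            treat_totals[label] = treat_totals.get(label, 0) + data[i][j]
    treat_ss = sum(t * t for t in treat_totals.values()) / n - cf

    error_ss = total_ss - row_ss - col_ss - treat_ss
    ems = error_ss / ((n - 1) * (n - 2))  # error mean square
    return {
        "treat_ss": treat_ss,
        "row_ss": row_ss,
        "col_ss": col_ss,
        "error_ss": error_ss,
        "F_treat": (treat_ss / (n - 1)) / ems,
        "F_rows": (row_ss / (n - 1)) / ems,
        "F_cols": (col_ss / (n - 1)) / ems,
    }


# Hypothetical 3 x 3 experiment
lsd = latin_square_anova(
    ["ABC", "BCA", "CAB"],
    [[11, 12, 14], [13, 15, 11], [16, 12, 14]],
)
print(lsd)
```

A 3 x 3 square leaves only (n-1)(n-2) = 2 degrees of freedom for error, which illustrates why small Latin squares give a weak estimate of error.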
44. Advantages
Controls more variation than CRD or RCBD
because of two-way stratification. This results
in a smaller mean square for error.
Simple analysis of data
Analysis is simple even with missing plots.
Disadvantages
The number of treatments is limited to the number of
replicates, which seldom exceeds 10.
With fewer than 5 treatments, the df for controlling
random variation is relatively large and the df for
error is small.
45. Applications of biostatistics in pharmacy:
Public health, including epidemiology, health services research, nutrition,
environmental health and healthcare policy & management.
Design and analysis of clinical trials in medicine
Population genetics, and statistical genetics in order to link variation in genotype with a
variation in phenotype. This has been used in agriculture to improve crops and farm
animals (animal breeding). In biomedical research, this work can assist in finding
candidates for gene alleles that can cause or influence predisposition to disease in
human genetics
Analysis of genomics data, for example from microarray or proteomics experiments, often
concerning diseases or disease stages.
Ecology, ecological forecasting
Biological sequence analysis
Systems biology for gene network inference or pathways analysis
Statistical methods are beginning to be integrated into medical informatics, public health
informatics, bioinformatics and computational biology.
46. Does the new treatment / new diagnostic / new
vaccine work or not?
Ideally a clinical trial should include all patients. Is that practically
possible? No. We test the new treatment / new diagnostic /
new vaccine on a representative sample of the population.
Statistics allows us to draw conclusions about the likely effect
on the population using data from the sample.
BUT ALWAYS REMEMBER…
Statistics can never PROVE or DISPROVE a hypothesis; it only suggests accepting
or rejecting the hypothesis on the basis of the available evidence.