3. 1.1 Introduction
Fundamental Points
Clinical trials should have sufficient
statistical power to detect differences
between groups considered to be of
clinical interest. Therefore, calculation
of sample size with provision for
adequate levels of significance and power
is an essential part of planning.
4. Five Key Questions Regarding the Sample
Size
What is the main purpose of the trial?
What is the principal measure of patient
outcome?
How will the data be analyzed to detect a
treatment difference? (The test statistic: t-test,
χ² or CI.)
What type of results does one anticipate with
standard treatment?
H0 and HA: How small a treatment difference is
it important to detect, and with what degree of
certainty? (δ, α and β.)
How to deal with treatment withdrawals and
protocol violations? (Data set used.)
5. SSC: Only an Estimate
Parameters used in calculation are
estimates with uncertainty and are often
based on very small prior studies
Population may be different
Publication bias--overly optimistic
Different inclusion and exclusion
criteria
Mathematical models are approximations
6. What should be in the protocol?
Sample size justification
Methods of calculation
Quantities used in calculation:
• Variances
• mean values
• response rates
• difference to be detected
7. Realistic and Conservative
Overestimated size:
unfeasible
early termination
Underestimated size
justify an increase
extension in follow-up
incorrect conclusion (WORSE)
8. What is α (Type I error)?
The probability of erroneously
rejecting the null hypothesis
(Put a useless medicine on the
market!)
9. What is β (Type II error)?
The probability of erroneously failing
to reject the null hypothesis.
(Keep a good medicine away from
patients!)
10. What is Power?
Power quantifies the ability of the
study to find true differences of
various values of δ.
Power = 1 − β = P(accept HA | HA is true)
----the chance of correctly identifying HA
(correctly identifying a better medicine)
11. What is δ?
δ is the minimum difference between groups
that is judged to be clinically important
Minimal effect which has clinical relevance in
the management of patients
The anticipated effect of the new treatment
(larger)
12. The Choice of α and β depends on:
the medical and practical consequences of the two
kinds of errors
prior plausibility of the hypothesis
the desired impact of the results
13. The Choice of α and β
α=0.10 and β=0.2 for preliminary trials
that are likely to be replicated.
α=0.01 and β=0.05 for trials that are
unlikely to be replicated.
α=β if both test and control treatments are
new, about equal in cost, and there are
good reasons to consider them both
relatively safe.
14. The Choice of α and β
α>β if there is no established control treatment
and test treatment is relatively inexpensive, easy
to apply and is not known to have any serious side
effects.
α<β (the most common approach: 0.05 and
0.2) if the control treatment is already widely used
and is known to be reasonably safe and effective,
whereas the test treatment is new, costly, and
produces serious side effects.
15. 1.2 SSC for Continuous Outcome
Variables
H0: δ = µC − µI = 0
HA: δ = µC − µI ≠ 0
If the variance σ² is known, the test statistic is
$z = \frac{\bar{x}_C - \bar{x}_I}{\sigma \sqrt{1/N_C + 1/N_I}}$
If |z| > Zα, H0 will be rejected at the
α level of significance.
16. A total sample of 2N would be needed to
detect a true difference δ between µI and µC
with power (1−β) and significance level α, by
the formula:
$2N = \frac{4 (Z_\alpha + Z_\beta)^2 \sigma^2}{\delta^2}$
17. Example 1
An investigator wishes to estimate the sample size
necessary to detect a 10 mg/dl difference in
cholesterol level in a diet intervention group
compared to the control group. The standard deviation
from other data is estimated to be 50 mg/dl, that is,
a variance of (50 mg/dl)². For a two-sided 5%
significance level, Zα=1.96, and for 90%
power, Zβ=1.282.
2N = 4(1.96+1.282)²(50)²/10² ≈ 1050
18. Example 1a
Baseline Adjustment
An investigator interested in the mean levels of
change might want to test whether diet
intervention lowers serum cholesterol from
baseline levels when compared with a control.
H0: ΔC − ΔI = 0
HA: ΔC − ΔI ≠ 0
σ = 20 mg/dl, δ = 10 mg/dl
2N = 4(1.96+1.282)²(20)²/10² ≈ 170
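The arithmetic in Example 1 and Example 1a can be checked with a few lines of Python. This is only a sketch of the formula 2N = 4(Zα+Zβ)²σ²/δ² from the slides; the function name `total_sample_size` is made up for illustration.

```python
def total_sample_size(z_alpha, z_beta, sigma, delta):
    """Total sample size 2N = 4 (Z_alpha + Z_beta)^2 sigma^2 / delta^2."""
    return 4 * (z_alpha + z_beta) ** 2 * sigma ** 2 / delta ** 2

# Example 1: sigma = 50 mg/dl, delta = 10 mg/dl, two-sided 5% level, 90% power
example_1 = total_sample_size(1.96, 1.282, sigma=50, delta=10)

# Example 1a: change from baseline, sigma = 20 mg/dl, same delta
example_1a = total_sample_size(1.96, 1.282, sigma=20, delta=10)

print(round(example_1), round(example_1a))
```

The unrounded values are about 1051 and 168, which the slides report as 1050 and 170 (85 per group).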
19. A Professional Statement
A sample size of 85 in each group will
have 90% power to detect a difference
in means of 10.0 assuming that the
common standard deviation is 20.0
using a two-group t-test with a 0.05
two-sided significance level.
20. Values of f(α,β) to be used in formula
for sample size calculation
f(α,β) = (Zα + Zβ)²
                        β (Type II error)
α (Type I error)    0.05    0.1     0.2     0.5
0.1                 10.8    8.6     6.2     2.7
0.05                13.0    10.5    7.9     3.8
0.02                15.8    13.0    10.0    5.4
0.01                17.8    14.9    11.7    6.6
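The tabulated values of f(α,β) can be reproduced approximately from normal quantiles. The sketch below assumes, as the earlier examples do, that Zα is the two-sided critical value and Zβ the one-sided one; it uses only the Python standard library, and the function name `f` simply mirrors the slide's notation.

```python
from statistics import NormalDist

def f(alpha, beta):
    """(Z_alpha + Z_beta)^2 with a two-sided alpha and a one-sided beta."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(1 - beta)
    return (z_alpha + z_beta) ** 2

# One printed row per alpha, columns for beta = 0.05, 0.1, 0.2, 0.5
for alpha in (0.1, 0.05, 0.02, 0.01):
    print(alpha, [round(f(alpha, beta), 1) for beta in (0.05, 0.1, 0.2, 0.5)])
```

The computed values agree with the table to within about 0.1; small differences come from rounding of the quantiles in the printed table.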
21. 1.3 SSC for a Binary Outcome
Two independent samples
$Z = \frac{p_C - p_I}{\sqrt{\bar{p}(1-\bar{p})(1/N_C + 1/N_I)}}$
$\bar{p} = (r_I + r_C)/(N_I + N_C)$
(rI, rC: numbers of events in each group)
22. $\bar{p} = (p_C + p_I)/2$
$2N = \frac{4 (Z_\alpha + Z_\beta)^2 \bar{p}(1-\bar{p})}{(p_C - p_I)^2}$
23. Example 2
Suppose the annual event rate in the
control group is anticipated to be 20%. The
investigator hopes that the intervention
will reduce the annual rate to 15%. The
study is planned so that each participant
will be followed for 2 years. Therefore, if
the assumptions are accurate,
approximately 40% of the participants in
the control group and 30% of the
participants in the intervention group will
develop an event.
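Plugging the two-year event rates from Example 2 into the binary-outcome formula 2N = 4(Zα+Zβ)² p̄(1−p̄)/(pC−pI)² gives the total sample size. The sketch below assumes the same Zα=1.96 and Zβ=1.282 used in Example 1 (two-sided 5% level, 90% power); the function name is illustrative.

```python
def total_sample_size_binary(p_c, p_i, z_alpha=1.96, z_beta=1.282):
    """Total sample size 2N for comparing two independent proportions."""
    p_bar = (p_c + p_i) / 2  # average event rate across the two groups
    return 4 * (z_alpha + z_beta) ** 2 * p_bar * (1 - p_bar) / (p_c - p_i) ** 2

# Control 40%, intervention 30% over the 2-year follow-up
two_n = total_sample_size_binary(0.40, 0.30)
print(round(two_n))
```

The result is about 956, which matches the 960 (480 per group) tabulated for pC=0.40 and pI=0.30 after rounding up.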
25. A Professional Statement
A two-group χ² test with a 0.05 two-
sided significance level will have 90%
power to detect the difference between a
Group 1 proportion, P1, of 0.40 and a
Group 2 proportion, P2, of 0.30 (odds
ratio of 0.643) when the sample size in
each group is 480.
26. Table 1.3 Approximate total sample size for comparing
various proportions in two groups with significance level (α)
of 0.05 and power (1-β) of 0.8 and 0.9
True proportions                  α=0.05 (one-sided)     α=0.05 (two-sided)
pC           pI                   1-β=0.90   1-β=0.80    1-β=0.90   1-β=0.80
(Control)    (Intervention)
0.60         0.50                 850        610         1040       780
             0.40                 210        160         260        200
             0.30                 90         70          120        90
             0.20                 50         40          60         50
0.50         0.40                 850        610         1040       780
             0.30                 210        150         250        190
             0.25                 130        90          160        120
             0.20                 90         60          110        80
0.40         0.30                 780        560         960        720
             0.25                 330        240         410        310
             0.20                 180        130         220        170
0.30         0.20                 640        470         790        590
             0.15                 270        190         330        250
             0.10                 140        100         170        130
0.20         0.15                 1980       1430        2430       1810
             0.10                 440        320         540        400
             0.05                 170        120         200        150
0.10         0.05                 950        690         1170       870
27. From Table 1.3 You can see:
δ↑→N↓
The power 1- β↑→N ↑
The α↓→N ↑
28. Paired Binary Outcome
McNemar’s test
$N_p = \frac{(Z_\alpha + Z_\beta)^2 f}{d^2}$
Np = the number of pairs
d = difference in the proportion of successes
(d = pI − pC)
f = the proportion of participants whose response is
discordant (the pair of outcomes are not the
same)
29. Example 3
Consider an eye study where one eye
is treated for loss in visual acuity by a
new laser procedure and the other
eye is treated by standard therapy.
The failure rate on the control, pC, is
estimated to be 0.4, and the new
procedure is projected to reduce the
failure rate to 0.20. The discordant
rate f is assumed to be 0.50.
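Example 3 can be worked through with the paired-outcome formula Np = (Zα+Zβ)² f / d². The sketch below assumes Zα=1.96 and Zβ=1.282 (two-sided 5% level, 90% power), which the slide does not state for this example; the function name is illustrative.

```python
import math

def paired_sample_size(d, f, z_alpha=1.96, z_beta=1.282):
    """Number of pairs Np = (Z_alpha + Z_beta)^2 * f / d^2."""
    return (z_alpha + z_beta) ** 2 * f / d ** 2

# Failure rate falls from 0.40 (control eye) to 0.20 (treated eye); f = 0.50
n_pairs = paired_sample_size(d=0.20 - 0.40, f=0.50)
print(math.ceil(n_pairs))  # number of pairs, rounded up
```

Under these assumptions the formula gives about 131.4, so 132 pairs of eyes (132 participants) would be needed.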
31. 1.4 Adjusting for Non-adherence
RO = drop-out rate
RI = drop-in rate
$N^* = N / (1 - R_O - R_I)^2$
If RO=0.20, RI=0.05
N* = 1.78N
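The inflation factor for non-adherence is easy to tabulate for other drop-out and drop-in rates. A minimal sketch of N* = N/(1−RO−RI)², with an illustrative function name:

```python
def inflate_for_nonadherence(n, drop_out, drop_in):
    """Adjusted sample size N* = N / (1 - R_O - R_I)^2."""
    return n / (1 - drop_out - drop_in) ** 2

# With N = 1 the result is the inflation factor itself
factor = inflate_for_nonadherence(1, drop_out=0.20, drop_in=0.05)
print(round(factor, 2))
```

For RO=0.20 and RI=0.05 the factor is 1/0.75², about 1.78, as on the slide.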
32. 1.5 Adjusting for Multiple Comparisons
α’ = α/k
k = the number of multiple comparison
variables
33. Table 1.4 Adjusting for Randomization Ratio
Randomization Ratio Increase in total N
1:1 0
1:2 +12.5%
1:3 +33%
1:4 +56%
1:5 +80%
1:6 +100%
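The increases in Table 1.4 follow from a standard result that the slide does not spell out, so treat it as an assumption here: for a 1:r allocation with equal variances, the total N needed for the same power is (1+r)²/(4r) times the total for 1:1. A short sketch:

```python
def relative_increase(r):
    """Relative increase in total N for 1:r allocation versus 1:1 at equal power."""
    return (1 + r) ** 2 / (4 * r) - 1

for r in range(1, 7):
    print(f"1:{r}  +{relative_increase(r):.1%}")
```

This reproduces the table's +12.5%, +33%, +56% and +80%; for 1:6 the formula gives about +104%, which the table rounds to +100%.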
34. 1.6 Adjusting for Loss to Follow-up
If p is the proportion of subjects lost to
follow-up, the number of subjects must be
increased by a factor of 1/(1-p).
35. 1.7 Other Factors:
the rate of attrition of subjects during
a trial
intermediate analyses
36. Sample size re-estimation
Event rates are lower than
anticipated
Variability is larger than expected
Without unblinding data and
making treatment comparisons
37. 1.8 Power Calculation
(assuming we compare two medicines)
Power Depends on 4 Elements:
The real difference between the two
medicines, δ
• Big δ⇒big power
The variation among individuals,σ
• Small σ⇒big power
The sample size, n
• Large n⇒big power
Type I error,α
• Large α ⇒big power
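The four dependencies listed above can be seen numerically with a normal-approximation power formula for comparing two means, power ≈ Φ(δ/(σ√(2/n)) − Zα). This is a sketch under the assumption of a two-sided z-test with equal group sizes and known σ; it ignores the negligible probability of rejecting in the wrong direction, and the function name is illustrative.

```python
import math
from statistics import NormalDist

def power_two_means(delta, sigma, n_per_group, alpha=0.05):
    """Approximate power of a two-sided z-test comparing two means."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    se = sigma * math.sqrt(2 / n_per_group)  # standard error of the difference
    return z.cdf(abs(delta) / se - z_alpha)

base = power_two_means(delta=10, sigma=20, n_per_group=85)
print(f"baseline            {base:.2f}")
print(f"bigger delta (15)   {power_two_means(15, 20, 85):.2f}")
print(f"bigger sigma (30)   {power_two_means(10, 30, 85):.2f}")
print(f"bigger n (120)      {power_two_means(10, 20, 120):.2f}")
print(f"bigger alpha (0.10) {power_two_means(10, 20, 85, alpha=0.10):.2f}")
```

The baseline uses the numbers from Example 1a (δ=10, σ=20, 85 per group) and comes out close to 90%.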
38. Sensitivity of the sample size
estimate
to a variety of deviations from these
assumptions
a power table
39. Table 1 Statistical Power of the Tanzania
Vitamin and HIV Infection Trial (N=960)
                                    Effect of B
               0%                   15%                  30%
               Loss to follow-up    Loss to follow-up    Loss to follow-up
Effect of A    0%    20%   33%      0%    20%   33%      0%    20%   33%
30%            89%   82%   74%      85%   76%   68%      79%   69%   61%
25%            75%   65%   58%      69%   59%   52%      62%   52%   45%
40. Example 4
Regret for Low Power Due to Small
Sample?
I have a set of data in which the mean change
between the 2 groups is significantly
different (p<0.05). But when I
calculate the power it gives only 50%.
How should I interpret this? Also, can
someone kindly advise whether it is
meaningful (or pointless) to calculate the
power when the result is statistically
significant?
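One way to look at the question is to compute the "observed" power directly from the p-value. The sketch below assumes a two-sided z-test and treats the observed effect as if it were the true effect, which is what a post-hoc power calculation does; the function name is illustrative.

```python
from statistics import NormalDist

def observed_power(p_value, alpha=0.05):
    """Post-hoc power of a two-sided z-test, taking the observed effect as true."""
    z = NormalDist()
    z_obs = z.inv_cdf(1 - p_value / 2)  # |z| implied by the p-value
    z_alpha = z.inv_cdf(1 - alpha / 2)
    return z.cdf(z_obs - z_alpha) + z.cdf(-z_obs - z_alpha)

for p in (0.05, 0.01, 0.001):
    print(p, round(observed_power(p), 2))
```

Under these assumptions a result with p just at 0.05 always has observed power of about 50%, so the computed power is a restatement of the p-value rather than new information.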
41. Books and Software
Sample size tables for clinical
studies (second edition)
By David Machin, Michael Campbell, Peter Fayers
and Alain Pinol
Blackwell Science 1997
PASS 2000 available in CCTER
nQuery 4.0 available in CCTER
43. Randomization
Definition:
randomization is a process by which each
participant has the same chance of being
assigned to either intervention or control.
44. Fundamental Point
Randomization tends to produce study
groups comparable with respect to known
and unknown risk factors, removes
investigator bias in the allocation of
participants, and guarantees that statistical
tests will have valid significance levels.
45. Two Types of Bias in Randomization
Selection bias
occurs if the allocation process is predictable. If any
bias exists as to what treatment particular types of
participants should receive, then a selection bias
might occur.
Accidental bias
can arise if the randomization procedure does not
achieve balance on risk factors or prognostic
covariates especially in small studies.
46. Fixed Allocation Randomization
Fixed allocation randomization procedures
assign the intervention to participants with
a pre-specified probability, usually equal,
and that allocation probability is not altered
as the study progresses
• Simple randomization
• Blocked randomization
• Stratified randomization
48. Simple Randomization
Option 1: to toss an unbiased coin for a randomized
trial with two treatments (call them A and B)
Option 2: to use a random digit table. A randomization
list may be generated by using the digits, one per
treatment assignment, starting with the top row and
working downwards:
Option 3: to use a random number-producing
algorithm, available on most digital computer systems.
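Option 3 can be sketched with Python's standard `random` module. This is an illustration only: the function name, the seed, and the list length are made up, and a real trial would use a validated, concealed allocation system.

```python
import random

def simple_randomization(n, arms=("A", "B"), seed=None):
    """Assign each of n participants to an arm independently with equal probability."""
    rng = random.Random(seed)  # a fixed seed makes the list reproducible
    return [rng.choice(arms) for _ in range(n)]

allocation = simple_randomization(20, seed=2024)
print("".join(allocation), allocation.count("A"), allocation.count("B"))
```

Because every assignment is independent, the two counts need not be equal, especially for short lists.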
49. Advantages
Each treatment assignment is completely
unpredictable, and probability theory
guarantees that in the long run the numbers
of patients on each treatment will not be
radically different. It is also easy to implement.
50. Disadvantages
Unequal groups
one treatment is assigned more often than
another
Time imbalance or chronological bias
One treatment is given with greater frequency
at the beginning of a trial and another with
greater frequency at the end of the trial.
Simple randomization is not often used, even for
large studies.
52. Blocked Randomization
(permuted block randomization)
Blocked randomization ensures exactly
equal treatment numbers at certain equally
spaced points in the sequence of patient
assignments
A table of random permutations is used
containing, in random order, all possible
combinations (permutations) of a small series of
figures.
Block size: 6,8,10,16,20.
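Instead of a printed table of random permutations, each block can be shuffled by computer. A minimal sketch of permuted-block randomization for two arms, assuming a fixed block size; the function name and seed are illustrative.

```python
import random

def blocked_randomization(n_blocks, block_size=6, arms=("A", "B"), seed=None):
    """Build an allocation list from randomly permuted, balanced blocks."""
    if block_size % len(arms) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")
    rng = random.Random(seed)
    sequence = []
    for _ in range(n_blocks):
        block = list(arms) * (block_size // len(arms))  # equal numbers per arm
        rng.shuffle(block)                              # random order within the block
        sequence.extend(block)
    return sequence

sequence = blocked_randomization(n_blocks=4, block_size=6, seed=1)
print("".join(sequence))
```

After every complete block of six the two arms have received exactly the same number of participants.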
53. Advantages
The balance between the number of
participants in each group is guaranteed
during the course of randomization. The
number in each group will never differ by
more than b/2, where b is the length of the
block.
54. Disadvantages
Analysis may be more complicated (in
theory)
Correct analysis could have bigger power
Varying the block size can keep the
randomization from becoming predictable
Mid-block inequality might occur if an interim
analysis is planned.
55. Randomization Types
Stratified randomization
[Figure: tree of strata. Geographic location (U.S., Europe) → previous exposure (Yes, No) → site (lymph, skin, breast)]
56. Stratified Randomization
The stratified randomization process involves
measuring the level of the selected factors for
participants, determining to which stratum each
belongs, and performing the randomization within
the stratum. Within each stratum, the
randomization process itself could be simple
randomization, but in practice most clinical trials
use some blocked randomization strategy.
57. Table 3. Stratification Factors and Levels
(3×2×3=18 Strata)
Age            Sex          Smoking history
1. 40-49 yr    1. Male      1. Current smoker
2. 50-59 yr    2. Female    2. Ex-smoker
3. 60-69 yr                 3. Never smoked
58. Table 4 Stratified Randomization with Block Size of Four
Strata   Age     Sex   Smoking   Group assignment
1        40-49   M     Current   ABBA BABA..
2        40-49   M     Ex        BABA BBAA..
3        40-49   M     Never     Etc.
4        40-49   F     Current
5        40-49   F     Ex
6        40-49   F     Never
7        50-59   M     Current
8        50-59   M     Ex
9        50-59   M     Never
10       50-59   F     Current
11       50-59   F     Ex
12       50-59   F     Never
etc.
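A list like Table 4 can be generated by running a separate blocked randomization inside each of the 18 strata. The sketch below assumes two arms (A, B) and a block size of four as in the table; the function name, seed, and number of blocks per stratum are illustrative.

```python
import itertools
import random

AGES = ("40-49", "50-59", "60-69")
SEXES = ("M", "F")
SMOKING = ("Current", "Ex", "Never")

def stratified_lists(blocks_per_stratum=2, block_size=4, seed=None):
    """Return {stratum: [block, ...]} with an independent blocked list per stratum."""
    rng = random.Random(seed)
    lists = {}
    for stratum in itertools.product(AGES, SEXES, SMOKING):
        blocks = []
        for _ in range(blocks_per_stratum):
            block = ["A", "B"] * (block_size // 2)  # balanced block
            rng.shuffle(block)
            blocks.append("".join(block))
        lists[stratum] = blocks
    return lists

lists = stratified_lists(seed=7)
for number, (stratum, blocks) in enumerate(lists.items(), start=1):
    print(number, *stratum, " ".join(blocks))
```

A new participant is first classified into a stratum and then given the next unused assignment from that stratum's list.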
59. Advantages
It makes the two study groups appear
comparable with regard to specified factors.
The power of the study can be increased by
taking the stratification into account in the
analysis.
60. Disadvantages
The prognostic factor used in stratified
randomization may be unimportant, and other
factors identified later may be of more
importance
61. Mechanism
Trial Type                        Mechanism
No central registration office    Randomization list, sealed envelopes
Double blind drug trial           Pharmacist will be involved
Multi-centre trial                Central registration office
Single-centre trial               Independent person responsible for patient
                                  registration and randomization
62. An Example of Stratified Randomization
Patients will be stratified according to the following
criteria:
1) Treatment center (Hospital A vs Hospital B vs
Hospital C)
2) N-stage(N2 vs N3)
3) T-stage (T1-2 vs T3-4)
63. What should be in the protocol?
A dynamic allocation scheme will be used to
randomize patients in equal proportions within
each of 12 strata. The scheme first creates time-
ordered blocks of size divisible by three and then
uses simple randomization to divide the patients
in each block into three treatment arms, in equal
proportion. The block sizes will be chosen
randomly so that each block contains either 6 or
9 patients.
64. Cont…
This procedure helps to ensure both
randomness and investigator blinding (the block
sizes are known only to the statistician), as
recommended by Freedman et al.
Randomization will be generated by the
consulting statistician in sealed envelopes,
labeled by stratum, which will be unsealed after
patient registration.
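The block scheme described in this protocol text (three arms, block sizes chosen at random as 6 or 9) could be generated per stratum along the following lines. This is a sketch, not the statistician's actual program: the arm labels, function name, and seed are hypothetical, and one such list would be produced for each of the 12 strata.

```python
import random

def variable_block_sequence(n_patients, arms=("A", "B", "C"), block_sizes=(6, 9), seed=None):
    """Allocation list built from balanced blocks whose size is picked at random."""
    rng = random.Random(seed)
    sequence = []
    while len(sequence) < n_patients:
        size = rng.choice(block_sizes)            # 6 or 9, both divisible by three
        block = list(arms) * (size // len(arms))  # equal numbers per arm
        rng.shuffle(block)
        sequence.extend(block)
    return sequence  # may run past n_patients so that the last block is complete

sequence = variable_block_sequence(30, seed=11)
print("".join(sequence))
```

Because the block boundaries are not known to investigators, the last assignments in a block are harder to guess than with a fixed block size.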
66. Biased Coin Method
Advantages
Investigators cannot determine the next
assignment by discovering the blocking
factor.
Disadvantages
Complexity in use
Statistical analysis cumbersome
67. Minimization
Minimization is a well-accepted statistical
method to limit imbalance in relatively small
randomized clinical trials in conditions with
known important prognostic baseline
characteristics.
It is called minimization because imbalance in
the distribution of prognostic factors is
minimized
68. Table 1 Some baseline characteristics of patients in a controlled trial
of mustine versus talc in the control of pleural effusions
in patients with breast cancer (Fentiman et al, 1983)
                                      Treatment
                                      Mustine (n=23)   Talc (n=23)
Mean age (SE)                         50.3 (1.5)       55.3 (2.2)
Stage of disease:
  1 or 2                              52%              74%
  3 or 4                              48%              26%
Mean interval in months between       33.1 (6.2)       60.4 (13.1)
BC diag. and effusion diag. (SE)
Postmenopausal                        43%              74%
69. Minimization Factors
Age (years)                            <=50     or   >50
Stage of disease                       1 or 2   or   3 or 4
Time between diagnosis of cancer       <=30     or   >30
and diagnosis of effusions (months)
Menopausal status                      Pre      or   Post
70. Table 2 Characteristics of the first 29 patients in a clinical
trial using minimization to allocate treatment
                           Mustine   Talc
Age             <=50       7         6
                >50        8         8
Stage           1 or 2     11        11
                3 or 4     4         3
Time interval   <=30m      6         4
                >30m       9         10
Menopausal      Pre        7         5
                Post       8         9
71. Table 3 Calculation of imbalance in patient characteristics
for allocating treatment to the thirtieth patient
                           Mustine   Talc
                           (n=15)    (n=14)
Age >50                    8         8
Stage 3 or 4               4         3
Time interval <=30m        6         4
Postmenopausal             8         9
Total                      26        24
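The totals in this table come from summing, for each treatment, the counts of earlier patients who share each characteristic of the new patient. The sketch below uses the counts from Table 2 and assumes the simplest rule, allocating to the arm with the smaller total; variable and function names are illustrative, and implementations often add a random element or a tie-break rule.

```python
# Counts for the first 29 patients, copied from Table 2
TABLE_2 = {
    "Mustine": {
        "age": {"<=50": 7, ">50": 8},
        "stage": {"1 or 2": 11, "3 or 4": 4},
        "interval": {"<=30m": 6, ">30m": 9},
        "menopausal": {"pre": 7, "post": 8},
    },
    "Talc": {
        "age": {"<=50": 6, ">50": 8},
        "stage": {"1 or 2": 11, "3 or 4": 3},
        "interval": {"<=30m": 4, ">30m": 10},
        "menopausal": {"pre": 5, "post": 9},
    },
}

# Characteristics of the thirtieth patient, as in Table 3
patient_30 = {"age": ">50", "stage": "3 or 4", "interval": "<=30m", "menopausal": "post"}

def imbalance_totals(table, patient):
    """Sum, per arm, the counts of prior patients matching the new patient's levels."""
    return {
        arm: sum(levels[factor][level] for factor, level in patient.items())
        for arm, levels in table.items()
    }

totals = imbalance_totals(TABLE_2, patient_30)
preferred = min(totals, key=totals.get)  # arm with the smaller total
print(totals, preferred)
```

With totals of 26 for mustine and 24 for talc, assigning the thirtieth patient to talc reduces the imbalance.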
72. Advantages
It can reduce imbalance to a minimum,
especially in small trials
A computer program is available (called Mini), and
it is also not difficult to perform ‘by hand’
Minimization and stratification on the same
prognostic factors produce similar levels of
power, but minimization may add slightly more
power if stratification does not include all of the
covariates
74. Practical Considerations
Study type                            Randomization
Large studies                         Blocked
Large, multicentre studies            Stratified by centre
Small studies                         Blocked and stratified by centre
Large number of prognostic factors    Minimization
Large studies                         Stratified analysis without stratified
                                      randomization