31st july talk (20021)

Clinical Trial Writing II
Sample Size Calculation and
Randomization

Liying XU (Tel: 22528716)
CCTER
CUHK
31st July 2002

1.1 Introduction
 Fundamental Points
  Clinical trials should have sufficient
statistical power to detect differences
between groups considered to be of
clinical interest. Therefore, calculation
of sample size with provision for
adequate levels of significance and power
is an essential part of planning.

Five Key Questions Regarding the Sample
Size
  What is the main purpose of the trial?
  What is the principal measure of patient
outcome?
  How will the data be analyzed to detect a
treatment difference? (The test statistic: t-test,
χ² or CI.)
  What type of results does one anticipate with
standard treatment?
  H0 and HA: how small a treatment difference is
it important to detect, and with what degree of
certainty? (δ, α and β.)
  How will treatment withdrawals and
protocol violations be dealt with? (Data set used.)

SSC: Only an Estimate
  Parameters used in the calculation are
estimates with uncertainty and are often
based on very small prior studies
 Population may be different
 Publication bias--overly optimistic
 Different inclusion and exclusion
criteria
  Mathematical models are approximations

What should be in the protocol?
 Sample size justification
 Methods of calculation
 Quantities used in calculation:

• Variances
• mean values
• response rates
• difference to be detected

Realistic and Conservative
 Overestimated size:
 unfeasible
 early termination

 Underestimated size
 justify an increase
 extension in follow-up
 incorrect conclusion (WORSE)

What is α (Type I error)?
 The probability of erroneously
rejecting the null hypothesis
  (Put a useless medicine on the
market!)

What is β (Type II error)?
 The probability of erroneously failing
to reject the null hypothesis.
  (Keep a good medicine away from
patients!)

What is Power ?
 Power quantifies the ability of the
study to find true differences of
various values of δ.

  Power = 1 − β = P(accept H1 | H1 is true)
  The chance of correctly identifying H1
(correctly identifying a better medicine)

What is δ?
 δ is the minimum difference between groups
that is judged to be clinically important
 Minimal effect which has clinical relevance in
the management of patients
 The anticipated effect of the new treatment
(larger)

The Choice of α and β depends on:

 the medical and practical consequences of the two
kinds of errors
 prior plausibility of the hypothesis
 the desired impact of the results

The Choice of α and β
 α=0.10 and β=0.2 for preliminary trials
that are likely to be replicated.
  α=0.01 and β=0.05 for trials that are
unlikely to be replicated.
 α=β if both test and control treatments are
new, about equal in cost, and there are
good reasons to consider them both
relatively safe.

The Choice of α and β
 α>β if there is no established control treatment
and test treatment is relatively inexpensive, easy
to apply and is not known to have any serious side
effects.
  α<β (the most common approach: 0.05 and
0.2) if the control treatment is already widely used
and is known to be reasonably safe and effective,
whereas the test treatment is new, costly, and
produces serious side effects.

1.2 SSC for Continuous Outcome
Variables
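  H0: δ = µC − µI = 0
  HA: δ = µC − µI ≠ 0
  If the variance σ² is known, the test statistic is

$$ Z = \frac{\bar{x}_C - \bar{x}_I}{\sigma \sqrt{\dfrac{1}{N_C} + \dfrac{1}{N_I}}} $$

  If $|Z| > Z_\alpha$ (with $Z_\alpha$ the two-sided critical value),
H0 will be rejected at the α level of significance.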

  A total sample of 2N would be needed to
detect a true difference δ between µI and µC
with power (1−β) and significance level α, by the
formula:

$$ 2N = \frac{4 (Z_\alpha + Z_\beta)^2 \sigma^2}{\delta^2} $$

Example 1
  An investigator wishes to estimate the sample size
necessary to detect a 10 mg/dl difference in
cholesterol level in a diet intervention group
compared to the control group. The variance from
other data is estimated to be (50 mg/dl)². For a
two-sided 5% significance level, Zα=1.96, and for 90%
power, Zβ=1.282.

$$ 2N = \frac{4 (1.96 + 1.282)^2 (50)^2}{10^2} \approx 1050 $$

Example1a
Baseline Adjustment
  An investigator interested in the mean levels of
change might want to test whether diet
intervention lowers serum cholesterol from
baseline levels when compared with a control.
  H0: ∆C − ∆I = 0
  HA: ∆C − ∆I ≠ 0
  σ = 20 mg/dl, δ = 10 mg/dl

$$ 2N = \frac{4 (1.96 + 1.282)^2 (20)^2}{10^2} \approx 170 $$
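The arithmetic in Examples 1 and 1a can be checked with a few lines of Python. This is a minimal sketch, assuming a two-sided α and a known common standard deviation; the function name `total_sample_size_means` is illustrative and does not come from the talk. Because it uses exact normal quantiles rather than the rounded 1.96 and 1.282, the totals come out near 1051 and 168, which the slides round to 1050 and 170.

```python
import math
from statistics import NormalDist


def total_sample_size_means(delta, sigma, alpha=0.05, power=0.90):
    """Total sample size 2N = 4 (Z_alpha + Z_beta)^2 sigma^2 / delta^2.

    Assumes a two-sided significance level alpha and equal group sizes.
    """
    z = NormalDist().inv_cdf
    z_alpha = z(1 - alpha / 2)  # two-sided critical value
    z_beta = z(power)           # power = 1 - beta
    return 4 * (z_alpha + z_beta) ** 2 * sigma ** 2 / delta ** 2


example_1 = total_sample_size_means(delta=10, sigma=50)   # Example 1
example_1a = total_sample_size_means(delta=10, sigma=20)  # Example 1a
per_group_1a = math.ceil(example_1a / 2)                  # round up per group

print(round(example_1), round(example_1a), per_group_1a)
```

Rounding the per-group size up, rather than rounding the total, is the conservative choice when the result is quoted in a protocol.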

A Professional Statement
  A sample size of 85 in each group will
have 90% power to detect a difference
in means of 10.0 assuming that the
common standard deviation is 20.0
using a two-group t-test with a 0.05
two-sided significance level.

Values of f(α,β) to be used in the formula
for sample size calculation, where $f(\alpha, \beta) = (Z_\alpha + Z_\beta)^2$

                      β (Type II error)
α (Type I error)    0.05     0.1     0.2     0.5
0.1                 10.8     8.6     6.2     2.7
0.05                13.0    10.5     7.9     3.8
0.02                15.8    13.0    10.0     5.4
0.01                17.8    14.9    11.7     6.6
1.3 SSC for a Binary Outcome
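  Two independent samples

$$ Z = \frac{p_C - p_I}{\sqrt{\bar{p}(1-\bar{p})\left(\dfrac{1}{N_C} + \dfrac{1}{N_I}\right)}} $$

with $\bar{p} = (r_I + r_C)/(N_I + N_C)$ in the test statistic and
$\bar{p} = (p_C + p_I)/2$ in the sample size formula:

$$ 2N = \frac{4 (Z_\alpha + Z_\beta)^2 \, \bar{p}(1-\bar{p})}{(p_C - p_I)^2} $$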

Example 2
  Suppose the annual event rate in the
control group is anticipated to be 20%. The
investigator hopes that the intervention
will reduce the annual rate to 15%. The
study is planned so that each participant
will be followed for 2 years. Therefore, if
the assumptions are accurate,
approximately 40% of the participants in
the control group and 30% of the
participants in the intervention group will
develop an event.

$$ 2N = \frac{4 (1.96 + 1.282)^2 (0.35)(0.65)}{(0.4 - 0.3)^2} = 956 \approx 960 $$
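Example 2 can be reproduced the same way. This is a sketch under the assumptions of the formula above (two-sided α, equal allocation, average proportion p̄); the function name `total_sample_size_proportions` is illustrative only.

```python
from statistics import NormalDist


def total_sample_size_proportions(p_c, p_i, alpha=0.05, power=0.90):
    """Total 2N = 4 (Z_alpha + Z_beta)^2 p_bar (1 - p_bar) / (p_c - p_i)^2."""
    z = NormalDist().inv_cdf
    p_bar = (p_c + p_i) / 2
    f_ab = (z(1 - alpha / 2) + z(power)) ** 2
    return 4 * f_ab * p_bar * (1 - p_bar) / (p_c - p_i) ** 2


# Example 2: two-year event rates of 40% (control) and 30% (intervention).
example_2 = total_sample_size_proportions(0.40, 0.30)
print(round(example_2))
```

Calling the function with `power=0.80` gives a rough check on the 80% power entries of a sample size table for the same pair of proportions.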

A Professional Statement
  A two-group χ² test with a 0.05 two-sided
significance level will have 90%
power to detect the difference between a
Group 1 proportion, P1, of 0.40 and a
Group 2 proportion, P2, of 0.30 (odds
ratio of 0.643) when the sample size in
each group is 480.

Table 1.3 Approximate total sample size for comparing
various proportions in two groups with significance level (α)
of 0.05 and power (1-β) of 0.8 and 0.9

True proportions              α=0.05 (one-sided)     α=0.05 (two-sided)
pC          pI                1-β=0.90  1-β=0.80     1-β=0.90  1-β=0.80
(control)   (intervention)
0.60        0.50                 850       610         1040       780
            0.40                 210       160          260       200
            0.30                  90        70          120        90
            0.20                  50        40           60        50
0.50        0.40                 850       610         1040       780
            0.30                 210       150          250       190
            0.25                 130        90          160       120
            0.20                  90        60          110        80
0.40        0.30                 780       560          960       720
            0.25                 330       240          410       310
            0.20                 180       130          220       170
0.30        0.20                 640       470          790       590
            0.15                 270       190          330       250
            0.10                 140       100          170       130
0.20        0.15                1980      1430         2430      1810
            0.10                 440       320          540       400
            0.05                 170       120          200       150
0.10        0.05                 950       690         1170       870

From Table 1.3 you can see:
 δ↑→N↓
 The power 1- β↑→N ↑
 The α↓→N ↑

Paired Binary Outcome
  McNemar’s test

$$ N_p = \frac{(Z_\alpha + Z_\beta)^2 \, f}{d^2} $$

  d = difference in the proportion of successes
(d = pI − pC)
  f = the proportion of participants whose response is
discordant (the pair of outcomes is not the
same)

Example 3
 Consider an eye study where one eye
is treated for loss in visual acuity by a
new laser procedure and the other
eye is treated by standard therapy.
The failure rate on the control, pC, is
estimated to be 0.4, and the new
procedure is projected to reduce the
failure rate to 0.20. The discordant
rate f is assumed to be 0.50.

 α=0.05
 The power 1- β=0.90
 f=0.5

 PC=0.4 PI=0.2

$$ N_p = \frac{(1.96 + 1.282)^2 (0.5)}{(0.4 - 0.2)^2} \approx 263 \times 0.5 \approx 132 $$
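The paired calculation in Example 3 is short enough to verify directly. This is a minimal sketch, assuming a two-sided α and the paired formula Np = (Zα + Zβ)² f / d²; the function name `paired_sample_size` is hypothetical.

```python
import math
from statistics import NormalDist


def paired_sample_size(d, f, alpha=0.05, power=0.90):
    """Np = (Z_alpha + Z_beta)^2 * f / d^2 for a paired binary outcome."""
    z = NormalDist().inv_cdf
    f_ab = (z(1 - alpha / 2) + z(power)) ** 2
    return f_ab * f / d ** 2


# Example 3: failure rates 0.4 vs 0.2, discordant rate f = 0.5.
n_p = paired_sample_size(d=0.4 - 0.2, f=0.5)
print(math.ceil(n_p))  # round up to a whole number
```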

1.4 Adjusting for Non-adherence
  RO = drop-out rate
  RI = drop-in rate

$$ N^* = \frac{N}{(1 - R_O - R_I)^2} $$

  If RO=0.20, RI=0.05
  N* = 1.78N
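The inflation factor is easy to tabulate for other non-adherence rates. The sketch below applies the formula N* = N / (1 − RO − RI)² as given; the function name `adjust_for_nonadherence` is illustrative, and the guard against rates summing to 1 or more is an added assumption.

```python
def adjust_for_nonadherence(n, drop_out, drop_in):
    """Return N* = N / (1 - R_O - R_I)^2."""
    retained = 1 - drop_out - drop_in
    if retained <= 0:
        raise ValueError("drop-out and drop-in rates must sum to less than 1")
    return n / retained ** 2


factor = adjust_for_nonadherence(1, drop_out=0.20, drop_in=0.05)
print(round(factor, 2))
```

Because the factor depends on the square of the retained fraction, modest non-adherence inflates the sample size quickly.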

1.5 Adjusting for Multiple Comparisons
  α’ = α/k

  k = the number of multiple-comparison
variables

Table 1.4 Adjusting for Randomization Ratio

Randomization Ratio Increase in total N
1:1 0
1:2 +12.5%
1:3 +33%
1:4 +56%
1:5 +80%
1:6 +100%
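The increases in Table 1.4 are consistent with a standard result for unequal allocation: for a 1:k ratio, the total sample size needed for the same power is (1 + k)² / (4k) times the total for 1:1. That formula is not stated in the talk, so treat the sketch below as an assumption that happens to match the table (the 1:6 row computes to about +104%, which the table rounds to +100%).

```python
def allocation_inflation(k):
    """Relative total sample size for a 1:k ratio versus 1:1 allocation."""
    return (1 + k) ** 2 / (4 * k)


for k in range(1, 7):
    increase = (allocation_inflation(k) - 1) * 100
    print(f"1:{k}  +{increase:.1f}%")
```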

1.6 Adjusting for Loss to Follow-up
 If p is the proportion of subjects lost to
follow-up, the number of subjects must be
increased by a factor of 1/(1-p).

1.7 Other Factors:
 the rate of attrition of subjects during
a trial
 intermediate analyses

Sample size re-estimation
  Event rates are lower than
anticipated
  Variability is larger than expected

  Done without unblinding the data or
making treatment comparisons

1.8 Power Calculation
(assuming we compare two medicines)
 Power Depends on 4 Elements:
 The real difference between the two
medicines, δ
• Big δ⇒big power
 The variation among individuals,σ

• Small σ⇒big power
 The sample size, n

• Large n⇒big power
 Type I error,α

• Large α ⇒big power
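The four dependencies listed above can be seen numerically with the normal approximation for comparing two means. This is a sketch under stated assumptions: known common σ, equal group sizes, a two-sided α, and the small probability of rejecting in the wrong direction ignored. The function name `power_two_means` is illustrative.

```python
import math
from statistics import NormalDist


def power_two_means(delta, sigma, n_per_group, alpha=0.05):
    """Approximate power: Phi(|delta| / SE - Z_alpha), SE = sigma * sqrt(2/n)."""
    nd = NormalDist()
    z_alpha = nd.inv_cdf(1 - alpha / 2)
    se = sigma * math.sqrt(2 / n_per_group)
    return nd.cdf(abs(delta) / se - z_alpha)


base = power_two_means(delta=10, sigma=20, n_per_group=85)
print(round(base, 2))

# Each change below moves power in the direction stated on the slide.
print(power_two_means(12, 20, 85) > base)              # bigger delta
print(power_two_means(10, 15, 85) > base)              # smaller sigma
print(power_two_means(10, 20, 120) > base)             # larger n
print(power_two_means(10, 20, 85, alpha=0.10) > base)  # larger alpha
```

The base case uses the design of Example 1a (85 per group, σ = 20, δ = 10) and returns a power just above 90%.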

Sensitivity of the sample size
estimate
 to a variety of deviations from these
assumptions

 a power table

Table 1 Statistical Power of the Tanzania
Vitamin and HIV Infection Trial (N=960)

               Effect of B: 0%      Effect of B: 15%     Effect of B: 30%
               Loss to follow-up    Loss to follow-up    Loss to follow-up
Effect of A    0%    20%   33%      0%    20%   33%      0%    20%   33%
30%            89%   82%   74%      85%   76%   68%      79%   69%   61%
25%            75%   65%   58%      69%   59%   52%      62%   52%   45%

Example 4
Regret for Low Power Due to Small
Sample?
  I have a set of data in which the mean change
differs significantly between the 2 groups
(p<0.05). But when I
calculate the power it gives only 50%.
How should I interpret this? Also, can
someone kindly advise whether it is
meaningful (or pointless) to calculate the
power when the result is statistically
significant?
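One way to look at this question, offered as a sketch rather than as the speaker's answer: if the power is computed after the fact by plugging the observed difference in as the true δ, it becomes a direct function of the p-value. Under the normal approximation (two-sided test, opposite tail ignored), a result sitting exactly at p = 0.05 always gives an "observed power" of 50%. The function name `observed_power` and the assumption that the questioner computed power this way are both hypothetical.

```python
from statistics import NormalDist


def observed_power(p_value, alpha=0.05):
    """Power obtained when the observed effect is treated as the true effect."""
    nd = NormalDist()
    z_obs = nd.inv_cdf(1 - p_value / 2)  # |z| implied by the two-sided p-value
    z_alpha = nd.inv_cdf(1 - alpha / 2)
    return nd.cdf(z_obs - z_alpha)


for p in (0.05, 0.01, 0.001):
    print(p, round(observed_power(p), 2))
```

On this reading, a post-hoc power of about 50% alongside p just under 0.05 carries no information beyond the p-value itself.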

Books and Software
 Sample size tables for clinical
studies (second edition)
  By David Machin, Michael Campbell, Peter Fayers
and Alain Pinol
 Blackwell Science 1997
 PASS 2000 available in CCTER
 nQuery 4.0 available in CCTER

Randomization
 Definition:
 randomization is a process by which each
participant has the same chance of being
assigned to either intervention or control.

Fundamental Point
  Randomization tends to produce study
groups comparable with respect to known
and unknown risk factors, removes
investigator bias in the allocation of
participants, and guarantees that statistical
tests will have valid significance levels.

Two Types of Bias in Randomization
 Selection bias
 occurs if the allocation process is predictable. If any
bias exists as to what treatment particular types of
participants should receive, then a selection bias
might occur.
 Accidental bias
 can arise if the randomization procedure does not
achieve balance on risk factors or prognostic
covariates especially in small studies.

Fixed Allocation Randomization
  Fixed allocation randomization procedures
assign the intervention to participants with
a pre-specified probability, usually equal,
and that allocation probability is not altered
as the study progresses
• Simple randomization
• Blocked randomization
• Stratified randomization

Randomization Types
 Simple randomization

Simple Randomization
  Option 1: toss an unbiased coin for a randomized
trial with two treatments (call them A and B)
  Option 2: use a random digit table. A randomization
list may be generated by using the digits, one per
treatment assignment, starting with the top row and
working downwards.
  Option 3: use a random number-producing
algorithm, available on most digital computer systems.
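Option 3 might look like the following in Python. This is a minimal sketch using the standard library's pseudo-random generator; the function name `simple_randomization` and the fixed seed are illustrative choices. A fixed seed makes the list reproducible for audit, but the list itself still has to be concealed from investigators.

```python
import random


def simple_randomization(n_patients, arms=("A", "B"), seed=None):
    """Assign each patient independently, with equal probability per arm."""
    rng = random.Random(seed)
    return [rng.choice(arms) for _ in range(n_patients)]


allocation = simple_randomization(20, seed=2002)
print("".join(allocation))
print({arm: allocation.count(arm) for arm in ("A", "B")})
```

Running this for small n shows why group sizes can end up unequal: nothing in the procedure forces the two counts to match.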

Advantages
  Each treatment assignment is completely
unpredictable, and probability theory
guarantees that in the long run the numbers
of patients on each treatment will not be
radically different
  Easy to implement

Disadvantages
 Unequal groups
 one treatment is assigned more often than
another
 Time imbalance or chronological bias
 One treatment is given with greater frequency
at the beginning of a trial and another with
greater frequency at the end of the trial.
 Simple randomization is not often used, even for
large studies.

Randomization Types

 Blocked randomization

Blocked Randomization
(permuted block randomization)
  Blocked randomization ensures exactly
equal treatment numbers at certain equally
spaced points in the sequence of patient
assignments
 A table of random permutations is used
containing, in random order, all possible
combinations (permutations) of a small series of
figures.
 Block size: 6,8,10,16,20.

Advantages
 The balance between the number of
participants in each group is guaranteed
during the course of randomization. The
number in each group will never differ by
more than b/2 when b is the length of the
block.
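The b/2 bound can be checked with a small permuted-block generator. This is a sketch for two arms with a fixed block size; the function names `blocked_randomization` and `max_imbalance` are illustrative, and shuffling each block stands in for looking up a table of random permutations.

```python
import random


def blocked_randomization(n_blocks, block_size=4, arms=("A", "B"), seed=None):
    """Concatenate randomly permuted blocks, each balanced across arms."""
    if block_size % len(arms) != 0:
        raise ValueError("block size must be a multiple of the number of arms")
    rng = random.Random(seed)
    sequence = []
    for _ in range(n_blocks):
        block = list(arms) * (block_size // len(arms))
        rng.shuffle(block)  # random permutation of one balanced block
        sequence.extend(block)
    return sequence


def max_imbalance(sequence):
    """Largest |#A - #B| seen at any point while enrolling in order."""
    diff, worst = 0, 0
    for arm in sequence:
        diff += 1 if arm == "A" else -1
        worst = max(worst, abs(diff))
    return worst


seq = blocked_randomization(n_blocks=10, block_size=6, seed=1)
print("".join(seq))
print(max_imbalance(seq))
```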

Disadvantages
  Analysis may be more complicated (in
theory)
  Correct analysis could have bigger power

  Varying the block size keeps the
randomization from becoming predictable
  Mid-block inequality might occur if an interim
analysis is planned.

Randomization Types
 Stratified randomization

[Figure: stratification tree. Geographic location (U.S., Europe), then previous exposure (Yes, No), then site (lymph, skin, breast), giving 2×2×3 = 12 strata.]

Stratified Randomization
  The stratified randomization process involves
measuring the level of the selected factors for
participants, determining to which stratum each
belongs, and performing the randomization within
the stratum. Within each stratum, the
randomization process itself could be simple
randomization, but in practice most clinical trials
use some blocked randomization strategy.

Table 3. Stratification Factors and Levels
(3×2×3=18 Strata)

Age            Sex          Smoking history
1. 40-49 yr    1. Male      1. Current smoker
2. 50-59 yr    2. Female    2. Ex-smoker
3. 60-69 yr                 3. Never smoked

Table 4 Stratified Randomization with Block Size of Four

Strata   Age     Sex   Smoking   Group assignment
1        40-49   M     Current   ABBA BABA ..
2        40-49   M     Ex        BABA BBAA ..
3        40-49   M     Never     Etc.
4        40-49   F     Current
5        40-49   F     Ex
6        40-49   F     Never
7        50-59   M     Current
8        50-59   M     Ex
9        50-59   M     Never
10       50-59   F     Current
11       50-59   F     Ex
12       50-59   F     Never
etc.
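A scheme like Table 4 can be sketched by keeping one independent stream of permuted blocks per stratum. The code below is an illustration under assumptions: two arms labelled A and B, a fixed block size of four, and hypothetical names (`block_stream`, `streams`, `assign`). The stratum labels follow the factor levels in Table 3.

```python
import itertools
import random

AGES = ("40-49", "50-59", "60-69")
SEXES = ("M", "F")
SMOKING = ("Current", "Ex", "Never")


def block_stream(rng, block_size=4, arms=("A", "B")):
    """Yield assignments forever, one permuted balanced block at a time."""
    while True:
        block = list(arms) * (block_size // len(arms))
        rng.shuffle(block)
        yield from block


rng = random.Random(2002)
# One stream per stratum: 3 x 2 x 3 = 18 strata.
streams = {
    stratum: block_stream(rng)
    for stratum in itertools.product(AGES, SEXES, SMOKING)
}


def assign(age, sex, smoking):
    """Return the next assignment within the patient's stratum."""
    return next(streams[(age, sex, smoking)])


print(len(streams))
print([assign("40-49", "M", "Current") for _ in range(8)])
```

Each stratum is balanced after every fourth patient in that stratum, regardless of how enrolment is spread across the other strata.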

Advantages
  To make the two study groups appear
comparable with regard to specified factors.
The power of the study can be increased by
taking the stratification into account in the
analysis.

Disadvantages
  The prognostic factors used in stratified
randomization may be unimportant, and other
factors identified later may be of more
importance

Mechanism
Trial type                       Mechanism
No central registration office   Randomization list, sealed envelopes
Double-blind drug trial          Pharmacist will be involved
Multi-centre trial               Central registration office
Single-centre trial              Independent person responsible for patient
                                 registration and randomization

An Example of Stratified Randomization
 Patients will be stratified according to the following
criteria:
 1) Treatment center (Hospital A vs Hospital B vs
Hospital C)
 2) N-stage(N2 vs N3)
 3) T-stage (T1-2 vs T3-4)

What should be in the protocol?
 A dynamic allocation scheme will be used to
randomize patients in equal proportions within
each of 12 strata. The scheme first creates time-
ordered blocks of size divisible by three and then
uses simple randomization to divide the patients
in each block into three treatment arms, in equal
proportion. The block sizes will be chosen
randomly so that each block contains either 6 or
9 patients.

Cont…
 This procedure helps to ensure both
randomness and investigator blinding (the block
sizes are known only to the statistician), as
recommended by Freedman et al.
Randomization will be generated by the
consulting statistician in sealed envelopes,
labeled by stratum, which will be unsealed after
patient registration.
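The allocation list for one stratum of such a scheme could be generated along the following lines. This is a sketch, not the protocol's actual program: the arm labels A, B and C and the function name `variable_block_sequence` are hypothetical, and shuffling each block is used as one way to divide its patients equally among the three arms. A separate list would be produced for each of the 12 strata.

```python
import random


def variable_block_sequence(n_patients, arms=("A", "B", "C"),
                            block_sizes=(6, 9), seed=None):
    """Allocation list for one stratum using randomly chosen block sizes."""
    rng = random.Random(seed)
    sequence = []
    while len(sequence) < n_patients:
        size = rng.choice(block_sizes)          # block of 6 or 9 patients
        block = list(arms) * (size // len(arms))
        rng.shuffle(block)                      # equal proportions per block
        sequence.extend(block)
    return sequence[:n_patients]


stratum_list = variable_block_sequence(40, seed=7)
print("".join(stratum_list))
print({arm: stratum_list.count(arm) for arm in ("A", "B", "C")})
```

Because the block boundaries are not known to investigators, the last assignments in a block cannot be deduced from earlier ones.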

Adaptive Randomization
 Number adaptive
 Biased coin method

 Baseline adaptive (MINIMIZATION)
 Outcome adaptive

Biased Coin Method
 Advantages
  Investigators cannot determine the next
assignment by discovering the blocking
factor.
 Disadvantages
 Complexity in use
 Statistical analysis cumbersome
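The slide does not spell out the allocation rule, so the sketch below assumes the classic two-arm version of a biased coin: when the arms are balanced, toss a fair coin; otherwise assign the arm with fewer patients with probability p (2/3 is a commonly quoted choice). The function name `biased_coin` and the default p are assumptions for illustration.

```python
import random


def biased_coin(n_patients, p=2 / 3, seed=None):
    """Two-arm biased coin: favour the under-represented arm with probability p."""
    rng = random.Random(seed)
    counts = {"A": 0, "B": 0}
    sequence = []
    for _ in range(n_patients):
        if counts["A"] == counts["B"]:
            prob_a = 0.5        # balanced: fair coin
        elif counts["A"] < counts["B"]:
            prob_a = p          # A is behind: favour A
        else:
            prob_a = 1 - p      # A is ahead: favour B
        arm = "A" if rng.random() < prob_a else "B"
        counts[arm] += 1
        sequence.append(arm)
    return sequence


seq = biased_coin(50, seed=11)
print("".join(seq))
print(seq.count("A"), seq.count("B"))
```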

Minimization
  Minimization is a well-accepted statistical
method to limit imbalance in relatively small
randomized clinical trials in conditions with
known important prognostic baseline
characteristics.
  It is called minimization because imbalance in
the distribution of prognostic factors is
minimized

Table 1 Some baseline characteristics of patients in a controlled trial
of mustine versus talc in the control of pleural effusions
in patients with breast cancer (Fentiman et al, 1983)

                                    Treatment
                                    Mustine (n=23)   Talc (n=23)
Mean age (SE)                       50.3 (1.5)       55.3 (2.2)
Stage of disease:
  1 or 2                            52%              74%
  3 or 4                            48%              26%
Mean interval in months between     33.1 (6.2)       60.4 (13.1)
BC diag. and effusion diag. (SE)
Postmenopausal                      43%              74%

Minimization Factors
Age (years)                              <=50     or   >50
Stage of disease                         1 or 2   or   3 or 4
Time between diagnosis of cancer and     <=30     or   >30
diagnosis of effusions (months)
Menopausal status                        Pre      or   Post

Table 2 Characteristics of the first 29 patients in a clinical
trial using minimization to allocate treatment

                           Mustine   Talc
Age             <=50          7        6
                >50           8        8
Stage           1 or 2       11       11
                3 or 4        4        3
Time interval   <=30m         6        4
                >30m          9       10
Menopausal      Pre           7        5
                Post          8        9

Table 3 Calculation of imbalance in patient characteristics
for allocating treatment to the thirtieth patient

                        Mustine   Talc
                        (n=15)    (n=14)
Age >50                    8        8
Stage 3 or 4               4        3
Time interval <=30m        6        4
Postmenopausal             8        9
Total                     26       24
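The totals in Table 3 come from summing, for each arm, the current counts at the thirtieth patient's own factor levels; the patient then goes to the arm with the smaller total. The sketch below reproduces that calculation. It is a simplified deterministic version with ties broken at random; the names `COUNTS`, `imbalance_totals` and `minimization_assignment` are hypothetical, and practical schemes often assign the favoured arm with high probability rather than with certainty.

```python
import random

# Current counts (first 29 patients) at the levels the 30th patient has.
COUNTS = {
    "Mustine": {"age >50": 8, "stage 3 or 4": 4,
                "interval <=30m": 6, "postmenopausal": 8},
    "Talc": {"age >50": 8, "stage 3 or 4": 3,
             "interval <=30m": 4, "postmenopausal": 9},
}


def imbalance_totals(counts, patient_levels):
    """Sum each arm's counts over the new patient's factor levels."""
    return {arm: sum(levels[name] for name in patient_levels)
            for arm, levels in counts.items()}


def minimization_assignment(counts, patient_levels, rng=None):
    """Assign the arm with the smallest total; break ties at random."""
    rng = rng or random.Random()
    totals = imbalance_totals(counts, patient_levels)
    lowest = min(totals.values())
    candidates = [arm for arm, total in totals.items() if total == lowest]
    return rng.choice(candidates)


patient_30 = ["age >50", "stage 3 or 4", "interval <=30m", "postmenopausal"]
totals_30 = imbalance_totals(COUNTS, patient_30)
print(totals_30)
print(minimization_assignment(COUNTS, patient_30))
```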

Advantages
  It can reduce imbalance to the minimum
level, especially in small trials
  A computer program is available (called Mini), and it is
also not difficult to perform ‘by hand’
  Minimization and stratification on the same
prognostic factors produce similar levels of
power, but minimization may add slightly more
power if stratification does not include all of the
covariates

Disadvantages
  It is a somewhat complicated process compared to
simple randomization

Practical Considerations
Study type                            Randomization
Large studies                         Blocked
Large, multicentre studies            Stratified by centre
Small studies                         Blocked and stratified by centre
Large number of prognostic factors    Minimization
Large studies                         Stratified analysis without
                                      stratified randomization
