Study design2 6_07
1. 1
The Road Map to a
Successful Study Design
Lisa Kaltenbach, MS
Biostatistician II
February 6, 2007
Lisa.kaltenbach@vanderbilt.edu
Office: D2220 MCN
3. 3
Can’t go back in time in research!
“To call in the statistician after the experiment is
done may be no more than asking him to
perform a postmortem examination: he may
be able to say what the experiment died of.”
-R.A. Fisher, Indian Statistical Congress,
Sankhya, ca 1938
4. 4
Outline: The Road to Success
• How to begin clinical research
• Important considerations when designing a
study
• Types of study designs
• Examples
5. 5
Components of a Study Protocol:
• The structure of a research project is set out in its protocol.
• Protocols are well known as devices for seeking grant funds, but they also
help the investigator organize his/her research in a logical, focused, and
efficient way.

Element                        Purpose
Research questions             What questions will the study address?
Significance (background)      Why are these questions important?
Design                         How is the study structured?
  Time frame
  Epidemiologic approach
Subjects                       Who are the subjects and how will they be selected?
  Selection criteria
  Sampling design
Variables                      What measurements will be made/recorded?
  Predictor variables
  Confounding variables
  Outcome variables
Statistical issues             How large is the study and how will it be analyzed?
  Hypotheses
  Sample size
  Analytic approach
6. 6
What do you wish to learn?
• If only one question could be answered by the project, what would that question
be?
• Often people would like to describe how likely a theory or hypothesis is in light
of a particular set of data. This is not possible in the commonly used
classical/frequentist approach to statistics. Instead, statistics talks about the
probability of observing particular sets of data, assuming a theory holds. We are
not allowed to say, "Because I've seen these data, there is only a small probability
that this theory is true." Instead, we say, "The probability of seeing data like these
is very small if the theory is true."
• In order to show an effect exists,
– statistics begins by assuming there is no effect.
– Prior to collecting data, rules are chosen to decide whether the data are consistent with
the assumption of no effect.
– If the data are found to be inconsistent with the assumption, the assumption must be
false and there is, in fact, an effect.
• Classical statistics works by comparing study data to what is expected when there
is nothing.
• If the data are not typical of what is seen when there is nothing, there must be
something. Usually "not typical" means that some summary of the data is so
extreme that it is seen less than 5% of the time when there is nothing.
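The "compare the data to what is expected when there is nothing" logic above can be sketched with a small simulation. This is only an illustration under assumed numbers: a hypothetical study of 100 subjects in which "no effect" means each subject responds with probability 0.5. The function name, the observed counts, and the seed are all made up for the example.

```python
import random

def simulated_p_value(observed, n, p_null, n_sims=5000, seed=0):
    """Fraction of simulated no-effect studies at least as extreme as the observed one."""
    rng = random.Random(seed)
    expected = n * p_null
    observed_distance = abs(observed - expected)
    extreme = 0
    for _ in range(n_sims):
        # One simulated study in which the "no effect" assumption is true.
        count = sum(rng.random() < p_null for _ in range(n))
        if abs(count - expected) >= observed_distance:
            extreme += 1
    return extreme / n_sims

# Hypothetical observed counts out of 100 subjects.
p_typical = simulated_p_value(observed=52, n=100, p_null=0.5)
p_extreme = simulated_p_value(observed=65, n=100, p_null=0.5)
print(p_typical, p_extreme)
```

A result like 52 out of 100 is typical of what is seen when there is nothing, so its simulated fraction is large. A result like 65 out of 100 is seen far less than 5% of the time under the same assumption.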
7. 7
Judging a project's feasibility
• Can everything that needs to be measured be measured?
• If the study involves some condition, can we define and recognize it?
– What is an unhealthy eating behavior?
– What's the difference between a cold and the flu?
– What do we mean by family income or improved nutritional status?
• How accurate and consistent are the measurements? How accurate do
they need to be? What causes them to be inaccurate or inconsistent?
– Calcium intake is easy to measure because there are only a few major
sources of calcium. Salt intake is hard because salt is everywhere.
– Will respondents reveal their income?
– Can others get the same value (inter-laboratory, inter-technician variability)?
• How do we choose among different measurement techniques?
– Is a mechanical blood pressure cuff better than using a stethoscope?
– Is there a gold standard? Is it worth paying for?
• Sometimes merely measuring something changes it in unexpected ways.
– Does asking people to keep records of dietary intake cause them to change
their intake?
• Resources (time and money)
8. 8
Types of design
Analytic
• Experimental
– Randomized Clinical Trial
– Non-randomized Clinical Trial
• Non-experimental
– Cohort
– Cross-sectional
– Case-control
– Other
Descriptive
• Community Survey
9. 9
Considerations when choosing a Study Design
• No one approach is always better than the others.
• Each research question requires a judgment about
which design is the most efficient way to get a
satisfactory answer.
• A common sequence for studying a topic:
– Descriptive studies
• How common is estrogen treatment in women after menopause?
– Analytic studies to evaluate associations and discover
cause-and-effect relationships
• Is taking estrogen after menopause associated with lower risk of
CHD?
– Clinical trial to establish the effects of an intervention
• Does hormone treatment alter the incidence of CHD?
10. 10
Examples of common clinical research designs used
to study whether hormone therapy after menopause
prevents coronary heart disease
Experimental Design
• Randomized blinded trial
– Key feature: Two groups created by a random process, and a blinded intervention
– Example: The investigator randomly assigns women to receive hormone or identical
placebo, then follows both treatment groups for several years to observe the
incidence of heart attacks.
Observational Designs
• Cohort study
– Key feature: A group followed over time
– Example: The investigator examines a cohort of women yearly for several years,
observing the incidence of heart attacks in hormone users and non-users.
• Case-control study
– Key feature: Two groups, based on the outcome
– Example: The investigator examines a group of women with heart attacks (the
“cases”) and compares them with a group of healthy women (the “controls”),
asking about hormone use.
• Cross-sectional study
– Key feature: A group examined at one point in time
– Example: The investigator examines the group of women once, observing the
prevalence of a history of heart attacks in hormone users and non-users.
11. 11
Statistical Issues
3 Step Process:
1. Define specific aim
-Hypothesis: Women who receive estrogen treatment after
menopause will have fewer heart attacks than those who do not.
2. Calculate the sample size, the number of
subjects needed to observe the expected difference
in outcome between study groups with a reasonable
degree of probability, or power.
3. Select statistical methods needed to produce an
acceptable level of precision when confidence
intervals are calculated for the means, proportions,
or other descriptive statistics.
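Step 2 can be illustrated with a rough sample size calculation for comparing two proportions. This is a minimal sketch using the standard normal-approximation formula, not necessarily the method intended in the talk. The heart attack rates (10% without estrogen, 5% with estrogen), the two-sided alpha of 0.05, and the power of 0.80 are hypothetical values chosen only for the example.

```python
from math import ceil, sqrt
from statistics import NormalDist

def sample_size_two_proportions(p1, p2, alpha=0.05, power=0.80):
    """Approximate subjects needed per group to detect p1 versus p2 (two-sided test)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_beta = NormalDist().inv_cdf(power)           # value tied to the desired power
    p_bar = (p1 + p2) / 2
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return ceil(numerator / (p1 - p2) ** 2)

# Hypothetical expected outcome rates in the two study groups.
n_per_group = sample_size_two_proportions(p1=0.10, p2=0.05)
print(n_per_group)
```

Smaller expected differences or higher power push the required sample size up quickly, which is why this step belongs in the design stage and not after the data are collected.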
12. 12
Randomized Clinical Trials
• In simplest implementation:
• Subject enrolls in study
• Randomly assigned to one of ≥ 2 treatments
• Followed up until end of study or outcome measure is obtained
• Outcome comparisons are made among treatment groups
• Treatment groups should be comparable on measured and
unmeasured covariates due to randomization
• Strongest design to establish causal relationships
• May be beneficial to blind subjects/investigators to treatment
groups
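The random assignment step can be sketched in a few lines. The example below uses permuted-block randomization, one common way to keep arm sizes balanced. The slides do not say which scheme to use, so the block size, arm labels, and seed are assumptions made for illustration.

```python
import random

def block_randomize(n_subjects, arms=("hormone", "placebo"), block_size=4, seed=2007):
    """Assign subjects to arms in shuffled blocks so that arm sizes stay balanced."""
    if block_size % len(arms) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")
    rng = random.Random(seed)
    assignments = []
    while len(assignments) < n_subjects:
        # Each block holds every arm equally often, in random order.
        block = list(arms) * (block_size // len(arms))
        rng.shuffle(block)
        assignments.extend(block)
    return assignments[:n_subjects]

allocation = block_randomize(20)
print(allocation.count("hormone"), allocation.count("placebo"))
```

In a real trial the allocation list would be generated and held by someone other than the enrolling investigators, which also supports blinding.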
14. 14
Cohort Studies
• Exposure not randomly assigned, but assessed
– Sample selection and analysis can minimize confounding
• Need sufficient # of subjects/events
• Prospective Cohort
– Outcomes are future events
• Retrospective Cohort
– Outcomes have already occurred
• Can study multiple outcomes
• Limitations: no control over risk factors; numbers of subjects or events may be insufficient
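A cohort study compares the incidence of the outcome in exposed and unexposed subjects. The sketch below computes the cumulative incidence in each group and their ratio (the risk ratio, a common summary that the slides do not name explicitly). The counts are hypothetical.

```python
def cumulative_incidence(events, subjects):
    """Proportion of subjects who develop the outcome during follow-up."""
    return events / subjects

def risk_ratio(events_exposed, n_exposed, events_unexposed, n_unexposed):
    """Incidence in the exposed group divided by incidence in the unexposed group."""
    return (
        cumulative_incidence(events_exposed, n_exposed)
        / cumulative_incidence(events_unexposed, n_unexposed)
    )

# Hypothetical cohort: heart attacks among 1000 hormone users and 1000 non-users.
rr = risk_ratio(events_exposed=30, n_exposed=1000,
                events_unexposed=50, n_unexposed=1000)
print(round(rr, 2))
```

Because exposure was not randomly assigned, a ratio below 1 here could still reflect confounding instead of a protective effect.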
16. 16
Cross-sectional Studies
• All variables are measured at same time
• Valuable for providing descriptive information
about prevalence
• But weaker evidence for causality as predictor
is not shown to precede outcome
18. 18
Case-Control
1. Subjects are identified as cases based on
outcome status
2. Identify comparable controls (challenging)
3. Retrospectively determine prior exposure
*Big challenge to account for all differences
between cases & controls that could explain
relationship between exposure & case
status
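Because subjects in a case-control study are selected on outcome status, incidence cannot be computed directly. The usual summary is the odds ratio, which compares the odds of prior exposure in cases and controls. A minimal sketch with hypothetical counts:

```python
def odds_ratio(cases_exposed, cases_unexposed, controls_exposed, controls_unexposed):
    """Odds of exposure among cases divided by odds of exposure among controls."""
    return (cases_exposed * controls_unexposed) / (cases_unexposed * controls_exposed)

# Hypothetical study: 100 women with heart attacks (cases), 100 healthy controls,
# each asked about prior hormone use.
or_estimate = odds_ratio(cases_exposed=40, cases_unexposed=60,
                         controls_exposed=50, controls_unexposed=50)
print(round(or_estimate, 2))
```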
20. 20
Summary of how research works
[Figure: RESEARCH QUESTION → (design) → STUDY PLAN → (implement) → ACTUAL STUDY.
Inference runs back from FINDINGS IN THE STUDY to TRUTH IN THE STUDY to TRUTH IN
THE UNIVERSE, with errors possible at each step.]
• Target population (all women aged 50-69) → Intended sample (women aged 50-69
seen in UCSF primary care clinic in one year) → Actual subjects
• Phenomena of interest (the proportion who take estrogen) → Intended variables
(self-reported estrogen treatment) → Actual measurements
21. 21
Sampling Errors: Threaten inferences from study subjects to
population of interest
• Random error is a wrong result due to chance – unknown sources of variation
that are equally likely to distort the sample in either direction.
– If the true prevalence of estrogen treatment in 50-to-69-year-old women is 20%, a
well-designed sample of 100 patients from that population might contain exactly 20
patients taking estrogen. More likely, however, the sample would contain a nearby number
such as 18, 19, 21, or 22. Occasionally, chance would produce a substantially different
number, such as 12 or 28.
– Reduce the influence of random error by increasing the sample size. The use of a
larger sample diminishes the likelihood of a wrong result by increasing the precision of
the estimate - the degree to which the observed prevalence approximates 20% each
time a sample is drawn.
• Systematic error is a wrong result due to bias (sources of variation that distort
the study findings in one direction).
– Example: recruiting patients who come to the primary care clinic, who might be more
likely than average to adopt medical treatments. Increasing the sample size has no effect
on systematic error. The only way to improve the accuracy of the estimate is to design the
study in a way that either reduces the size of the various biases or gives some
information about them. For example, draw a second sample of women from a setting that
may be less likely to bias the proportion of women treated with estrogen (e.g., employees
in a corporation), and compare the observed prevalence in the two samples.
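The difference between the two kinds of error can be seen in a short simulation. It repeatedly draws samples from a population with a true prevalence of 20%, as in the example above, and from a hypothetical clinic population in which 30% take estrogen. The 30% figure, the number of repeated samples, and the seed are assumptions for illustration only.

```python
import random
from statistics import mean, pstdev

def sample_prevalences(true_prevalence, sample_size, n_samples=1000, seed=1):
    """Observed prevalence in each of n_samples random samples of the given size."""
    rng = random.Random(seed)
    return [
        sum(rng.random() < true_prevalence for _ in range(sample_size)) / sample_size
        for _ in range(n_samples)
    ]

small = sample_prevalences(0.20, sample_size=100)
large = sample_prevalences(0.20, sample_size=1000)
# Hypothetical biased source: clinic patients who adopt treatments more readily.
biased_large = sample_prevalences(0.30, sample_size=1000)

print("spread, n=100: ", round(pstdev(small), 3))
print("spread, n=1000:", round(pstdev(large), 3))
print("mean of biased samples:", round(mean(biased_large), 3))
```

The larger sample shrinks the spread around 20% (less random error), but the biased samples stay centered near 30% no matter how large they are (systematic error is untouched).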
22. 22
Summary
• Plan ahead!
• We all want to do research that produces
valid results, is worthy of publication, and
meets with the approval of our peers. This
begins with a carefully crafted research
question and an appropriate study design.
23. 23
References
• Dallal website
• Hulley, SB, et al. 2001, 2nd ed. Designing
Clinical Research, Lippincott Williams &
Wilkins; Philadelphia, PA.
• Wikipedia
Editor's Notes
The significance section of a protocol sets the proposed study in context and gives its rationale: What is known about the topic at hand? Why is the research question important? What kind of answers will the study provide? This section cites previous research that is relevant (including the investigator’s own work) and indicates the problems with that research and what questions remain. It makes clear how the findings of the proposed study will help resolve these uncertainties, leading to new scientific understanding and influencing clinical and public health policy.
Two major decisions must be made in choosing the study subjects (Chapter 3). The first is to specify the selection criteria that define the target population: the kinds of patients best suited to the research question. The second decision concerns how best to recruit enough women from an accessible subset of this population who will be the actual subjects of the study. For example, the study of hormones and CHD in women might select women aged 50 to 69 years attending the primary care clinic at the investigator’s hospital, and the investigator might decide to invite the next 1,000 such patients. These design choices represent trade-offs; studying a random sample of all U.S. women of that age would enhance generalizability but be formidably difficult and costly.
(Other variations are, "If the results were summarized at the top of the evening news in a phrase or two spoken in a few seconds, what would the reporter say?" or "If the results were written up in the local newspaper, what would the headline be?") Not only does this help a statistician better understand an investigator's goals
The chances of winning a lottery are small, yet there's always a winner.
Peds child abuse, proxy measures. Can’t always measure what you want
A fundamental issue is whether to observe the events taking place in the study subjects in an observational study or to apply an intervention and examine its effects on these events in a clinical trial. Among observational studies, two of the most common designs are cohort studies, in which a group of subjects is followed over time, and cross-sectional studies, in which the observations are made on a single occasion. A third common option is the case-control design, in which the investigator compares a group of subjects who have a disease or condition with another group of subjects who do not. Types of design: Some of the most popular designs are sorted below, with the ones at the top being the most powerful at reducing observer-expectancy effect but also most expensive, and in some cases introducing ethical concerns. The ones at the bottom are the most affordable and are frequently used earlier in the research cycle, to develop strong hypotheses worth testing with more expensive research approaches.
• Experimental
– Randomized clinical trial: double-blind, single-blind, non-blind
– Non-randomized clinical trial
• Non-experimental
– Cohort study: prospective cohort, retrospective cohort, nested cohort
– Case-cohort study
– Case-control study (case series)
– Nested case-control study
– Cross-sectional study
• Descriptive
– Community survey
Descriptive studies: used to study variation in frequency by demographic characteristics, place, and time.
Analytic studies: the researcher has a pre-specified hypothesis.
Experimental studies are conducted under controlled conditions where the researcher manipulates some measure.
Observational studies do not involve intervention. They observe the natural course of events, where changes in one characteristic are studied in association with changes in other characteristics. They are often necessary when it is unethical or infeasible to manipulate exposure.
More details about the types of study design, with examples, come later in the talk, after we discuss some issues to consider when selecting a study design.
When choosing a study design, many factors must be taken into account. Different types of studies are subject to different types of bias.
For example, describing distributions of disease and health-related characteristics in the population.
Experiments usually occur later in the sequence of research studies, and answer more narrowly focused questions that arise from the findings of observational studies.
No study is ever perfect.
3-step process: define specific aim; sample size/power; statistical methods. Clinical significance versus statistical significance. Clinical knowledge drives
Both random and systematic errors can also contribute to measurement error, threatening inferences from the study measurements to the phenomena of interest. An illustration of random measurement error is the variation in the response when a questionnaire is administered on several occasions.