Presented at the European Health Psychology Conference, July 13, 2013. This slideshow shows the folly of accepting positive findings from underpowered studies. Much of the "evidence" in health psychology comes from such unreliable studies.
The folly of believing positive findings from underpowered intervention studies
1. Too Good to Be True: Health
Psychology’s Dependence on
Underpowered Positive Studies
James C. Coyne, Ph.D.
University of Groningen,
University Medical Center
Groningen, The Netherlands
Twitter @CoyneoftheRealm
2. Long a pervasive problem…
Lack of sufficient resources to conduct well-designed, amply powered studies.
Confusion about pilot studies: cannot be the basis for evaluating efficacy or estimating effect sizes!
3. “We are grateful to the Society of Behavioral
Medicine (SBM) for selecting the authorship
group. This article is one of three metaanalyses that have been undertaken under the
aegis of the SBM Evidence-Based Behavioral
Medicine Committee; the other two metaanalyses examine the effects of psychosocial
interventions on depression and fatigue among
patients with cancer.”
4. SBM Initiative
Meta-analyses generated by professional
organizations should receive special
critical scrutiny because of the tendency to
gloss over the limits of a literature in order to
promote the services of their membership.
5. Small Studies
Suffer strong publication bias.
Negative findings go unpublished because the studies are too small.
Positive findings are celebrated because they were obtained despite the smallness.
6. Small Studies
Require a larger effect size to reach statistical significance.
Published results tend to be exaggerated and not to be replicated in larger, better-quality later studies.
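The arithmetic behind this point can be sketched in a few lines of Python (a normal-approximation illustration added here, not part of the original slides): the smallest standardized effect a two-arm trial can declare significant grows sharply as the arms shrink.

```python
import math

# Hypothetical helper: smallest Cohen's d that can reach two-sided
# p < .05 with n patients per arm, using the normal approximation
# (z = 1.96) rather than the exact t distribution.
def min_detectable_d(n_per_group, z_crit=1.96):
    return z_crit * math.sqrt(2 / n_per_group)

for n in (10, 20, 35, 100):
    print(n, round(min_detectable_d(n), 2))
# 10 patients per arm can only "detect" effects near d = 0.88,
# while 100 per arm can detect effects near d = 0.28.
```

So any significant result from a 10-per-arm trial is necessarily a very large effect, far larger than most behavioral interventions plausibly produce.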
7. Small Trials Likely to Have Outliers,
and With Publication Bias, Yield
Results That Won’t Replicate
Hospital A has 10 births per month on average.
Hospital B has 100 births per month on
average.
In January, one of the hospitals reported that 70%
of its births were girls. Is this more likely to be Hospital A,
Hospital B, or equally likely in either?
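The answer is Hospital A: small samples fluctuate far more. An exact binomial calculation (an illustration added here, not from the slides) makes the point:

```python
from math import comb

# Exact probability of at least k successes in n fair Bernoulli trials.
def p_at_least(n, k, p=0.5):
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

small = p_at_least(10, 7)    # >= 70% girls among 10 births: about 17%
large = p_at_least(100, 70)  # >= 70% girls among 100 births: vanishingly rare
print(small, large)
```

The small hospital hits the "extreme" month thousands of times more often, which is exactly why small trials, filtered through publication bias, produce striking results that will not replicate.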
8. Small Studies
Are particularly vulnerable to selective loss of
patients to follow-up and to investigators, outcome
raters knowing to which condition patients are
assigned.
Investigators can naïvely or deliberately monitor
incoming data and stop the trial when a positive
finding has been obtained, even when it is a chance
finding that would be undone with continued
accumulation of patients.
9. Sample Size
Sample size is the best proxy for other sources of
bias in trials.
Sample size negatively predicts overall effect size.
In the presence of small-study effects, restricting
analyses to large trials, or predicting the treatment
benefits observed in large trials, may provide more
valid estimates than overall analyses of trials
irrespective of sample size.
10. Gorin, et al. "Meta-analysis of psychosocial interventions to reduce pain in patients with cancer." Journal of Clinical Oncology 30:5 (2012): 539-547.
12. What the SBM Authors Claimed
about Psychosocial Interventions
for Cancer Pain
“Robust findings" of "substantial rigor" and “strong
evidence for psychosocial pain management
approaches."
Claimed findings supported the “systematic
implementation" of these techniques.
Estimated that it would take 812 unpublished studies
lurking in file drawers to change their assessment.
13. 19 of 38 studies had fewer than 35 patients in
the intervention or control group. Two of the
other largest trials should have been
excluded for other reasons.
Of 13 studies individually having significant
effects on pain severity, 8 would have been
excluded because they were too small, 1
because it should not have been included in
the first place.
14. Of the 4 studies with the largest effect sizes,
1 had only 20 patients receiving relaxation;
the next largest had 10 patients who were
hypnotized; the next compared 20 patients listening to
a relaxation tape with 20 patients
getting live instructions, but these numbers
were obtained by replacing patients who
dropped out.
The study with the fourth largest effect size had
15 patients receiving training in self-hypnosis.
15. Some of the studies were quite small:
7 patients receiving pain education
10 patients receiving hypnosis
16 patients getting pain education
16 patients getting self-hypnosis
8 patients getting relaxation plus 8 patients getting CBT plus relaxation
18. Hart, et al. "Meta-analysis of
efficacy of interventions for
elevated depressive
symptoms in adults
diagnosed with cancer."
Journal of the National
Cancer Institute 104:13
(2012): 990-1004.
20. 3 studies classified as “psychotherapeutic”
were complex collaborative care
interventions for depression emphasizing
medication management.
These studies provided the bulk [527] of the
patients in the authors' calculation of the
effect size for psychotherapeutic
intervention.
21. Of the 2 remaining studies, 1 randomly
assigned 45 patients to either problem-solving
or a waitlist control and retained only
37 patients for analyses.
The final study contributed 2 effect sizes based
on comparisons of 29 patients receiving
CBT and 23 receiving supportive therapy to
the same 26-patient no-treatment control
group, thus violating the assumption of
independence of effect sizes.
22. With Removal of Small and
Inappropriately Classified Studies
No Eligible Studies Were Left
23. Fail-safe N of 106 confirms the relative
stability of the observed effect size.
“Our findings advance this literature
by demonstrating that psychological
and pharmacologic approaches,
evaluated in RCTs, can be targeted
productively toward cancer patients in
need of intervention by virtue of
clinical depression or elevated
depressive symptoms.”
24. Fail Safe N is Pseudo-Precise
Nonsense
Don’t Be Intimidated by Exaggerated
Estimates of Number of Unpublished
Studies Needed to Unseat Conclusions
Based on Meta-Analysis of Underpowered
Studies.
25. Deficiencies of Failsafe N
Combining Z scores does not directly account for the sample sizes of the studies.
The choice of zero for the average effect of the unpublished studies is arbitrary, almost certainly biased.
Allowing for unpublished negative studies substantially reduces failsafe N.
26. Deficiencies of Failsafe N
Estimates of failsafe N are not influenced by evidence of bias in the data.
Guesswork is required to estimate the number of unpublished studies in the area.
Heterogeneity among the studies is ignored.
The method is not influenced by the shape of the funnel plot.
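The sensitivity to the assumed average effect of unpublished studies can be shown with a minimal Stouffer-style fail-safe N sketch (the z scores below are hypothetical, not data from either meta-analysis):

```python
import math

# Stouffer combined z after adding n_unpub studies with mean z = z_unpub.
def combined_z(z_scores, n_unpub, z_unpub):
    return (sum(z_scores) + n_unpub * z_unpub) / math.sqrt(len(z_scores) + n_unpub)

# Smallest number of unpublished studies that drops the combined z
# below the one-tailed .05 criterion (z_alpha = 1.645).
def fail_safe_n(z_scores, z_alpha=1.645, z_unpub=0.0):
    n = 0
    while combined_z(z_scores, n, z_unpub) >= z_alpha:
        n += 1
    return n

zs = [2.0] * 13  # hypothetical: 13 published trials, each z = 2.0
print(fail_safe_n(zs))                # classic fail-safe N: 237
print(fail_safe_n(zs, z_unpub=-0.5))  # mildly negative unpublished studies: 31
```

Merely allowing the file drawer to contain mildly negative studies, instead of assuming they average exactly zero, collapses the fail-safe N from 237 to 31. The impressive-sounding number is an artifact of an arbitrary assumption.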
27. Are Small, Unpowered Studies
Good for Anything?
Leon, Andrew C., Lori L. Davis, and Helena C. Kraemer. "The role and interpretation of pilot studies in clinical research." Journal of Psychiatric Research 45:5 (2011): 626-629.
28. A pilot study is not a
hypothesis testing study.
Efficacy and effectiveness are
not evaluated in a pilot.
29. A pilot study does not provide a
meaningful effect size estimate for
planning subsequent studies due to
the imprecision inherent in data from
small samples.
Feasibility results do not necessarily
generalize beyond the inclusion and
exclusion criteria of the pilot design.
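The imprecision is easy to quantify (illustrative numbers added here, not from Leon et al.): the approximate 95% confidence interval for Cohen's d from a 15-per-arm pilot runs from clearly negative to very large.

```python
import math

# Approximate 95% CI for Cohen's d using the common large-sample
# standard error for two independent groups of sizes n1 and n2.
def d_ci(d, n1, n2, z=1.96):
    se = math.sqrt((n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2)))
    return d - z * se, d + z * se

# Hypothetical pilot: observed d = 0.5 with 15 patients per arm.
lo, hi = d_ci(0.5, 15, 15)
print(round(lo, 2), round(hi, 2))
```

The interval is roughly (-0.23, 1.23): consistent with a harmful effect, no effect, or an enormous effect, which is why a pilot's observed d cannot anchor the power calculation for the definitive trial.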
Editor's Notes
Forest plot of effect sizes (g) for studies measuring pain severity (k = 38).
Forest plot of effect sizes (Hedges’ g, designated g in the figure) for trials included in the meta-analysis (58–62,72–75). The corresponding 95% CI (designated “Lower” and “Upper” and indicated graphically by whisker bars) are also given. Effect sizes for the trials containing two intervention groups are displayed separately (59,62). CBT = cognitive behavioral therapy; CI = confidence interval; D = desipramine; P = paroxetine; SS = social support.