Conference presentation at ISCB 41 in the session "Biostatistical inference in practice: moving beyond false dichotomies"
A comment in Nature, signed by over 800 researchers, called for the scientific community to "retire statistical significance". The responses included a call to halt the use of the term "statistically significant" and changes to journals' author guidelines. The prevailing view among statisticians is that inadequate statistical training of clinical researchers and publishing practices are to blame for the misuse of statistical testing. In this presentation, we search our collective conscience: we review ethical guidelines for statisticians in light of the p-value crisis, examine what they imply for us when conducting analyses in collaborative work and when teaching, and ask whether the ATOM principles (accept uncertainty; be thoughtful, open and modest) can guide us.
Dichotomania and other challenges for the collaborating biostatistician
1. Dichotomania
and other challenges for the collaborating biostatistician
A perspective on principles, responsibilities and potential solutions
Laure Wynants PhD
laure.wynants@maastrichtuniversity.nl
@laure_wynants
5. Goodman S. A dirty dozen: twelve p-value misconceptions. Semin Hematol. 2008 Jul;45(3):135-40.
6.
Estimated probability of causal attribution according to the null P-value, modeled using fractional polynomials with a cutpoint at P = 0.05.
A Psychometric Experiment in Causal Inference to Estimate Evidential Weights Used by Epidemiologists
Holman CDJ, Arnold-Reed DE, de Klerk N, McComb C, English DR. Epidemiology. 2001;12(2):246-255.
https://xkcd.com/
9. A conversation between a researcher and a statistician
• R: “We need some statistical testing for these plots.”
• S: “Why? These are not your main research questions in this paper.”
• R: “I am not fishing for significant findings. I am aware of the dangers. These are hypotheses we investigated in earlier work. If the tests are significant, we know this is confirmed in our new data.”
• R: “If it is not significant, we will discuss further. We just didn’t have enough power then.”
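The researcher's rule (significant means confirmed, non-significant means underpowered) can never disconfirm a hypothesis. A minimal sketch, with invented sample sizes and a true effect of exactly zero, shows how often a "confirmation" appears anyway:

```python
# Sketch with hypothetical numbers: two-sample comparisons with NO true
# effect, applying the rule "significant = confirmed".
import math
import random
import statistics

random.seed(1)
n_sims, n_per_arm = 2000, 30
confirmed = 0
for _ in range(n_sims):
    a = [random.gauss(0, 1) for _ in range(n_per_arm)]
    b = [random.gauss(0, 1) for _ in range(n_per_arm)]
    # Welch-style t statistic; |t| > 2 approximates two-sided p < 0.05
    # at these sample sizes
    se = math.sqrt(statistics.variance(a) / n_per_arm
                   + statistics.variance(b) / n_per_arm)
    t = (statistics.mean(a) - statistics.mean(b)) / se
    if abs(t) > 2.0:
        confirmed += 1
print(f"'Confirmed' despite a true effect of zero: {confirmed / n_sims:.1%}")
```

Roughly one in twenty null comparisons comes out "significant", so the rule manufactures confirmations at the type I error rate.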
10. Reporting and publication bias
“Trim and fill” funnel plot of Ki-67 expression for overall survival in ovarian cancer patients (Qiu et al. Arch Gynecol Obstet 2019). [Figure legend distinguishes missing studies from published studies.]
11. Science as a disorderly mass of stray observations, inconclusive results and fledgling explanations. And yet, as soon as their hypotheses were turned into peer-reviewed papers, researchers claimed that such facts had always spoken for themselves.
13. Replication crisis
Ioannidis JAMA 2005: all original research published in 3 major journals in 1990-2003 and cited >1000 times: 49 studies, of which 45 claimed that the intervention was effective. Of these:
- 32% could not be replicated: 16% were contradicted (no effect found) and 16% had estimated effects that were too strong (to the point that subsequent studies cast doubt on the effect being clinically important)
- 44% were replicated
- 24% had no subsequent larger or better-designed replication studies
Problems were worst for small RCTs and non-randomized studies.
Begley & Ellis Nature 2012: replication of landmark preclinical cancer studies: only 11% could be reproduced.
Journal impact factor | Number of articles | Mean citations, non-reproduced articles | Mean citations, reproduced articles
>20                   | 21                 | 248 (range 3–800)                       | 231 (range 82–519)
5–19                  | 32                 | 169 (range 6–1,909)                     | 13 (range 3–24)
14. • SARS-CoV-2: “viral loads in the very young do not differ significantly from those of adults. Based on these results, we have to caution against an unlimited re-opening of schools and kindergartens in the present situation”
• Ill-defined research question; comparisons between all age groups (45 comparisons), tested as if the categories were unordered.
• Reanalysis with more appropriate techniques reaches the opposite conclusion.
• https://medium.com/@d_spiegel/is-sars-cov-2-viral-load-lower-in-young-children-than-adults-8b4116d28353
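The multiplicity problem can be made concrete. Assuming the 45 tests were independent and each run at α = 0.05 (a simplification of the actual analysis), the chance of at least one spuriously "significant" difference under the null is already about 90%:

```python
# Sketch: familywise error rate for 45 comparisons at alpha = 0.05,
# assuming independent tests under the null (a simplification).
alpha = 0.05
n_tests = 45  # e.g. all pairwise comparisons between 10 groups: 10*9/2
fwer = 1 - (1 - alpha) ** n_tests
print(f"P(at least one 'significant' result under the null): {fwer:.2f}")
```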
15. A mistake in the operating room can threaten the life of one patient; a mistake in the statistical analysis or interpretation can lead to hundreds of early deaths.
Andrew Vickers, Biostatistician, Memorial Sloan Kettering Cancer Center
16. Some reactions to previous presentations
MDs
- surprise that meta-analysis can be biased
- “Our statisticians did not tell us this”
17. Altman BMJ 1994, republished almost unchanged 20 years later…
“Put simply, much poor research arises because researchers feel compelled for career reasons to carry out research that they are ill equipped to perform, and nobody stops them.”
18. Statisticians
- “Oh no not this again”
- “We know this already”
- “P-values are not the problem”
- “It’s not us, it’s them”
23. An ethical statistician…
- identifies and mitigates any preferences on the part of the investigators or data providers that might predetermine or influence the analyses/results
- supports only studies that have pre-defined objectives and that are capable of producing useful results
- strives to explain any expected adverse consequences of failure to follow through on an agreed-upon sampling or analytic plan
- shall indicate the risks and possible consequences if their professional judgement is overruled
- clearly distinguishes views or opinions based on general knowledge or belief from views or opinions derived from the statistical analyses being reported
Taken from the RSS code and the ASA ethical guidelines
24. An ethical statistician…
- recognizes […] research practices and standards can differ across disciplines, and statisticians do not have obligations to standards of other professions that conflict with these guidelines
- shall take personal responsibility for work bearing their name
- avoids compromising scientific validity for expediency
- should always be aware of their overriding responsibility to the public good […] A Fellow’s obligations to employers, clients and the profession can never override this; and Fellows should seek to avoid situations and not enter into undertakings which compromise this responsibility
Taken from the RSS code and the ASA ethical guidelines
25. An ethical statistician…
- conveys the findings in ways that are both honest and meaningful to the user/reader
- shall seek to conform to recognised good practice, including quality standards which are in their judgement relevant, and shall encourage others to do likewise
- shall seek to advance knowledge and understanding of statistical science and advocate its use. This advocacy of statistical science should extend to employers, clients, colleagues and the general public
Taken from the RSS code and the ASA ethical guidelines
27. What can we do better?
ATOM
- Accept uncertainty (no more ***; interpret confidence intervals)
- Be Thoughtful (research question, design, clinically relevant effect size, registered reports)
- Be Open (conflicts of interest, registration; share data, code and analysis protocols; publish all results)
- Be Modest (label exploratory, retrospective and secondary analyses as such (no HARKing); interpret studies in their broader context)
Wasserstein RL, Schirm AL, Lazar NA. Moving to a World Beyond “p < 0.05”. Am Stat. 2019;73(sup1):1-19.
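"Accept uncertainty" is easiest to see with confidence intervals side by side. A sketch with invented numbers: two studies with nearly identical estimates fall on opposite sides of p = 0.05, yet their 95% CIs tell essentially the same story (normal approximation assumed):

```python
# Sketch: interpreting intervals instead of significance stars.
# All estimates and standard errors below are invented for illustration.
import math

def summary(est, se, z_crit=1.96):
    """95% CI and two-sided p-value under a normal approximation."""
    lo, hi = est - z_crit * se, est + z_crit * se
    p = 1 - math.erf(abs(est / se) / math.sqrt(2))  # two-sided p
    return lo, hi, p

for label, est, se in [("Study A", 0.40, 0.20), ("Study B", 0.38, 0.20)]:
    lo, hi, p = summary(est, se)
    print(f"{label}: estimate {est:.2f}, "
          f"95% CI [{lo:.2f}, {hi:.2f}], p = {p:.3f}")
```

Study A crosses the p < 0.05 line and Study B does not, yet the intervals are nearly identical; declaring one a "finding" and the other a "null result" is the dichotomania the slide warns against.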
28. What can we do better?
Distinguish between different applications, e.g. Cox 2020:
- Two-decision situation (health screening: return tomorrow vs next year; control the error rate)
- Subject-matter hypothesis (difference between treatments, H0: there is no difference; p-value as a measure of uncertainty)
- Dividing hypothesis (at which confidence level does the CI contain only positive/negative effects?)
- Tests of model adequacy (e.g. a normality assumption; informal role, judgement required)
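The dividing-hypothesis question can be phrased computationally: for a normally distributed estimate, the widest two-sided CI that still excludes zero has confidence level 1 - p. A sketch with an invented estimate and standard error:

```python
# Sketch of the "dividing hypothesis" reading: at which confidence level
# does the interval contain only positive effects? For a normal estimate
# this level is 1 - p (two-sided). Numbers below are invented.
import math

est, se = 0.30, 0.14                       # hypothetical estimate and SE
z = est / se                               # distance from zero in SE units
p = 1 - math.erf(abs(z) / math.sqrt(2))    # two-sided p under normality
level = 1 - p                              # the CI at this level touches zero
print(f"z = {z:.2f}; the {level:.1%} CI is the widest one excluding 0")
```

Any CI at a level below `level` contains only positive effects; any level above it lets the interval cross zero.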
29. What can we do better?
• “How to” teaching vs teaching for understanding; use simulated data (Bishop 2020)
• Be explicit about how principles extend to observational research
• Software
• Conceptual clarity in educational material (Greenland 2019):
“Significance level”: α or p-value?
“P-value”: the observed value p or the random variable P?
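Bishop's suggestion is to let students generate pure-noise data themselves and watch "significant" patterns appear. A sketch of such an exercise (assumed setup, not from the talk: 100 correlations of n = 20 noise points each; |r| > 0.444 corresponds to two-sided p < 0.05 at n = 20):

```python
# Sketch of a simulated-data teaching exercise: how often do pure-noise
# datasets produce a "significant" correlation? Setup is hypothetical.
import random

random.seed(42)

def pearson_r(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

n, trials, crit = 20, 100, 0.444   # |r| > 0.444 ~ p < 0.05 for n = 20
hits = 0
for _ in range(trials):
    x = [random.gauss(0, 1) for _ in range(n)]
    y = [random.gauss(0, 1) for _ in range(n)]
    if abs(pearson_r(x, y)) > crit:
        hits += 1
print(f"{hits} of {trials} pure-noise datasets look 'significant'")
```

Students typically see around five "discoveries" per hundred noise datasets, which makes the arbitrariness of the 0.05 cutoff tangible.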
30. It won’t be easy
• Misconception fatigue in teaching
31. It won’t be easy
Wang et al Annals of Internal Medicine 2018
33. No statistician can do this alone
• A responsibility for each of us
• A role for professional organizations
• A necessity to put this on the agenda of ISCB
• Thanks to John Carlin and Jonathan Sterne, even if there are no free croissants