2. CONTENTS
Introduction
ERROR
Types of error
Random error
Type I & Type II error
Systematic error
Bias
Types of bias
Confounding
What to look for in observational studies?
5. ERROR
Error is the difference between the unknown true value of an effect
measure and the value of that measure observed in the study.
TYPES OF ERROR:
Random error/Non-differential: use of invalid outcome
measure that equally misclassifies cases and controls
Systematic error/Differential: use of an invalid measure
that misclassifies cases in one direction and misclassifies
controls in another
6. RANDOM ERROR
[Figure: Y plotted against X, comparing data with random error and without random error.]
Random error doesn’t affect the average, only the variability
around the average
7. SYSTEMATIC ERROR
[Figure: Y plotted against X, comparing data with systematic error and without systematic error.]
Systematic error does affect the average; this shift is called bias
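The contrast between the two kinds of error can be sketched with a short simulation. This is a minimal illustration, not part of the original slides: the function name, the true value of 10, the noise level and the offset are all assumed for the example.

```python
import random
import statistics

def simulate(true_value, n, noise_sd=0.0, offset=0.0, seed=0):
    """Return n simulated measurements of true_value.

    noise_sd adds random error (zero-mean Gaussian noise);
    offset adds systematic error (the same shift on every measurement).
    """
    rng = random.Random(seed)
    return [true_value + offset + rng.gauss(0.0, noise_sd) for _ in range(n)]

true_value = 10.0  # hypothetical true value
random_only = simulate(true_value, n=10_000, noise_sd=2.0)
systematic_only = simulate(true_value, n=10_000, offset=3.0)

# Random error: the mean stays close to 10, the spread is close to 2.
print(statistics.mean(random_only), statistics.stdev(random_only))
# Systematic error: the mean moves to 13, with no added spread.
print(statistics.mean(systematic_only), statistics.stdev(systematic_only))
```

With these assumed settings, random error widens the scatter around the true value, while systematic error moves every measurement by the same amount.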
9. What can be wrong in the study?
RANDOM ERROR (= CHANCE)
Results in low precision of the epidemiological measure:
the measure is not precise, but true
1. Imprecise measuring
2. Too small groups
Decreases with increasing group size and with repeating the test.
Can be quantified by the confidence interval
SYSTEMATIC ERROR (= BIAS)
Results in low validity (internal and external) of the
epidemiological measure: the measure is not true
1. Selection bias
2. Information bias
3. Confounding
Does not decrease with
increasing sample size or
15. RANDOM ERROR
REALITY: treatments not different
DECISION: conclude treatments not different: correct decision
DECISION: conclude treatments are different: Type I error (probability = α)
REALITY: treatments are different
DECISION: conclude treatments not different: Type II error (probability = β)
DECISION: conclude treatments are different: correct decision (probability = 1 − β), the power of the study
17. REDUCING RANDOM ERROR
Reducing the Risk of Type I Errors:
Use a lower significance level (α below the conventional 0.05)
Repeat the study
Reducing the Risk of Type II Errors:
Providing adequate sample size, and
Hypothesizing large differences (larger effects are easier to detect)
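How sample size, effect size and α trade off against power (1 − β) can be sketched with a normal approximation. This is an illustrative calculation under assumed conditions (two equal groups, a two-sided z-test, known common standard deviation). The function name and the numbers are hypothetical and do not come from the slides.

```python
from statistics import NormalDist

def power_two_means(delta, sd, n_per_group, alpha=0.05):
    """Approximate power of a two-sided z-test comparing two group means.

    delta: true difference in means; sd: common standard deviation;
    n_per_group: subjects in each group; alpha: Type I error rate.
    """
    z = NormalDist()
    se = sd * (2.0 / n_per_group) ** 0.5       # standard error of the difference
    z_crit = z.inv_cdf(1.0 - alpha / 2.0)      # critical value for the chosen alpha
    shift = abs(delta) / se                    # true difference in standard-error units
    # Probability that the test statistic lands in either rejection region.
    return (1.0 - z.cdf(z_crit - shift)) + z.cdf(-z_crit - shift)

# Hypothetical scenario: true difference of 5 units, standard deviation 10.
for n in (20, 50, 100):
    print(n, round(power_two_means(delta=5.0, sd=10.0, n_per_group=n), 3))
```

Under these assumptions power rises with group size and with the hypothesized difference, and falls when α is lowered. Reducing Type I risk therefore costs power unless the sample grows.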
19. BIAS
DEFINITION:
Any systematic error in the design,
conduct or analysis of a study that results
in a mistaken estimate of an exposure’s
effect on the risk of disease.
20. DIRECTION OF BIAS
Positive bias – observed effect is higher than the true value
(causal effect)
Negative bias – observed effect is lower than the true
value (causal effect)
A BETTER APPROACH IS:
Bias towards the null – observed value is closer to 1.0
than is the true value (causal effect)*
Bias away from the null – observed value is farther from
1.0 than is the true value (causal effect)*
*Note: 1 is the null value for ratio measures (e.g. OR, RR)
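The "towards/away from the null" wording can be made concrete for ratio measures by comparing distances from 1.0 on the log scale. The helper below is a hypothetical sketch, and the example values are invented. In practice the true value is unknown, so this only illustrates the terminology.

```python
import math

def bias_direction(observed, true_value):
    """Classify the bias of a ratio measure (OR, RR) relative to the null value 1.0."""
    log_obs = math.log(observed)
    log_true = math.log(true_value)
    if math.isclose(log_obs, log_true, abs_tol=1e-12):
        return "unbiased"
    if log_obs * log_true < 0:
        # Observed and true values lie on opposite sides of 1.0.
        return "crosses the null"
    if abs(log_obs) < abs(log_true):
        return "toward the null"
    return "away from the null"

print(bias_direction(observed=1.5, true_value=2.0))  # toward the null
print(bias_direction(observed=0.4, true_value=0.5))  # away from the null
```

The second call shows why this wording is preferred: an observed 0.4 against a true 0.5 would be a "negative bias", yet it exaggerates a protective effect.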
21. CLASSIFICATION ACCORDING TO
STAGES OF RESEARCH
Bias is a result of an error anywhere in the
study
Literature Review
Study Design
Study Execution
Data Collection
Analysis
Interpretation of Results
Publication
24. SELECTION BIAS
If the way in which cases and controls, or exposed and non-exposed
individuals, were selected is such that an apparent
association is observed—even if, in reality, exposure and
disease are not associated—the apparent association is the
result of selection bias.
Results from:
Self selection (volunteering)
Nonresponse (refusal)
Loss to follow-up (attrition, migration)
Selective survival
Health care utilization patterns
Systematic errors in detection and diagnosis of health conditions
Choice of an inappropriate comparison group (investigator
25. SELECTION BIAS
SELF-SELECTION BIAS
PUBLICITY BIAS:
People referring themselves to investigators following publicity
about the study.
Considered a threat to validity.
For example: in a study of leukemia among troops present at the
Smoky Atomic Test in Nevada, 18% of participants contacted
the investigators after publicity, and leukemia may have been
over-represented among these people (they had an axe to grind)
HEALTHY WORKER EFFECT:
Occurs before subjects are identified for the study
Relatively healthy people become or remain workers
26. SELECTION BIAS
DIAGNOSTIC BIAS/WORK-UP BIAS:
Occurs before the subjects are identified for study
Diagnosis may be influenced by physician’s knowledge of
exposure
For example: in a case-control study of the relationship between
deep vein thrombosis (DVT) and oral contraceptive pills (OCPs),
general practitioners knew about the possible link between the two.
This could lead to over-estimation of the effect of OCPs on DVT
HOSPITAL ADMISSION OR BERKSON’S BIAS:
Occurs when the combination of exposure and disease under
study increases the risk of hospital admission, thus leading to a
higher exposure rate among the hospital cases than the
27. SELECTION BIAS
PREVALENCE-INCIDENCE BIAS:
When prevalent cases are used to study exposure-disease
relationships
Related to two phenomena:
Once a person is diagnosed with a disease, they may
change the habit that contributed to the disease.
Prevalent cases represent survivors of the condition
being studied; as survivors may be atypical with
respect to exposure status, they may misrepresent
effects (selective survival/Neyman’s bias).
28. SELECTION BIAS
EXCLUSION BIAS:
Occurs if the exclusion criteria are different for cases and
controls, or different for the exposed and non-exposed
For example: in a hospital-based case-control study of the association
between breast cancer and reserpine, women who had
medical conditions that would lead to the prescribed use
of reserpine were excluded from the control group,
leading to overestimation of the association between
breast cancer and reserpine
29. SELECTION BIAS
In CASE-CONTROL STUDIES: potential bias due to poor
choice of controls
Example 1
Cases: colorectal cancer patients admitted to hospital
Control selection: patients admitted to hospital with arthritis
Non-representativeness: controls probably have high degrees of exposure to NSAIDs
Selection bias: would spuriously reduce the estimate of effect
Example 2
Cases: colorectal cancer patients admitted to hospital
Control selection: patients admitted to hospital with peptic ulcers
Non-representativeness: controls probably have low degrees of exposure to NSAIDs
Selection bias: would spuriously increase the estimate of effect
In COHORT STUDIES: differential loss to follow-up (differential attrition)
Subjects in a follow-up study of multiple sclerosis may differentially drop out
due to disease severity
30. SELECTION BIAS
NON-RESPONSE BIAS:
In a prevalence study of asthma, chronic bronchitis, and
respiratory symptoms, the characteristics of non-responders
and the reasons for non-response were studied.
Data were obtained by a mailed questionnaire.
Non-responders were contacted by telephone and interviewed
using the same questionnaire.
Found a significantly higher proportion of current smokers and
manual labourers among the non-responders than among the
responders.
Prevalence rates of wheezing, chronic cough, sputum
production, attacks of breathlessness, and asthma and use of
asthma medications were significantly higher among the non-responders
than among the responders.
Ronmark et al,
31. CONTROLLING SELECTION BIAS
Develop an explicit (objective) case definition.
Enroll all cases in a defined time and region.
Strive for high participation rates.
Take precautions to ensure representativeness.
AMONG CASES:
Ensure that all medical facilities are thoroughly canvassed.
Develop an effective system for case ascertainment.
AMONG CONTROLS:
Compare the prevalence of the exposure with other sources
to evaluate credibility.
Attempt to draw controls from a variety of sources.
33. INFORMATION BIAS
Information bias can occur when the means of obtaining information
about the subjects in the study are inadequate, so that some of the
information gathered regarding exposures and/or disease
outcome is incorrect.
Some sources of information bias are:
Subject variation
Observer variation
Deficiency of tools
Technical errors in measurement
34. INFORMATION BIAS
MISCLASSIFICATION BIAS:
Due to inaccuracies in methods of data acquisition, the
subjects, at times, may be misclassified.
For example, in a case-control study, cases may be misclassified as
controls, and vice versa, because of
the limited sensitivity and specificity of the diagnostic tests or
the inadequacy of information derived from medical or other
records.
A person’s exposure status may also be misclassified
35. INFORMATION BIAS
MISCLASSIFICATION BIAS:
Two forms:
Differential: If misclassification of exposure (or disease) is related
to disease (or exposure)
Women who had a baby with a malformation tend to remember
more mild infections that occurred during their pregnancies than
mothers of normal infants.
Non-differential: If misclassification of exposure (or disease) is
unrelated to disease (or exposure)
By mistake, some diseased persons are included in the control
group and some non-diseased persons in the case
group (misclassified with regard to diagnosis).
As a result, a smaller difference in exposure will be found
between our cases and our controls than actually exists between
36. TYPES OF INFORMATION BIAS
Recall bias
Reporting bias
Bias in abstracting records
Bias in interviewing
Bias from surrogate interviews
Surveillance bias
37. INFORMATION BIAS
Recall bias:
Those with the disease have a greater sensitivity for recalling
exposure (and possibly reduced specificity)
Especially important in case-control studies, when
exposure history is obtained retrospectively:
Cases may scrutinize their past history more closely, looking for ways
to explain their illness
Controls, not feeling a burden of disease, may examine
their past history less closely
Those who develop a cold are more likely to identify the
exposure than those who do not (differential misclassification)
Case: Yes, I was sneezed on
Control: No, can’t remember any sneezing
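The effect of non-differential versus differential misclassification of exposure on the odds ratio can be worked through with expected counts. The 2x2 counts, sensitivities and specificities below are hypothetical, chosen only to illustrate the two forms; they are not data from any study cited here.

```python
def misclassify(exposed, unexposed, sensitivity, specificity):
    """Expected observed (exposed, unexposed) counts after exposure misclassification."""
    obs_exposed = sensitivity * exposed + (1.0 - specificity) * unexposed
    obs_unexposed = (1.0 - sensitivity) * exposed + specificity * unexposed
    return obs_exposed, obs_unexposed

def odds_ratio(case_exp, case_unexp, ctrl_exp, ctrl_unexp):
    """Exposure odds ratio for a case-control 2x2 table."""
    return (case_exp * ctrl_unexp) / (case_unexp * ctrl_exp)

# Hypothetical true exposure counts: (exposed, unexposed).
cases, controls = (60, 40), (30, 70)
true_or = odds_ratio(*cases, *controls)

# Non-differential: same sensitivity and specificity in cases and controls.
nondiff_or = odds_ratio(*misclassify(*cases, 0.8, 0.9),
                        *misclassify(*controls, 0.8, 0.9))

# Differential (recall-bias pattern): cases recall exposure better than controls.
diff_or = odds_ratio(*misclassify(*cases, 0.95, 1.0),
                     *misclassify(*controls, 0.70, 1.0))

print(round(true_or, 2), round(nondiff_or, 2), round(diff_or, 2))
```

With these assumed numbers the non-differential error pulls the odds ratio from 3.5 down to about 2.4, towards the null. The recall-bias pattern pushes it up to about 5.0. Differential misclassification can bias in either direction, depending on who is misclassified.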
38. INFORMATION BIAS
Reporting bias:
Individuals with severe disease tend to have more complete records,
and therefore more complete information about exposures, so a greater
association is found
Individuals who are aware of being participants in a study behave
differently (Hawthorne effect)
Wish bias:
Bias introduced by subjects who have developed a disease and
who in attempting to answer the question “Why me?” seek to show,
often unintentionally, that the disease is not their fault.
May deny certain exposures related to lifestyle (such as smoking or
drinking); if contemplating litigation, may overemphasize
workplace-related exposures.
Can be considered one type of reporting bias.
39. INFORMATION BIAS
Surveillance bias:
If a population is monitored over a period of time, disease
ascertainment may be better in the monitored population than
in the general population
Leads to an erroneous estimate of the relative risk or odds
ratio
Surrogate interviews:
Obtaining information from a person other than the subject,
e.g., in the case of diseases with a high case-fatality rate
40. CONTROLLING INFORMATION BIAS
Blinding
prevents investigators and interviewers from knowing case/control or
exposed/non-exposed status of a given participant
Form of survey
mail may impose less “white coat tension” than a phone or face-to-face
interview
Questionnaire
use multiple questions that ask same information
acts as a built in double-check
Accuracy
multiple checks in medical records
gathering diagnosis data from multiple sources
41. PUBLICATION BIAS OR NON-PUBLICATION BIAS
Occurs because of the influence of study results
on the chance of publication.
Studies with positive results are more likely to be
published than studies with negative results.
May result in a preponderance of false-positive
results in the literature.
Bias is compounded when published studies are
subjected to meta-analysis.
43. CONFOUNDING
“a confusion of effects”
Defined as:
a situation in which the measure of effect of
exposure on disease is distorted because of the
association of the study factor with other factors that
influence the outcome. These other factors are
called confounders.
44. CONFOUNDER
In a study of whether factor A is a cause of disease
B, a third factor, factor X, is a confounder if the
following are true:
1. Factor X is a known risk factor for disease B.
2. Factor X is associated with factor A, but is not a
result of factor A.
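45. EXAMPLE OF CONFOUNDING
[Figure: two panels, "Causal" and "Confounding", relating coffee drinking, smoking and pancreatic cancer. Both panels show an observed association between coffee drinking and pancreatic cancer; smoking appears only in the confounding panel.]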
46. EXAMPLE OF CONFOUNDING
[Figure: Cases of Down Syndrome by Birth Order. X axis: birth order (1 to 5); Y axis: cases per 100 000 live births.]
47. EXAMPLE OF CONFOUNDING
[Figure: Cases of Down Syndrome by Age Groups. X axis: age groups (< 20, 20-24, 25-29, 30-34, 35-39, 40+); Y axis: cases per 100 000 live births.]
48. EXAMPLE OF CONFOUNDING
[Diagram relating birth order, Down syndrome and maternal age]
Maternal age is correlated with birth
order and is a risk factor even if birth
order is low
49. EXAMPLE OF CONFOUNDING
[Diagram relating maternal age, Down syndrome and birth order]
Birth order is correlated with maternal
age but is not a risk factor in younger
mothers
50. CONFOUNDING
[Figure: Cases of Down Syndrome by Birth Order and Maternal Age. Axes: birth order (1 to 5), age groups (< 20 to 40+), cases per 100 000.]
If each case is matched with a same-age control, there will be
no association. If analysis is repeated after stratification by
age, there will be no association with birth order.
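Stratification can be illustrated with a Mantel-Haenszel summary odds ratio. The counts below are hypothetical. They are constructed so that high birth order is more common among older mothers and risk depends only on age; they are not the data behind the figures.

```python
def table_or(a, b, c, d):
    """Odds ratio for a 2x2 table: a, b = exposed with/without disease;
    c, d = unexposed with/without disease."""
    return (a * d) / (b * c)

def mantel_haenszel_or(strata):
    """Mantel-Haenszel summary odds ratio across strata of (a, b, c, d) tables."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator

# Hypothetical strata; exposure = high birth order, disease = Down syndrome.
younger_mothers = (1, 1000, 9, 9000)
older_mothers = (90, 9000, 10, 1000)
strata = [younger_mothers, older_mothers]

# Crude table: add the strata together cell by cell, ignoring age.
crude = tuple(sum(cells) for cells in zip(*strata))

print(round(table_or(*crude), 2))                # crude OR, about 4.79
print([table_or(*s) for s in strata])            # stratum-specific ORs: [1.0, 1.0]
print(round(mantel_haenszel_or(strata), 2))      # age-adjusted OR: 1.0
```

In this constructed example the crude association with birth order disappears within each age stratum, which is the pattern the slide describes.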
51. CONTROL OF CONFOUNDING
Control at the design stage
Randomization: assign subjects to study groups at random, to attempt
to even out unknown confounders
Restriction: restrict enrollment according to potential
confounders (i.e. include only subjects at one level of the
confounder)
Matching: match subjects on the potential confounder, thus
assuring an even distribution among study groups
52. CONTROL OF CONFOUNDING
Control at the analysis stage
Conventional approaches
Stratified analyses
Multivariate analyses
Newer approaches
Graphical approaches using directed acyclic
graphs (DAGs)
Propensity scores
Instrumental variables
Marginal structural models
54. What to look for in observational studies?
Is selection bias present?
In a cohort study, are participants in the exposed and
unexposed groups similar in all important respects except
for the exposure?
In a case-control study, are cases and controls similar in all
important respects except for the disease in question?
Is information bias present?
In a cohort study, is information about outcome obtained in
the same way for those exposed and unexposed?
In a case-control study, is information about exposure
gathered in the same way for cases and controls?
55. What to look for in observational studies?
Is confounding present?
Could the results be accounted for by the presence of a
factor (e.g., age, smoking, diet) associated with both
the exposure and the outcome but not directly involved
in the causal pathway?
If the results cannot be explained by these three
biases, could they be the result of chance?
What are the relative risk or odds ratio and the
95% confidence interval?
Is the difference statistically significant, and, if not, did
the study have adequate power to find a clinically
important difference?
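The relative risk and its 95% confidence interval can be computed from a cohort 2x2 table with the usual log-scale approximation. The counts below are hypothetical, and the function name is illustrative only.

```python
import math

def risk_ratio_ci(a, n1, c, n0, z=1.96):
    """Risk ratio with an approximate 95% CI (log method).

    a: cases among n1 exposed subjects; c: cases among n0 unexposed subjects.
    """
    rr = (a / n1) / (c / n0)
    se_log_rr = math.sqrt(1 / a - 1 / n1 + 1 / c - 1 / n0)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper

# Hypothetical cohort: 30 of 100 exposed and 15 of 100 unexposed develop disease.
rr, lower, upper = risk_ratio_ci(a=30, n1=100, c=15, n0=100)
print(f"RR = {rr:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
print("CI excludes the null value 1.0:", lower > 1.0 or upper < 1.0)
```

For these assumed counts the relative risk is 2.0 with an interval of roughly 1.15 to 3.48. Chance alone is an unlikely explanation, but the interval says nothing about bias or confounding.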
56. What to look for in observational studies?
If the results still cannot be explained, then
(and only then) might the findings be real and
worthy of note?
57. IDEAL GROUP COMPARISON MODEL
[Figure: Factors affecting the Dependent Variable. Stacked bars for the control group and the experimental group, showing the size of the effect. Components: independent variable; confounder(s) - others; confounder: placebo effect; confounder: Hawthorne effect; natural history.]
68. PUBLICATION
All's well literature bias
Positive result bias
Hot topic bias
70. LEAD TIME BIAS
Overestimation of survival duration among screen-detected
cases when survival is measured from
diagnosis.
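A small arithmetic sketch shows how lead time inflates survival measured from diagnosis. The ages are hypothetical, and the sketch assumes screening changes only the date of diagnosis, not the date of death.

```python
def survival_from_diagnosis(age_at_diagnosis, age_at_death):
    """Survival time in years, measured from the age at diagnosis."""
    return age_at_death - age_at_diagnosis

# Hypothetical patient: screening moves diagnosis earlier, death is unchanged.
age_at_death = 70
clinical_diagnosis_age = 67   # diagnosed from symptoms, without screening
screen_diagnosis_age = 63     # diagnosed earlier, by screening

without_screening = survival_from_diagnosis(clinical_diagnosis_age, age_at_death)
with_screening = survival_from_diagnosis(screen_diagnosis_age, age_at_death)
lead_time = clinical_diagnosis_age - screen_diagnosis_age

print(without_screening, with_screening, lead_time)  # 3 7 4
```

Measured survival rises from 3 to 7 years although the patient dies at the same age. The extra 4 years are the lead time.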
72. LENGTH TIME BIAS
Overestimation of survival duration among screen-detected
cases due to the relative excess of slowly
progressing cases.
These are disproportionally identified by screening
because the probability of detection is directly
proportional to the length of time during which they
are detectable.
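The proportionality described above can be sketched as size-biased sampling. The durations are hypothetical, and the sketch assumes the chance of detection is exactly proportional to the detectable duration.

```python
def mean_duration(durations):
    """Mean detectable duration over all cases."""
    return sum(durations) / len(durations)

def screen_detected_mean_duration(durations):
    """Expected mean duration among screen-detected cases when each case is
    detected with probability proportional to its detectable duration."""
    return sum(d * d for d in durations) / sum(durations)

# Hypothetical population: half fast-progressing (1 year detectable),
# half slow-progressing (5 years detectable).
durations = [1, 1, 1, 5, 5, 5]

print(mean_duration(durations))                            # 3.0
print(round(screen_detected_mean_duration(durations), 2))  # 4.33
```

Slow-progressing cases make up half of this population but five sixths of the expected screen-detected cases. The screen-detected group therefore looks as if it survives longer, even with no treatment benefit.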
74. OVERDIAGNOSIS BIAS
Screening may identify abnormalities that would
never cause a problem in a person's lifetime. For
example, in prostate cancer screening, it has been
said that "more men die with prostate cancer than of
it".
Overdiagnosis bias occurs when people with such
harmless abnormalities are counted as "lives saved"
by the screening, rather than as "healthy people
needlessly harmed by overdiagnosis".
Leads to unnecessary treatment.
75. Potential Role of Chance in Affecting the Effect:
Meaning of Statistical Significance
[Figure: Factors affecting the Dependent Variable. Stacked bars for the control group and the experimental group, showing the size of the effect, with "p<" and "p>" annotations. Components: independent variable; confounder(s) - others; confounder: placebo effect; confounder: Hawthorne effect; natural history.]
76. Flawed Model
Control group receives the independent variable.
[Figure: Factors affecting the Dependent Variable. Stacked bars for the control group and the experimental group, showing the size of the effect. Components: independent variable; confounder(s) - others; confounder: placebo effect; confounder: Hawthorne effect; natural history.]
77. Flawed Model
Unbalanced confounding variables
[Figure: Factors affecting the Dependent Variable (credit: S. Wetstone). Stacked bars for the control group and the experimental group, showing the size of the effect. Components: independent variable; confounder(s) - others; confounder: placebo effect; confounder: Hawthorne effect; natural history.]