Principles of epidemiology

Dr. Anshu Mittal
Professor
Department of Community Medicine
MM Institute of Medical Sciences and Research,
Mullana, Ambala

1. Introduction to Epidemiology

Definitions
Health: A state of complete physical, mental and social
well-being and not merely the absence of disease or
infirmity (WHO, 1948)
Disease: A physiological or psychological dysfunction
Illness: A subjective state of not being well
Sickness: A state of social dysfunction

Definitions…
Public health
The science & art of
Preventing disease,
prolonging life,
promoting health & efficiency
through organized community effort (Winslow, 1920)

Introduction
• The term epidemiology is derived from the Greek word
epidemic.
– Epi means among or upon,
– Demos means population or people, and
– Logos means scientific study.
• So
– it is the scientific study of disease patterns in human
populations.
– In a broad sense, it is the study of the effects of multiple factors on
human health.
– It is a multidisciplinary subject involving physicians,
biologists, public health experts, health educators, etc.

Definitions
• The science of infective diseases, their prime causes,
propagation and prevention. (Stallybrass 1931)
• The science of the mass phenomena of infectious
diseases or the natural history of infectious diseases.
(Frost 1927)

Definitions
• The study of disease, any disease, as a mass
phenomenon. (Greenwood 1935)
• The study of conditions known or reasonably supposed
to influence the prevalence of disease. (Lumsden 1936)
• The study of the distribution and
determinants of disease frequency in man. (MacMahon
and Pugh)

The widely accepted definition of
epidemiology is,
• "The study of the distribution and determinants
of health related states or events in specified
populations, and the application of this study to the
control of health problems"
(J.M. Last 1988)

Components of the definition
1.Study: Systematic collection, analysis and
interpretation of data
Epidemiology involves collection, analysis and
interpretation of health related data
Epidemiology is a science

Components…
2. Frequency: the number of times an event occurs
Epidemiology studies the number of times a disease
occurs
It answers the question How many?
Epidemiology is a quantitative science

Components…
3. Distribution: Distribution of an event by person,
place and time
Epidemiology studies distribution of diseases
It answers the question who, where and when?
Epidemiology describes health events

Components…
4. Determinants: Factors the presence/absence of
which affect the occurrence and level of an event
Epidemiology studies what determines health events
It answers the question how and why?
Epidemiology analyzes health events

Components…
5. Diseases & other health related events
Epidemiology is not only the study of diseases
The focus of Epidemiology is not only patients
It studies all health related conditions
Epidemiology is a broader science

Components…
6. Human population
Epidemiology diagnoses and treats
communities/populations
Clinical medicine diagnoses and treats patients
Epidemiology is a basic science of public health

Components…
7. Application
Epidemiological studies have direct and practical
applications for prevention of diseases &
promotion of health
Epidemiology is a science and practice
Epidemiology is an applied science

The ultimate aims of epidemiology can be summarized in
the following two points.
• To eliminate or reduce the health problem or its
consequences and
• To promote the health and wellbeing of society
as a whole.

History of epidemiology
• The history of epidemiology originates in the idea, traceable from
Hippocrates (400 BC) through John Graunt (1662), William
Farr, John Snow and others, that environmental factors can
influence the occurrence of disease, in contrast to the supernatural
view of disease.
• John Graunt analysed and published mortality data in
1662. He was the first to quantify patterns of death, birth and
disease occurrence.
• No one built upon Graunt's work until the 1800s, when William Farr
began to systematically collect and analyse Britain's mortality
statistics. Farr is considered the father of vital statistics and
disease classification.

• Meanwhile, John Snow was conducting a series of
investigations in London that later earned him the title of father of
field epidemiology. Snow conducted his classic study in 1854,
when an epidemic of cholera developed in the Golden Square area of
London. At a time when the microscope was still being developed, Snow
studied cholera outbreaks both to discover the
cause of the disease and to prevent its recurrence.
• At that time the two men (Farr and Snow) had a major
disagreement about the cause of cholera. Farr adhered to what
was called the miasmatic theory of disease. According to this
theory, which was commonly held at the time, disease was
transmitted by a miasma, or cloud, that clung low to the earth's
surface.

• However, Snow did not agree; he believed that cholera was
transmitted through contaminated water. He began his
investigation by determining where in the area persons
with cholera lived and worked. He then used this
information to map the distribution of cases. He marked
the locations of the water sources and looked for a
relationship between cases and water sources. He found
that cholera was transmitted through contaminated
water. This was a major achievement in
epidemiology.

• In the 1900s epidemiologists extended their methods to
noninfectious diseases and studied the effects of behaviour and
lifestyle on human health. Some important achievements in
epidemiology are:
– John Snow and the cholera epidemics in London in 1848-1854.
– The Framingham Heart Study, started in 1948 in Massachusetts, USA, and still
continuing, to identify the factors leading to the development of
coronary heart disease.
– Smoking and lung cancer by Doll and Hill in 1964.
– The Salk polio vaccine field trial in 1954 to study the protective efficacy of the
vaccine in a million school children.
– Methyl mercury poisoning in the 1950s in Minamata.

History…
Epidemiological thought emerged in 460 BC
Epidemiology flourished as a discipline in 1940s

Scope of Epidemiology
Originally, Epidemiology was concerned with
investigation & management of epidemics of
communicable diseases
Later, Epidemiology was extended to endemic
communicable diseases and non-communicable
infectious diseases
More recently, Epidemiology has been applied to all
diseases and other health related events

Purpose/use of Epidemiology
The ultimate purpose of Epidemiology is prevention
of diseases and promotion of health
How?
1. Elucidation of natural history of diseases
2.Description of health status of population
3. Establishing determinants of diseases
4. Evaluation of intervention effectiveness

Uses of epidemiology
• Investigation of causation of disease.
[Figure: genetic factors and environmental factors act together in the shift from good health to ill health]

• Study of the natural history and prognosis of
diseases.
[Figure: Good health → Subclinical changes → Clinical disease → Death or Recovery]

• Description of the health status of the
populations. It includes proportion with ill
Health, change over time, change with age etc.
• Evaluation of the interventions.
• Planning health services, Public policy and
programs.

And, Recently
• epidemiologists have become involved in
evaluating the effectiveness and efficacy of
health services, by determining the appropriate
length of stay in hospital for specific conditions,
the value of treating high blood pressure, the
efficiency of sanitation measures to control
diarrhoeal diseases, the impact on public health
of reducing lead additives in petrol etc.

Field of epidemiology
Epidemiology covers various fields and
types of activity. It is applied in many areas, such as agriculture,
economics and statistics. Its branches include:
• Clinical epidemiology
• Geographical epidemiology
• Social epidemiology
• Statistical epidemiology
• Descriptive epidemiology
• Analytical epidemiology
• Experimental epidemiology
• Infectious diseases epidemiology etc.

Concept of disease causation
• Germ theory of diseases
• Epidemiological triads
• Multifactorial causation
• Web of causation

Germ Theory of Disease
• Infection leads to disease

Epidemiological triads
• Agent -Biological, chemical, physical, nutritional, Social
• Host factor- Age, sex, heredity, nutrition, Occupation, Custom,
habits, Immunity power, Biological-Blood sugar, Cholesterol, Housing,
Marital status, socio-economic status
• Environmental Factor- Physical, Biological, Psychosocial

Example – Typhoid Fever
Disease

SCOPE OF MEASUREMENTS IN EPIDEMIOLOGY

Measurements in Epidemiology
1. Measurement of mortality.
2. Measurement of morbidity.
3. Measurement of disability.
4. Measurement of natality.
5. Measurement of presence or absence of attributes.
6. Measurement of health care need.
7. Measurement of environmental & other risk factors.
8. Measurement of demographic variables.

Numerator and Denominator
• Numerator – Number of events in a population
during specified time.
• Denominator -
1.Total population
- Mid-year population
- Population at risk
2. Total events

Tools of Measurements
Basic tools are -
• 1. Rate
• 2. Ratio
• 3. Proportion
• Used for expression of disease magnitude.

Rate
• A “Rate” measures the occurrence of some specific
event in a population during given time period.
• Example –
Death Rate = total no of death in 1 yr / Mid-year population
x 1000.
ELEMENTS –
Numerator, Denominator, time & multiplier.
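The four elements of a rate listed above can be sketched in a few lines of Python. This is only an illustration: the function name `crude_death_rate` and the figures used are hypothetical and do not come from the slides.

```python
def crude_death_rate(deaths_in_year, mid_year_population, multiplier=1000):
    """Deaths in one year per `multiplier` mid-year population."""
    if mid_year_population <= 0:
        raise ValueError("mid-year population must be positive")
    # numerator / denominator x multiplier, for a one-year time period
    return deaths_in_year / mid_year_population * multiplier


# Hypothetical community: 3,000 deaths in one year, mid-year population 400,000.
rate = crude_death_rate(deaths_in_year=3000, mid_year_population=400_000)
print(f"Crude death rate: {rate:.1f} per 1000 population per year")
```

The multiplier is kept as a parameter because some rates are conventionally expressed per 1000 and others per 100,000.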

Ratio
• Ratio measures the relationship of size of two random
quantities.
• Numerator is not component of denominator.
• Ratio = x / y
• Example-
- Sex – Ratio
- Doctor Population Ratio.

Proportion
• Proportion is ratio which indicates the relation
in a magnitude of a part of whole.
• The Numerator is always part of Denominator.
• Usually expressed in percentage.
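The difference between a ratio and a proportion can be shown with a small sketch. The helper names and the population counts below are assumptions made for illustration only.

```python
def ratio(x, y):
    """Ratio x / y: the numerator is NOT part of the denominator."""
    if y == 0:
        raise ValueError("denominator must be non-zero")
    return x / y


def proportion_percent(part, whole):
    """Proportion as a percentage: the numerator IS part of the denominator."""
    if whole <= 0 or not 0 <= part <= whole:
        raise ValueError("part must lie between 0 and whole")
    return part / whole * 100


# Hypothetical population: 520 males and 480 females.
males, females = 520, 480
sex_ratio = ratio(females, males) * 1000                       # females per 1000 males
female_share = proportion_percent(females, males + females)    # females as % of total
print(f"Sex ratio: {sex_ratio:.0f} females per 1000 males")
print(f"Proportion female: {female_share:.1f}%")
```

In the ratio the females are compared with a separate group (males); in the proportion they are compared with the whole population that contains them.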

Mortality rates
These rates measure the magnitude of deaths in a
community
Some are crude, like the crude death rate
Others are specific, like the cause-specific mortality rate
Some others are adjusted, like the standardized
mortality ratio

Common Mortality rates
• Crude death rate
• Age-specific mortality rate
• Sex-specific mortality rate
• Cause-specific mortality rate
• Proportionate mortality ratio
• Case fatality rate
• Fetal death rate
• Perinatal mortality rate
• Neonatal mortality rate
• Infant mortality rate
• Child mortality rate
• Under-five mortality rate
• Maternal mortality ratio

Incidence and Prevalence
• These are fundamentally different ways of
measuring disease frequency.
• The incidence of disease represents the rate of
occurrence of new cases arising in a given period in a
specified population, while
• prevalence is the number of existing cases (old + new)
in a defined population at a given point in time.

Incidence
• “Number of new cases occurring in defined
population during specified period of time”
• Incidence = Number of new cases during
given period / Population at risk x 1000

Prevalence
• Prevalence is total no of existing cases ( old + new)
in a defined population at a particular point in time or
specified period.
• Prevalence = Total no of cases at given point of time
/ Estimated population at time x 100
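The incidence and prevalence formulas can be written as two short functions. This is a minimal sketch; the function names and the community figures are hypothetical.

```python
def incidence_per_1000(new_cases, population_at_risk):
    """New cases during the period per 1000 population at risk."""
    if population_at_risk <= 0:
        raise ValueError("population at risk must be positive")
    return new_cases / population_at_risk * 1000


def point_prevalence_percent(existing_cases, population):
    """All existing cases (old + new) at a point in time, as a percentage."""
    if population <= 0:
        raise ValueError("population must be positive")
    return existing_cases / population * 100


# Hypothetical community of 10,000 people: 50 new cases during the year,
# 200 existing cases (old + new) on the survey date.
print(incidence_per_1000(50, 10_000))          # about 5 per 1000 per year
print(point_prevalence_percent(200, 10_000))   # about 2 per cent
```

Only new cases enter the incidence numerator, while the prevalence numerator counts every case present at the chosen time.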

Relation between Incidence & Prevalence
Prevalence = Incidence x Mean duration of disease.
P = I x D
Example – if,
I= 10 cases per 1000 per year.
D = 5 years.
P = 10 x 5 = 50 cases per 1000 population.
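The worked example above (I = 10 per 1000 per year, D = 5 years) can be reproduced in code. The sketch assumes a stable situation in which incidence and duration do not change over time; the function names are illustrative only.

```python
def prevalence_from_incidence(incidence, mean_duration_years):
    """P = I x D, with P in the same 'per 1000' units as the incidence."""
    return incidence * mean_duration_years


def mean_duration_from_prevalence(prevalence, incidence):
    """Rearranged form D = P / I."""
    if incidence == 0:
        raise ValueError("incidence must be non-zero")
    return prevalence / incidence


p = prevalence_from_incidence(incidence=10, mean_duration_years=5)
print(p)                                      # 50 cases per 1000 population
print(mean_duration_from_prevalence(p, 10))   # 5.0 years
```

The same relation shows why a long-lasting disease can have a high prevalence even when its incidence is low.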

• 1. Point Prevalence
Prevalence for given point of time.
• 2. Period Prevalence
Prevalence for specified period.

Relation between Incidence & Prevalence

Occurrence of cases of disease

Practical challenges in measuring incidence
rate
1. Identification of population at risk
Population at risk constitutes all those free of
the disease and susceptible to it
2. The population is not static; it fluctuates as a result
of births, deaths and migration
3. People are at risk only until they get the disease,
after which they are no longer at risk

Practical solution to the challenges
1. Use the total population as a denominator
This gives an estimate of the incidence rate and not
the actual incidence rate
2. Use person-time at risk
Incidence density=number of new cases of a
disease over a specified period/person-time at risk
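Incidence density can be sketched by adding up the time each person spends at risk. The follow-up times and case count below are hypothetical, and the function name is an assumption for illustration.

```python
def incidence_density(new_cases, person_time_at_risk, multiplier=1000):
    """New cases per `multiplier` units of person-time at risk."""
    if person_time_at_risk <= 0:
        raise ValueError("person-time at risk must be positive")
    return new_cases / person_time_at_risk * multiplier


# Hypothetical follow-up: years at risk contributed by five people.
# A person stops contributing time once they develop the disease or leave.
years_at_risk = [2.0, 5.0, 3.5, 1.5, 5.0]
new_cases = 2

person_years = sum(years_at_risk)   # 17.0 person-years
rate = incidence_density(new_cases, person_years)
print(f"{rate:.1f} cases per 1000 person-years")
```

Because each person contributes only the time they were actually at risk, this denominator handles a population that changes through births, deaths, migration and disease onset.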

Factors influencing the prevalence

Study design
Study design is the arrangement of conditions for
the collection and analysis of data to provide the
most accurate answer to a question in the most
economical way.

Types of Epidemiologic study designs
I. Based on objective/focus/research question
1. Descriptive studies
– Describe: who, when, where & how many
2. Analytic studies
– Analyse: How and why

Types…
II. Based on the role of the investigator
1. Observational studies
– The investigator observes nature
– No intervention
2. Intervention/Experimental studies
– Investigator intervenes
– He has a control over the situation

Types…
III. Based on timing
1. One-time (one-spot) studies
– Conducted at a point in time
– An individual is observed at once
2. Longitudinal (Follow-up) studies
– Conducted in a period of time
– Individuals are followed over a period of time

Types…
IV. Based on direction of follow-up/data collection
1. Prospective
– Conducted forward in time
2. Retrospective
– Conducted backward in time

Types…
V. Based on type of data they generate
1. Qualitative studies
– Generate contextual data
– Also called exploratory studies
2. Quantitative studies
– Generate numerical data
– Also called explanatory studies

Types…
VI. Based on study setting
1. Community-based studies
– Conducted in communities
2. Institution-based studies
– Conducted in institutions
3. Laboratory-based studies
– Conducted in major laboratories

Study design in Epidemiology
• Observational Study
– Descriptive studies
– Analytical Studies
• Ecological study: correlational study; the unit of study is a population.
• Cross-sectional study: prevalence study; the unit of study is an individual.
• Case-control study: case-reference study; the unit of study is an individual.
• Cohort study: follow-up study; the unit of study is an individual.
• Experimental Studies
– Randomized Control Trials
– Field Trials
– Community Trials

Types…
VII. Standard classification
1. Cross-sectional studies
2. Case-control studies
3. Cohort studies
4. Experimental studies

Epidemiology Study Types
[Figure: epidemiology study types are either experimental or observational; observational studies are either descriptive or analytic]

Measurement of Disease
• In terms of INCIDENCE or PREVALENCE
• Methods: Study Designs
• Prevalence: CROSS-SECTIONAL Studies
• Incidence: LONGITUDINAL Studies

CROSS- SECTIONAL STUDIES
• Also known as prevalence studies
• Participants are examined only once
• Association among variables is suggested
but….
• No CAUSAL ASSOCIATION can be established
• Only helps to generate a hypothesis or
assumption, which needs to be tested or
confirmed by analytical studies, i.e. a
longitudinal design.
• Less time consuming, easy to conduct and no
loss to follow-up

Cross-sectional…
Limitations of cross-sectional studies
• Antecedent-consequence uncertainty
“Chicken or egg dilemma”
• Data dredging leading to inappropriate comparison
• More vulnerable to bias

Cross-sectional…
Types of cross-sectional studies
1. Single cross-sectional studies
– Determine single proportion/mean in a single
population at a single point in time
2. Comparative cross-sectional studies
– Determine two proportions/means in two populations at
a single point in time
3. Time-series cross-sectional studies
– Determine a single proportion/mean in a single
population at multiple points in time

LONGITUDINAL STUDIES
• Participants undergo repeated examinations,
i.e. information from each participant is
collected multiple times
• Can be PROSPECTIVE OR RETROSPECTIVE
• Help to study the natural history of the
disease and its future outcome
• Establishing the risk factors of the disease
• Helps to find out the incidence
• More time consuming and loss to follow up is
unavoidable.

LONGITUDINAL vs CROSS-SECTIONAL
[Figure: exposure and outcome placed on a timeline for the prospective study, the retrospective study and the cross-sectional study]


Analytical epidemiology
• Second major type of epidemiology.
• Focuses on the individual within the population, unlike descriptive
epidemiology.
• The objective is not to formulate hypotheses but to test
them.
TYPES
A. CASE CONTROL STUDY
B. COHORT STUDY

CASE CONTROL STUDY
• Retrospective study
• Distinct features:
1. Both exposure and outcome have occurred
before the start of the study
2. The study proceeds backward from effect to
cause
3. Uses a control or comparison group to
support or refute an inference.

Design of a Case Control Study
[Figure: from the population, cases (with disease) and controls (without disease) are each classified as exposed or not exposed; the direction of inquiry runs backward in time]

The basic study design
• Cases (those with the condition), e.g. cases with oral cancer
• Controls (those without the condition), e.g. those free of oral cancer
• Exposed (with the characteristic or risk factor), e.g. tobacco chewers
• Unexposed (without the characteristic or risk factor), e.g. non-chewers

BASIC STEPS
1. Selection of cases and controls
2. Matching
3. Measurement of exposure, and
4. Analysis and interpretation.

Selection of cases and controls

Selection of controls
i. COMPARABLE : the controls should be similar to the cases in all
respects other than having the disease .
ii. REPRESENTATIVE : the controls should be representative of all
non-diseased people in the population from which the cases
are selected.
iii. Sources of controls
• General population
• Relatives/Friends/Neighbours
• Hospital controls
iv. Number
i. Large study: Cases: Control : 1:1
ii. Small study: Cases: Control : 1:2, 1:3, 1:4.

Sources of controls
1. Hospital based
Advantages:
• Easily identified.
• Available for interview.
• More willing to cooperate.
• Tend to give complete and accurate information (↓ recall bias).
Disadvantages:
• Not typical of the general population.
• Possess more risk factors for disease.
• Some diseases may share risk factors with the disease under study (whom to exclude?).
• Berksonian bias.
2. Population based (registry cases)
Advantages:
• Most representative of the general population.
• Generally healthy.
Disadvantages:
• Time, money, energy.
• Opportunity of exposure may not be the same as that of cases (location, occupation).
3. Neighbourhood controls / telephone exchange random dialing
Advantages:
• Controls and cases similar in residence.
• Easier than sampling the population.
Disadvantages:
• Non-cooperation.
• Security issues.
• Not representative of the general population.
4. Best friend control / sibling control
Advantages:
• Accessible, cooperative.
• Similar to cases in most aspects.
Disadvantages:
• Overmatching.

2. MATCHING
• Defined as the "process by which we select controls in such a way that they are
similar to cases with regard to certain pertinent selected variables (e.g.
age) which are known to influence the outcome of disease and which, if
not adequately matched for comparability, could distort or confound
the result".
• CONFOUNDING FACTOR
[Figure: a confounding factor (e.g. smoking, age) is linked to both the exposure (e.g. consumption of alcohol) and the disease (e.g. oesophageal cancer)]

Matching
• Matching is defined as the process of selecting
controls so that they are similar to cases in certain
characteristics such as age, sex, race,
socioeconomic status and occupation.
(Epidemiology; Leon Gordis, 2004)
• Matching variables (e.g. age), and matching criteria
(e.g. within the same 5 year age group) must be set
up in advance.

Types of matching
Controls can be individually matched (most common) or
frequency matched.
1. Individual matching (matched pairs): search for
one (or more) controls who meet the required
matching criteria. Paired (triplet) matching is when
one (two) control(s) are individually matched to
each case.
2. Group matching (frequency matching): select a
population of controls such that its overall
characteristics match those of the cases, e.g. if 15% of cases are
under age 20, 15% of the controls must also be
under age 20. Another example: if 30% of cases are
males of Hindu religion aged 60-65 years, then we take
30% similar controls

3. MEASUREMENT OF EXPOSURE AND OTHER FACTORS
• Definitions and criteria for exposure are just as important as
those used to define cases and controls. Information on exposure may be obtained
by:
– Interviews
– Questionnaires
– Studying past records of cases, such as hospital records,
employment records etc.
– Clinical or laboratory examination.
• The investigator should not know whether a subject is in the case or
control group.

4. ANALYSIS AND INTERPRETATION
The final step is analysis, to find out:
a) Exposure rates among cases and controls to the suspected factor
b) Estimation of disease risk associated with exposure (ODDS RATIO)

ANALYSIS AND INTERPRETATION OF CASE CONTROL
STUDY
• On analysis of a case control study we find out:
– Exposure rates: the frequency of exposure to the suspected risk factor in
cases and in controls
– Odds ratio: estimation of disease risk associated with exposure.
– The only valid measure of association for the case control study is
the Odds Ratio (OR)
– OR = (Odds of exposure among cases) / (Odds of exposure among controls)
• Odds of exposure among cases = a / c
• Odds of exposure among controls = b / d
– Odds ratio = (a/c) / (b/d) = ad / bc
– Odds ratio (OR) = 1.0 implies equal odds of exposure (no effect)

Figure 1: The relationship between an exposure and
occurrence of disease

               Disease present (+)              Disease absent (-)
Exposure (+)   (a) Expected diseased            (b) Unexpectedly non-diseased (ODD)
Exposure (-)   (c) Unexpectedly diseased (ODD)  (d) Expected non-diseased

• Exposure rates:
• A case control study provides a direct estimate of the
exposure rates (frequency of exposure) to the suspected
factor in the disease and non-disease groups.

               Cases            Controls
               (lung cancer)    (without lung cancer)
Smokers        33 (a)           55 (b)
Non-smokers    2 (c)            27 (d)
TOTAL          35 (a + c)       82 (b + d)

Exposure rates
• Cases = a / (a + c) = 33 / 35 = 94.3%
• Controls = b / (b + d) = 55 / 82 = 67.1%
• Odds ratio = (a/c) / (b/d) = ad / bc = (33 x 27) / (55 x 2) = 8.1
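The smoking and lung cancer figures above can be checked with a short sketch. The cell labels a, b, c, d follow the two by two table; the function names are hypothetical and chosen only for this illustration.

```python
def exposure_rates(a, b, c, d):
    """Return (exposure rate in cases, exposure rate in controls)."""
    return a / (a + c), b / (b + d)


def odds_ratio(a, b, c, d):
    """Odds ratio (a/c) / (b/d) = ad / bc for a 2 x 2 case control table."""
    if b == 0 or c == 0:
        raise ValueError("odds ratio is undefined when b or c is zero")
    return (a * d) / (b * c)


# Lung cancer example: a = exposed cases, b = exposed controls,
# c = unexposed cases, d = unexposed controls.
a, b, c, d = 33, 55, 2, 27

cases_rate, controls_rate = exposure_rates(a, b, c, d)
print(f"Exposure rate in cases:    {cases_rate:.1%}")
print(f"Exposure rate in controls: {controls_rate:.1%}")
print(f"Odds ratio: {odds_ratio(a, b, c, d):.1f}")
```

An odds ratio above 1.0 suggests the exposure is more common among cases, a value below 1.0 suggests it is less common, and 1.0 means no difference.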

How to interpret the Odds ratio?
• The odds of lung cancer among people who smoke are 8.1
times the odds among those who do not smoke. When the
disease is rare, this approximates an 8.1 times higher risk
of developing lung cancer in smokers.

Another example…..
• Relationship between physical activity and
obesity

               Obese      Non-obese
Active         2 (a)      55 (b)
Non-active     33 (c)     27 (d)
TOTAL          35         82

Exposure rates
• Cases = a / (a + c) = 2 / 35 = 5.7%
• Controls = b / (b + d) = 55 / 82 = 67.1%
• Odds ratio = (a/c) / (b/d) = ad / bc = (2 x 27) / (55 x 33) = 0.03

How to interpret this Odds ratio?
• The odds of being obese among people who are physically active
are 0.03 times the odds among those who do not do any
physical workout.
• Hence physical activity appears to protect against obesity.

Exercise
• An investigator selected 40 cases of gastric carcinoma
and an equal number of controls matched for age, sex
and socioeconomic status. It was found that among
cases 30 had evidence of H. pylori infection and
among controls 15 had evidence of H. pylori
infection. Is there evidence of an association between
H. pylori infection and gastric carcinoma?
1. Draw the two by two table
2. Find the exposure rate in cases
3. Find the exposure rate in controls
4. Calculate the "Odds Ratio"
5. Interpret the results.

Application of case control studies
1. Vaccine effectiveness
2. Evaluation of treatment and program efficacy
3. Evaluation of screening programs
4. Outbreak investigations
5. Demography
6. Genetic epidemiology
7. Occupational epidemiology

Bias in Case control Study
• Bias is any systematic error in the design, conduct, or
analysis of a study that results in mistaken estimates of
the effect of the exposure on disease.
• Types of Bias
• Bias due to confounding: can be reduced by matching
• Memory or recall bias: cases remember events better
than controls
• Selection bias: when participants are not uniformly
distributed in the population
• Berksonian bias: different rates of admission in different
hospitals
• Interviewer's bias: cases are investigated or questioned
more extensively than controls

CASE CONTROL STUDY
ADVANTAGES:
1. Relatively easy to carry out.
2. Rapid and inexpensive.
3. Requires fewer subjects.
4. Suitable for investigation of rare
diseases.
5. No risk to subjects.
6. Allows the study of several
different etiological factors.
7. Risk factors can be identified.
8. No attrition problem, because it does
not require follow-up.
9. Minimal ethical problems.
DISADVANTAGES:
1. Problem of bias, since it relies
on past memory or past
records.
2. Difficulty in selecting an
appropriate control group.
3. Cannot measure incidence.
4. Does not distinguish between
causes and associated factors.
5. Not suited for the evaluation
of therapy or prophylaxis of
disease.

Famous Examples
• Adenocarcinoma of Vagina:
• Time clustering of 7 cases among young females
15-22 years of age
• Reported between 1966-69
• A rare disease that otherwise usually affects women over 50
years of age
• The cases had been exposed to diethylstilbestrol (given for prevention of
miscarriage) during foetal life
• Four controls were taken for each case, born at
the same time in the same hospital

Oral Contraceptives and
thromboembolic disease
• Conducted by Vessey and Doll
• 84 women with the disease (cases) and twice as many
controls, i.e. women without the disease, were
investigated
• 50% of the cases were taking OCPs, as
compared to 14% of the controls
• Women on OCPs had 6 times the risk of
venous thrombosis.

Thalidomide tragedy
• Thalidomide is a non-barbiturate hypnotic
• Data on 46 mothers who delivered deformed
babies and 300 mothers who delivered normal
babies were collected in 1961.
• 41 out of 46 mothers had a history of
thalidomide intake during pregnancy
• None of the mothers in the control group of 300
had taken thalidomide
• Later, laboratory experiments also showed
thalidomide to be teratogenic.

Definition
A cohort study is a type of analytical study which is usually undertaken to
obtain additional evidence to refute or support the existence of an
association between a suspected cause and disease.
• Synonyms
– Longitudinal study
– Panel study
– Prospective study
– Forward looking study
– Incidence study

• What Is a Cohort
– A unit of an ancient Roman legion; a band of warriors.
– A group of people who share a common
characteristic or experience within a
defined time period, e.g. age, occupation,
pregnancy etc.

INDICATION OF A COHORT STUDY
• When there is good evidence of an association between exposure and
disease.
• When exposure is rare but incidence of disease is
higher among exposed
• When follow-up is easy, cohort is stable
• When ample funds are available
• When attrition is minimal.

Design of Cohort Study
(a+b) is called the study cohort and (c+d) is called the control cohort

Consideration during selection of
Cohort
• The cohort must be free from disease under study.
• Insofar as the knowledge permits, both the groups
should be equally susceptible to disease under study.
• Both the groups must be comparable in respect of all
variables which influence the occurrence of disease
• Diagnostic and eligibility criteria of the disease must
be defined beforehand.

Types of cohort study
• Prospective cohort study
• Retrospective cohort study
• Ambi-directional cohort study

Prospective cohort study
• The common strategy of cohort studies is to start
with a reference population (or a representative
sample thereof), some of whom have certain
characteristics or attributes relevant to the study
(exposed group), with others who do not have those
characteristics (unexposed group).
• Both groups should, at the outset of the study, be
free from the condition under consideration. Both
groups are then observed over a specified period to
find out the risk each group has of developing the
condition(s) of interest.

[Figure: prospective cohort of 1000 children (<12 yrs)]
Start:
• Exposed: 500 children with a family smoker
• Not exposed: 500 children with a non-smoking family
Outcome:
• Exposed: 300 diseased, 200 not diseased
• Not exposed: 120 diseased, 380 not diseased
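The figures for the 1000 children can be used to compare the risk in the two groups. The sketch below computes the incidence in each group and then their ratio and difference; the ratio and difference are added here for illustration and are not computed on the slide, and all names are hypothetical.

```python
def incidence(diseased, group_size):
    """Proportion of a group that developed the disease during follow-up."""
    if group_size <= 0:
        raise ValueError("group size must be positive")
    return diseased / group_size


# Children with a family smoker (exposed) and without (not exposed).
incidence_exposed = incidence(300, 500)      # 300 of 500 exposed children diseased
incidence_unexposed = incidence(120, 500)    # 120 of 500 unexposed children diseased

# Assumed summary measures, not shown on the slide:
risk_ratio = incidence_exposed / incidence_unexposed
risk_difference = incidence_exposed - incidence_unexposed

print(f"Incidence in exposed:   {incidence_exposed:.0%}")
print(f"Incidence in unexposed: {incidence_unexposed:.0%}")
print(f"Ratio of incidences:    {risk_ratio:.1f}")
print(f"Difference:             {risk_difference:.0%}")
```

Unlike a case control study, a cohort study yields incidence directly, because both groups start free of disease and are followed forward.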

Problems of prospective studies
• The study might take a long time.
• Sufficient funding is needed for a long period.
• Loss of study subjects during follow-up.

Retrospective Cohort Study
• A retrospective cohort study is one in which the
outcomes have all occurred before the start of the
investigation.
• The investigator goes back into the past to select the study
groups from existing records, such as past
employment, medical and other records, and
traces them forward through time from a past
date fixed on the records, usually to the present.
• Also known as a historical cohort or
non-concurrent cohort study

Example of Retrospective Study
• Suppose that we began our study on the association between
smoking habit and lung cancer in 2008.
• Now we find that an old roster of elementary schoolchildren
from 1988 is available in our community, and that they had
been surveyed regarding their smoking habits in 1998.
• Using these data resources in 2008, we can begin to
determine who in this population has developed lung
cancer and who has not.

Ambi-directional cohort Study
• Elements of prospective and retrospective
cohort are combined.
• The cohort is identified from past records and
assessed to date for the outcome. The same
cohort is then followed up prospectively into the
future for further assessment of the outcome

Example of Ambi-directional cohort
study
• Court Brown and Doll's study on the effects of
radiation began in 1955 with 13,352 patients
who had received large doses of radiation therapy for
ankylosing spondylitis between 1934 and 1954.
• The outcome evaluated was death from leukaemia or
aplastic anaemia between 1934 and 1954.
• A prospective component was added in 1955
and surviving subjects were followed up to
identify deaths in subsequent years

Comparison of retrospective and prospective
cohort study

Prognostic cohort studies
Prognostic cohort studies are a special type of cohort study used
to identify factors that might influence the prognosis after a
diagnosis or treatment.
These follow-up studies have the following features:
The cohort consists of cases diagnosed at a fixed time, or cases
treated at a fixed time by a medical or surgical treatment,
rehabilitation procedure, psychological adjustment.
By definition, such cases are not free of a specified disease, as in
the case of a conventional cohort
The outcome of interest is usually survival, cure, improvement,
disability, or repeat episode of the illness, etc.

1. Selection of study subjects
The usual procedure is to locate or identify the cohort,
which may be a total population in an area or sample
thereof. Cohort can be:
• community cohort of specific age and sex;
• exposure cohort e.g. radiologists, smokers, users of
oral contraceptives;
• birth cohort e.g. school entrants;
• occupational cohort e.g. miners, military personnel;
• marriage cohort;
• diagnosed or treated cohort, e.g. cases treated with
radiotherapy, surgery, hormonal treatment.

Open or dynamic cohort
• An open population or dynamic population is a
population in which the person-time experience can
accrue from a changing roster of individuals.
• For example, the incidence rates of
cancer reported by the Connecticut Cancer Registry
come from the experience of an open population.
Because the population of residents of Connecticut
is always changing, the individuals who contribute
to these rates are not a specific set of people who
are followed through time.

Fixed and Closed Cohort
• Fixed Cohort :When the exposure groups in a
cohort study are defined at the start of follow-up,
with no movement of individuals between
exposure groups during the follow-up, the groups
are called fixed cohorts.
• If no losses occur from a fixed cohort, the cohort
satisfies the definition of a closed population and
is often called a closed cohort

2. Obtaining data on Exposure
• From cohort members: personal interview,
mailed questionnaire
• Review of records: certain kinds of information,
like dose of radiation or kinds of surgery received,
can only be obtained from medical records.
• Medical examination / special tests: in some
cases information needs to be obtained from a
medical examination, as in the case of blood
pressure or serum cholesterol.
• Environmental survey of the location where the
cohort lives

Information should be collected in a manner
that allows classification of cohort members
according to:
• whether or not they have been exposed to the
suspected factor
• the level or degree of exposure
• demographic variables which might influence the
frequency of the disease under investigation

3. Comparison Group
Internal Comparison Group:
A single cohort enters the study and its members, on the
basis of the information obtained, can be classified into
several comparison groups according to degree of exposure.

Age-standardized death rate per 100,000 men per year
according to amount of cigarette smoking

Classification of exposure   No. of deaths   Death rate
½ pack                       24              95.2
½ to 1 pack                  84              107.82
1-2 pack                     90              229.2
+ 2 pack                     97              264.2
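
An internal comparison like the one tabulated above is often summarised as rate ratios against the least-exposed category. The sketch below is only an illustration: the rates are copied from the table, and the choice of the first category as the reference group is an assumption, not something the slide states.

```python
# Death rates per 100,000 men per year, copied from the table above.
death_rates = {
    "½ pack": 95.2,
    "½ to 1 pack": 107.82,
    "1-2 pack": 229.2,
    "+ 2 pack": 264.2,
}

# Assumption: the lowest exposure category serves as the reference group.
reference_rate = death_rates["½ pack"]

# Rate ratio of each exposure category relative to the reference.
rate_ratios = {
    category: rate / reference_rate for category, rate in death_rates.items()
}

for category, ratio in rate_ratios.items():
    print(f"{category:12s} rate ratio = {ratio:.2f}")
```

A ratio that rises steadily across categories is what a dose-response relationship would look like in this kind of table.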

External Comparison Group: used when information on
degree of exposure is not available.
For example, if all workers at a factory had some degree of
exposure, we would need to select a comparison
group from another population, possibly another
type of factory.
The general population can also be used as a
comparison group.

4. Follow-up
• The length of follow-up that is needed for
some studies to reach a satisfactory end-point,
when a large enough proportion of the
participants have reached an outcome, may
be many years or even decades.
• At the start of the study, the method of obtaining
data for assessing the outcome should be decided,
depending on the outcome under study.

Procedure may be:
• Periodic medical examination of each member
of cohort
• Reviewing physician and hospital records
• Routine surveillance of death records
• Mailed questionnaire, telephone calls and
periodic home visits

5. Analysis
Data are analyzed in terms of
• Incidence rate of outcome among exposed
and non exposed
• Estimation of risk

Incidence rate
The choice between cumulative incidence and incidence density
is a crucial issue.
• Cumulative incidence: In cohort studies on acute diseases
with short induction periods and a short time of follow-up,
like outbreaks, the risk of disease can be estimated directly
using the cumulative incidence, given a fixed cohort with a
fixed period of follow-up and a low fraction of drop-outs.
• Incidence density: In cohort studies on chronic diseases
with their long follow-up periods, however, the use of the
cumulative incidence is not appropriate because disease-free
follow-up periods usually differ strongly among cohort
members. In such cases incidence density is the appropriate
measure.
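
The difference between the two measures can be sketched in a few lines. This is a minimal illustration only: the five cohort members and their follow-up times are hypothetical, and the names `Member`, `cumulative_incidence` and `incidence_density` are made up for this example.

```python
from dataclasses import dataclass


@dataclass
class Member:
    years_followed: float    # disease-free follow-up time contributed
    developed_disease: bool  # True if the outcome occurred during follow-up


# Hypothetical cohort with unequal follow-up times.
cohort = [
    Member(2.0, True),
    Member(10.0, False),
    Member(5.5, True),
    Member(10.0, False),
    Member(7.5, False),
]


def cumulative_incidence(members):
    """New cases divided by the number of persons at risk at the start."""
    cases = sum(m.developed_disease for m in members)
    return cases / len(members)


def incidence_density(members):
    """New cases divided by the total person-time at risk."""
    cases = sum(m.developed_disease for m in members)
    person_years = sum(m.years_followed for m in members)
    return cases / person_years


print(f"Cumulative incidence: {cumulative_incidence(cohort):.2f}")
print(f"Incidence density: {incidence_density(cohort):.4f} per person-year")
```

The denominators differ: persons at the start for cumulative incidence, person-years actually observed for incidence density, which is why the latter copes with unequal follow-up.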

ANALYSIS OF COHORT STUDIES

             Outcome*
             Death    No death   Total      Incidence rate
Exposed      A        B          A + B      A/(A+B)
Unexposed    C        D          C + D      C/(C+D)
Total        A + C    B + D      A+B+C+D

* Outcome: death/disease

A = Exposed persons who later develop disease or die
B = Exposed persons who do not develop disease or die
C = Unexposed persons who later develop disease or die
D = Unexposed persons who do not develop disease or die
The total number of exposed persons = A + B
The total number of unexposed persons = C + D
Incidence of disease (or death) among exposed = A/(A+B)
Incidence of disease (or death) among non-exposed = C/(C+D)

Relative Risk (RR)
• Estimates the magnitude of an association between exposure
and disease
• Indicates the likelihood of developing the disease in the
exposed group relative to those who are not exposed
• Ratio of risk of disease in exposed to the risk of disease in
nonexposed
Relative Risk (RR) =
Risk in exposed (incidence in exposed group) /
Risk in non-exposed (incidence in non-exposed group)

Start: 1000 children (<12 yrs)
• Exposed (family smoker): 500 children
– Outcome: diseased 300, not diseased 200
• Not exposed (family non-smoker): 500 children
– Outcome: diseased 120, not diseased 380

Rate: Incidence rate
• Incidence of respiratory infection among exposed
children: 300/500 = 60%
• Incidence of respiratory infection among non-exposed
children: 120/500 = 24%

Cohort Study (cont.)
Relative Risk (Risk Ratio) =
Incidence rate among exposed / Incidence rate in non-exposed
= 60/24 = 2.5
Exposed individuals are 2.5 times as likely to
develop disease as non-exposed individuals.
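
The same arithmetic can be written as a small function over the 2x2 cells. This is a sketch using the A, B, C, D notation of the analysis table and the counts from the children's example (300 and 200 among the exposed, 120 and 380 among the non-exposed); the function name `relative_risk` is just an illustrative choice.

```python
def relative_risk(a, b, c, d):
    """Relative risk from a 2x2 cohort table.

    a: exposed, diseased       b: exposed, not diseased
    c: unexposed, diseased     d: unexposed, not diseased
    """
    incidence_exposed = a / (a + b)
    incidence_unexposed = c / (c + d)
    return incidence_exposed / incidence_unexposed


# Counts from the family-smoking example: 500 exposed and 500 unexposed children.
rr = relative_risk(a=300, b=200, c=120, d=380)
print(f"Incidence in exposed   = {300 / 500:.0%}")
print(f"Incidence in unexposed = {120 / 500:.0%}")
print(f"Relative risk          = {rr:.1f}")
```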

Difference Measures
• Attributable risk
– No. of cases among the exposed that could be eliminated
if the exposure were removed
= Incidence in exposed - Incidence in unexposed
• Population Attributable Risk percentage:
PAR expressed as a percentage of total risk
in population
PAR% = (I_population - I_unexposed) / I_population x 100

Attributable Risk
[Figure: incidence in the exposed and unexposed groups;
attributable risk = I_exposed – I_unexposed, where I = incidence]

AR: Smoking and Lung Cancer

             Lung cancer
Smoking      Yes    No      Total    Incidence
Yes          100    1900    2000     0.05
No           80     7920    8000     0.01
Total        180    9820    10000

Attributable risk (risk difference, RD)
= Incidence in exposed - Incidence in unexposed
= 0.05 - 0.01
= 0.04
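
As a check on the arithmetic, the attributable risk can be computed directly from the counts in the table (100 of 2000 smokers and 80 of 8000 non-smokers with lung cancer). The function name below is an illustrative assumption.

```python
def attributable_risk(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Incidence in exposed minus incidence in unexposed (risk difference)."""
    incidence_exposed = exposed_cases / exposed_total
    incidence_unexposed = unexposed_cases / unexposed_total
    return incidence_exposed - incidence_unexposed


# Counts from the smoking and lung cancer table.
ar = attributable_risk(100, 2000, 80, 8000)
print(f"Attributable risk = {ar:.2f}")  # prints: Attributable risk = 0.04
```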

Population Attributable Risk (PAR)
• Excess risk of disease in the total population
attributable to the exposure
• Reduction in risk which would be achieved if the
population were entirely unexposed
• Helps determine which exposures are relevant
to public health in the community
PAR = I_population - I_unexposed

Population Attributable Risk
[Figure: risk in the population and in the unexposed;
PAR = I_population – I_unexposed]

PAR: Smoking and Lung Cancer

             Lung cancer
Smoking      Yes    No      Total    Risk
Yes          100    1900    2000     Incidence in exposed = 0.050
No           80     7920    8000     Incidence in unexposed = 0.010
Total        180    9820    10000    Incidence in population = 0.018

PAR = 0.018 - 0.010 = 0.008
PAR% = (0.018 - 0.010) / 0.018 x 100 = 44%
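
The PAR and PAR% figures can be reproduced from the same counts. This is a minimal sketch; the function name `population_attributable_risk` is an assumption made for illustration.

```python
def population_attributable_risk(total_cases, total_population,
                                 unexposed_cases, unexposed_total):
    """Return (PAR, PAR%) from population and unexposed incidence."""
    incidence_population = total_cases / total_population
    incidence_unexposed = unexposed_cases / unexposed_total
    par = incidence_population - incidence_unexposed
    par_percent = par / incidence_population * 100
    return par, par_percent


# 180 cases among 10000 people overall; 80 cases among 8000 non-smokers.
par, par_percent = population_attributable_risk(180, 10000, 80, 8000)
print(f"PAR  = {par:.3f}")
print(f"PAR% = {par_percent:.1f}%")
```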

Conclusion:
44% of lung cancer in the population could be
prevented if smoking were eliminated

But calculations are not that simple in real cohort studies

British Doctors Study
• In 1951, a prospective cohort study was set up among British
doctors to investigate the relationship between smoking and
mortality, particularly the association between smoking and lung
cancer
• In 1951, a questionnaire on smoking habits was sent to 49,913 male
and 10,323 female doctors; 34,440 male and 6,194 female doctors
gave sufficient information to classify their smoking status.
• The causes of death of the 10,072 male and 1,094 female doctors who
died during this period were ascertained from death
certificates.
• The rate of death from lung cancer among smokers was compared
to that among non-smokers.

Since mortality depends on age and the distribution of subjects by age group
is different between the smokers and non-smokers, the effect of age on
mortality has to be adjusted for when comparing lung cancer
mortality between these two groups. A commonly used method to adjust for
age is direct standardization.
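
Direct standardization applies each group's age-specific rates to one common standard population. The sketch below is illustrative only: the age bands, deaths, person-years and standard population are all hypothetical numbers, not data from the British Doctors Study.

```python
# Hypothetical age-specific data: age band -> (deaths, person-years).
smokers = {"35-54": (10, 20_000), "55-74": (60, 15_000)}
non_smokers = {"35-54": (4, 40_000), "55-74": (10, 10_000)}

# Hypothetical standard population used as common weights for both groups.
standard_population = {"35-54": 60_000, "55-74": 40_000}


def crude_rate(group):
    """Total deaths divided by total person-years, ignoring age."""
    deaths = sum(d for d, _ in group.values())
    person_years = sum(py for _, py in group.values())
    return deaths / person_years


def directly_standardized_rate(group, standard):
    """Weight each age-specific rate by the standard population's size."""
    expected_deaths = sum(
        deaths / person_years * standard[age]
        for age, (deaths, person_years) in group.items()
    )
    return expected_deaths / sum(standard.values())


for name, group in (("Smokers", smokers), ("Non-smokers", non_smokers)):
    crude = crude_rate(group) * 100_000
    adjusted = directly_standardized_rate(group, standard_population) * 100_000
    print(f"{name:12s} crude = {crude:.0f}, standardized = {adjusted:.0f} per 100,000")
```

In this made-up example the smokers are older, so the crude comparison exaggerates the difference; the standardized rates compare the groups as if they had the same age structure.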

It would not be rational to place individuals smoking
one cigarette per day and those smoking more than 25
cigarettes per day in the same category with equal emphasis.
So it is better to opt for stratification.

Again, it is not only the dose of exposure that determines the frequency of
disease; other factors, such as duration of exposure and age at
initiation of exposure, can also influence the occurrence of disease. We need to
adjust for those too.

The relative risk of lung cancer death increased with the level of smoking in
both males and females. The relative risk in the men smoking 1–14 and 15–
24 cigarettes per day is much higher than in the women; in the group
smoking 25 or more cigarettes per day, the relative risk in men is marginally
less than that in women. Does this mean that the effect of low levels of
smoking is higher among men than among women?

The proportion of men inhaling smoke is higher than that of women at all three levels of
smoking. Men seemed to have started to smoke at an earlier age than women.
Since these features of smoking may modify the effect of smoking on lung cancer,
their effects have to be adjusted for when comparing the association between
smoking and lung cancer in men and women.

... too complicated?
But the problem does not end here...

What if a subject is followed up from age 23 but has been exposed from age 19
on? He or she is exposed until age 27, followed by an unexposed 5-year period. He or she is
again exposed until age 39, at which time his or her person-time at risk ceases, either
because of disease diagnosis or because of the end of follow-up.
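
One way to handle such a history is to split the follow-up into exposed and unexposed person-time. The sketch below applies this to the subject described above (follow-up from age 23 to 39, exposed from 19 to 27 and again from 32 to 39). The function name is hypothetical, and it assumes the exposed periods do not overlap one another.

```python
def split_person_time(follow_up, exposed_periods):
    """Return (exposed_years, unexposed_years) accrued during follow-up.

    follow_up: (start_age, end_age) of observation.
    exposed_periods: list of (start_age, end_age), assumed non-overlapping.
    Exposure before the start of follow-up contributes no person-time.
    """
    start, end = follow_up
    exposed_years = 0.0
    for period_start, period_end in exposed_periods:
        overlap = min(end, period_end) - max(start, period_start)
        if overlap > 0:
            exposed_years += overlap
    unexposed_years = (end - start) - exposed_years
    return exposed_years, unexposed_years


exposed_years, unexposed_years = split_person_time(
    follow_up=(23, 39),
    exposed_periods=[(19, 27), (32, 39)],
)
print(f"Exposed person-years:   {exposed_years}")    # 4 + 7 = 11.0
print(f"Unexposed person-years: {unexposed_years}")  # 5.0
```

Person-time split this way is the kind of input that the rate-based models mentioned next work with.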

For analyzing such data we use Poisson models
and Cox proportional hazards models.
Specialized software packages exist to perform
these computations, such as Stata (Version 7
or later) and Epicure.

Advantages of Cohort Studies
• Temporality can be established.
• Incidence can be calculated.
• Several possible outcomes related to the exposure
can be studied simultaneously.
• Provide a direct estimate of risk.
• Since comparison groups are formed before
disease develops, certain forms of bias, such as
misclassification bias, can be minimized.
• Allow conclusions about cause-effect
relationships.

Disadvantages of Cohort Studies
• Large population is needed
• Not suitable for rare diseases.
• It is time consuming and expensive
• Certain administrative problems like loss of staff,
loss of funding and extensive record keeping are
common.
• Problem of attrition of initial cohort is common
• Study itself may alter people’s behavior

Ethics in Cohort Study
• A classic example of issues in research ethics is the
Tuskegee study on the natural history of syphilis, in
which the US Public Health Service recruited 399 poor
black sharecroppers in Macon County as a cohort.
• The study lasted from 1932 to 1972.
• The participants were denied treatment for syphilis
although effective treatment was available.
The government deceived them by saying that they were
being treated.

• On July 26, 1972, The New York Times described
the study as “the longest nontherapeutic
experiment on human beings in medical history.”
The disclosure of this study by the press was a
major scandal in the United States.
• It led to the Belmont Report: Ethical Principles and
Guidelines for the Protection of Human Subjects of
Research.

• These problems can be encountered in cohort
studies designed to study the natural history of disease.
• What if treatment becomes available in the
middle of the research: should we continue the research
with treatment denial, or abort the research?
• Whether we should communicate the research findings to
individuals is also a controversial issue.

Biases in cohort study
Differential loss to follow-up
Differential follow-up between compared groups
may be a major problem. Losses to follow-up,
whether due to study withdrawals, unmeasured
outcomes, or unknown reasons, are always a
concern.
This is particularly true when more outcome data is
missing in one group than another, as there is no
way to be certain that the factor being studied is
not somehow related to this observation.

Contamination
Subjects initially unexposed to the risk factor of
interest may become exposed at a later date.
Such “contamination” tends to reduce the
observed effect of the risk factor.

Selection Bias
Perhaps the largest threat to the internal validity of a
cohort study is selection bias, also called case-mix
bias.
It arises when participants are selected into exposed and not
exposed groups based on some characteristic that may affect
the outcome.
Information Bias
It arises when information of different quality and extent is
collected from exposed and not exposed groups.

Misclassification Bias
Differential misclassification
Non differential misclassification

• Differential misclassification – Errors in
measurement are one way only
– Example: Measurement bias – instrumentation may
be inaccurate, same cut off level of weight for male
and female to determine malnourishment

Misclassification Bias (cont.)

True Classification
             Disease +   Disease -   Total
Exposed      100         50          150
Nonexposed   50          50          100
Total        150         100         250
RR = [a/(a+b)] / [c/(c+d)] = 1.3

Differential misclassification - Overestimate exposure
for 10 cases, inflate rates
             Disease +   Disease -   Total
Exposed      110         50          160
Nonexposed   40          50          90
Total        150         100         250
RR = [a/(a+b)] / [c/(c+d)] ≈ 1.5

• Nondifferential (random) misclassification –
errors in assignment of group happen in more than
one direction
– This will dilute the study findings -
BIAS TOWARD THE NULL

Misclassification Bias (cont.)

True Classification
             Disease +   Disease -   Total
Exposed      100         50          150
Nonexposed   50          50          100
Total        150         100         250
RR = [a/(a+b)] / [c/(c+d)] = 1.3 (1.33)

Nondifferential misclassification - Overestimate
exposure in 10 cases and 10 non-cases – bias towards null
             Disease +   Disease -   Total
Exposed      110         60          170
Nonexposed   40          40          80
Total        150         100         250
RR = [a/(a+b)] / [c/(c+d)] = 1.3 (1.29)
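
The effect of the two kinds of misclassification can be checked by recomputing the relative risk for each table. This is a sketch using the cell counts from the misclassification tables on these slides; the helper name `relative_risk` and the dictionary layout are illustrative assumptions.

```python
def relative_risk(a, b, c, d):
    """RR = [a/(a+b)] / [c/(c+d)] for a 2x2 cohort table."""
    return (a / (a + b)) / (c / (c + d))


# Cells are (exposed D+, exposed D-, nonexposed D+, nonexposed D-).
tables = {
    "True classification": (100, 50, 50, 50),
    # 10 diseased subjects wrongly moved into the exposed group.
    "Differential": (110, 50, 40, 50),
    # 10 diseased and 10 non-diseased subjects wrongly moved into the exposed group.
    "Nondifferential": (110, 60, 40, 40),
}

results = {name: relative_risk(*cells) for name, cells in tables.items()}

for name, rr in results.items():
    print(f"{name:20s} RR = {rr:.2f}")
```

With two decimals the pattern is easier to see than with one: the differential error pushes the estimate away from the true value, while the nondifferential error pulls it slightly toward 1.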

Control of Bias
• Restriction
• Stratification
• Mathematical Modeling
– Poisson regression model
– Cox proportional hazards model

When Is a Cohort Study Warranted?
• When the (alleged) exposure is known
• When exposure is rare and incidence of disease
among exposed is high (even if the exposure is
rare, determined investigators will identify
exposed individuals)
• When the time between exposure and disease is
relatively short
• When adequate funding is available
• When the investigator has a long life expectancy

Classic example of Cohort study :
Study on London Cholera Outbreak
• The classical study on the London cholera
epidemic of 1854 conducted by John Snow is an
example of a cohort study on infectious diseases.
• Two different water companies (the Lambeth and
the Southwark & Vauxhall) supplied households
within various regions of London

Classic example of Cohort study :
Study on London Cholera Outbreak
• The companies differed in one important feature, the
location of the water intake. The Lambeth had moved its
water intake upstream from the sewage discharge point in
1852, whereas the Southwark & Vauxhall continued to
obtain water downstream of the sewage discharge point.
• Dr. Snow classified households according to their exposure
to the two water sources and showed a substantial
difference in cholera mortality: 37 versus 315 cholera
deaths per 10,000 households served by the Lambeth and
Southwark & Vauxhall companies, respectively.

Cohort study
Advantages
• Can often show temporality of
relationship
• Less bias due to prospective
evaluation of exposures
• Can evaluate multiple diseases
• Can establish cause-effect
• Good when exposure is rare
• We can find out incidence rate
and relative risk
Disadvantages
• Losses to follow-up
• Often requires large sample
• Ineffective for rare diseases
• Long time to complete
• Expensive
• Changes in diagnostic criteria
over time
• Need motivated cohort of
people who will be
repeatedly evaluated

Think Epidemiologically...
Thank You
