Research Methods

Research methods
“an overview”
Dr. Tarek Tawfik
Professor of Public Health
Cairo University

12/9/2013

Research?
More than a set of skills, it is a way of thinking:
• Examining critically the various aspects of day-to-day
professional work;
• Understanding and formulating guiding principles that
govern a particular procedure;
• Developing and testing new theories for the
enhancement of your practice.
It is the habit of questioning, with systematic
examination of the observed information, to find
answers which may result in more effective
professional services (Kumar R 2005).

Definition:
Research is a structured inquiry that utilizes acceptable
scientific methodology to solve problems and creates new
knowledge that is generally applicable. Grinnell 1993


Types of research
• Application: pure research, applied research
• Objectives: descriptive research, correlational research, exploratory research, explanatory research
• Inquiry mode: quantitative research, qualitative research

Research process “the 8 steps model”
[Figure: flow chart of the research process. Recoverable labels, grouped by step:]
What (FINER):
1. Formulating a research question (literature review)
How:
2. Research design (research design: functions; study designs; variables and hypotheses: definition and typology)
3. Instruments for data collection (methods and tools of data collection; field test of the tools; validity and reliability of the research tool)
4. Selecting a sample (sampling theory and designs)
5. Research protocol writing (contents of research proposal)
Conducting of the study:
6. Data collection
7. Data processing (editing; coding; code book; methods of data processing: computing and statistics)
8. Research report (principles of scientific writing)

The structure of a research project is set out in
its protocol, the written plan of the study.
The functions of the protocol are:
• Seeking grant funds.
• Helping the investigator to organize his research in a
logical, focused, and efficient way.


Elements of protocol and their purpose
• Research questions: What questions will the study address?
• Significance (background): Why are these questions important?
• Design (time frame, epidemiologic approach): How is the study structured?
• Subjects (selection criteria, sampling design): Who are the subjects and how will they be selected?
• Variables (predictor variables, confounding variables, outcome variables): What measurements will be made?
• Statistical issues (hypotheses, sample size, analytic approach): How large is the study and how will it be analyzed?

I- Conceiving the Research Question.
The research question is the uncertainty about
something in the population that the
investigator wants to resolve by making
measurements on his study subjects.

There is no shortage of questions, as one leads to
another.


Tamoxifen and Breast Cancer.
Tamoxifen reduces the risk of breast cancer
during 4 years of use by women at high risk of
breast cancer.
Many other questions evolved:

o Does tamoxifen reduce the risk of death due to breast
cancer?
o How long should treatment be continued?
o Might other drugs with the same action be beneficial
without the risk of tamoxifen-induced thromboembolism?
o Does the use of such a drug increase the risk of other cancers
(ovarian)?
The difficulty lies in finding a question that can be
transformed into a feasible and valid study plan.

Origins of a research question.
• For established investigators:
The best research questions usually emerge from
findings and problems faced and observed in prior
studies, and in those of other workers in the field
(the “major players”).
• For new and other investigators:
☼ Mastering the literature.
☼ Being alert to new ideas and techniques.
☼ Keeping the imagination roaming.
☼ Attending seminars, workshops and conferences.


Characteristics of a good research question “FINER Criteria”.
• Feasible: adequate number of subjects; adequate technical expertise; affordable in time and money; manageable in scope.
• Interesting: to the investigator.
• Novel: confirms or refutes previous findings; extends previous findings; provides new findings.
• Ethical.
• Relevant: to scientific knowledge; to clinical and health policy; to future research directions.

Developing the research question and study plan.

☼ A one- or two-page outline of the study question
and the study plan at an early stage is very
helpful.
☼ This will focus attention on clarifying the ideas
about the plan and on discovering potential specific
problems that need correction.


The research question should specify:
• Predictor (exposure), e.g., smoking
• Confounders, e.g., occupational hazards
• Outcome (disease), e.g., lung cancer

The research question and study plan: problems and solutions

Potential problem: the research question is not FINER
1- Not feasible
• Too broad: specify a smaller set of variables; narrow the question.
• Not enough subjects available: expand the inclusion criteria; eliminate or modify exclusion criteria; add other sources of subjects; lengthen the time frame for entry into the study; use strategies to decrease sample size.
• Methods beyond the skills of the investigator: collaborate with those who have the skills; consult and review the literature for alternative methods.
• Too expensive.
2- Not interesting, novel, or relevant: consult and modify the research question.
3- Uncertain ethical suitability: consult and modify the research question.
Potential problem: the study plan is vague

Exercise:
Consider the following research questions.
First, write each question in a single sentence
that specifies a predictor, outcome, and
population.
Then discuss whether it meets the FINER
criteria.
Rewrite the question in a form that overcomes
any problems in meeting these criteria.

Exercise:
A. What is the relationship between depression and health?
B. Does eating red meat cause cancer?
C. Does lowering serum cholesterol prevent heart disease?
D. Can a relaxation exercise decrease the anxiety associated with mammography?
E. Do contraceptive vaginal sponges prevent HIV infection?
F. Does dietary pattern among school children affect their health?


Assignment:
Formulate research questions regarding health
and health-related problems that may be
encountered in:
A. A rural community and the available health facilities.
B. An urban primary health care facility.
C. Primary schools.


II- Rationale (Significance).
This section sets the proposed study in context and gives
its rationale:
• What is known about the topic at hand?
• Why is the research question important?
• What kind of answers will the study provide?


Rationale “Background”
۞ This section cites previous research that is relevant
(including the investigator's own work) and indicates
the problems with that research and what questions
remain.
۞ It makes clear how the findings of the proposed study
will help by:
o Resolving uncertainties,
o Leading to new scientific understanding, and
o Influencing clinical and public health policy.


Sequence of the rationale
In a concise logical sequence:
• Discuss the importance of the topic (“significance”).
• Review the relevant literature and current
knowledge (including deficiencies in
knowledge that make the study worth doing).
• Describe any results you have already obtained
in the area of the proposed study.


Sequence of the rationale (continued)
• Indicate how the research question has emerged
and fits logically with the above.
• Outline in broad terms how you intend to
address the research question.
• Explain how your study will add to
knowledge and help to improve health
and/or save money.

How to determine research priorities?
(Importance/Significance)
I- How frequent is the condition relative to other
conditions?
Prevalence
As a cause of death
II- What is the degree of disability or dysfunction due to
the condition?
III- Are there cost-effective means to cure, control, or
prevent such condition?


Assignment:
State the rationale (significance) for the
proposed study question?


III- Setting up research objectives.
Purpose: broad objectives (aims)
☼ The statement of purpose of a research project should describe the
main questions to be addressed by the research without
going into details.
☼ It should give the reader a clear idea of the nature of the
research that will be undertaken.

“The purpose is to measure the effect of a Plasmodium falciparum asexual
blood-stage vaccine in reducing morbidity and mortality due to malaria.”
“This study is conducted to assess the nutritional problems among primary
school children.”

Specific objectives
The specific objectives should be SMART:
• Specific
• Measurable (effect size)
• Applicable, achievable
• Relevant
• Timely (a time frame and end point).

Objectives “characteristics”
• Descriptive studies: clear + complete + specific.
• Correlational studies (experimental and non-experimental): clear + complete + specific + identify the main variables to be correlated.
• Hypothesis-testing studies: clear + complete + specific + identify the main variables to be correlated + identify the direction of the relationship.

Specific objectives in research
They should include a concise but detailed
description of:
o The intervention (study) to be evaluated,
o The outcome (s) of interest,
o And the population in which the study will be
conducted.


Why is asthma among children in Istanbul
exceptionally frequent?

The purpose of the study is to determine if the
excess asthma in Istanbul is related to a
combination of genetic predisposition
(estimated by atopy) and socio-economic factors and/or
indoor air pollution.


What are the specific objectives needed to achieve such
a study?
I. Identify a suitable source of childhood asthma cases and select
200 cases, following a specific case definition.
II. Identify and select suitable control subjects (individuals without
asthma).
III. Measure indoor particulate exposure on each of 3 randomly
selected days for each participant.
IV. Perform allergy skin tests on cases and controls (atopy).
V. Record personal, demographic, and socio-economic information
about cases and controls.
VI. Compare atopy, low socio-economic status, and increased indoor
air pollution between cases and controls, expressing each comparison
as a ratio measure of risk (in a case-control design this is usually
the odds ratio).

Hypotheses and
Underlying Principles

Hypothesis definition
A hypothesis is written in such a way that it can be proven or
disproved by valid and reliable data; it is in order to obtain these data
that we perform our study (Grinnell 1988:200).
A hypothesis has certain characteristics:
1. It is a tentative proposition (“hunch”).
2. Its validity is unknown.
3. In most cases, it specifies a relationship between two or more
variables.


Functions of hypothesis
• Formulation of a hypothesis provides a study with focus
(“specific aspects of a research problem to investigate”).
• It tells you what data are necessary to collect to test the hypothesis.
• It enables you to conclude specifically what is true or what is
false.
Process of testing a hypothesis
• Phase I: Formulate your hunch or assumption.
• Phase II: Collect the required data.
• Phase III: Analyze the data to draw conclusions about the hunch (true/false).

Hypotheses
A hypothesis is the further formulation of the study
question into a final and more specific
version that summarizes the elements of the study:
the sample, the design, and the predictor and outcome variables.
The primary purpose is to establish the basis for
tests of statistical significance.

Hypotheses
I- Hypotheses are not needed in descriptive studies,
which describe how characteristics are distributed
in a population.
Example: the prevalence of a particular genotype among
patients with hip fracture.
II- Hypotheses are needed in most observational
and experimental studies that involve statistical
comparison.
Example: the study of whether a particular genotype is more
common in patients with hip fracture compared with
controls.

Hypotheses
If any of the following terms appear in the
research question, then the study is not descriptive
and a hypothesis should be formulated:
greater than, less than, causes, leads to, compared with, more
likely than, associated with, related to, similar to, or
correlated with.
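
As a rough illustration of this rule of thumb, the check can be sketched as a simple phrase search. The function name `needs_hypothesis` and the naive substring matching are assumptions made for this sketch; a real question still needs human judgment (for example, "cause" in the singular would not be matched).

```python
# Phrases taken from the slide; matching is a naive substring search.
COMPARATIVE_TERMS = (
    "greater than", "less than", "causes", "leads to", "compared with",
    "more likely than", "associated with", "related to", "similar to",
    "correlated with",
)

def needs_hypothesis(research_question: str) -> bool:
    """Return True if the question contains any of the comparative phrases."""
    text = research_question.lower()
    return any(term in text for term in COMPARATIVE_TERMS)

descriptive = "What is the prevalence of a particular genotype among patients with hip fracture?"
analytic = "Is a particular genotype more common in patients with hip fracture compared with controls?"

print(needs_hypothesis(descriptive))
print(needs_hypothesis(analytic))
```

The two example questions are adapted from the genotype and hip fracture examples in the slides.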


Characteristics of a good hypothesis
Simple, Specific, Stated in advance (3Ss)

A-Simple versus complex
A simple hypothesis contains one predictor and one outcome variable
(a sedentary lifestyle is associated with an increased risk of
proteinuria in patients with diabetes).

A complex hypothesis contains more than one predictor
(a sedentary lifestyle and alcohol consumption are associated
with increased risk of proteinuria in patients with diabetes).

Complex hypotheses
Or more than one outcome variable
(alcohol consumption is associated with an increased risk of
proteinuria and neuropathy in patients with diabetes).

Complex hypotheses cannot be readily tested with a
single statistical test and are more easily approached by
breaking them into two or more simple hypotheses.


Simple hypotheses
(smoking cigarettes, cigars, or a pipe is
associated with an increased risk of proteinuria
in patients with diabetes).

What type of hypothesis is this?


B-Specific versus Vague
• A specific hypothesis leaves no ambiguity about the
subjects, the variables, or how the test of statistical
significance will be applied.
• It uses concise operational definitions that summarize the
nature and source of the subjects and how variables will be
measured
(a history of using tricyclic antidepressant medications, as
measured by review of pharmacy records, is more common in
patients hospitalized with an admission diagnosis of
myocardial infarction at Longview Hospital in the past year
than in controls hospitalized for pneumonia).

Specific versus Vague


It is often obvious from the research hypothesis
whether the predictor variable and the outcome
variable are dichotomous, continuous, or
categorical.
(alcohol consumption (in mg/day) is associated with an
increased risk of proteinuria (> 30 mg/dL) in patients with
diabetes).


C-In Advance versus After-the-Fact
• The hypothesis should be stated in writing at the
outset of the study.
• A single prestated hypothesis creates a stronger basis
for interpreting the study results than several
hypotheses that emerge as a result of data inspection.
• Hypotheses that are formulated after data examination
are a form of multiple hypothesis testing that often
leads to over-interpreting the importance of the
findings.


Types of hypothesis
• Alternate hypothesis
• Null hypothesis
• Research hypothesis:
o Hypothesis of no difference (“null hypothesis”)
o Hypothesis of difference
o Hypothesis of point prevalence
o Hypothesis of association

Types of hypothesis “examples”
For each statement, which type of hypothesis is it?
• There is no significant difference in the proportion of
male and female smokers in the study population.
• A greater proportion of females than males are smokers
in the study population.
• A total of 60% of females and 30% of males in the study
population are smokers.
• There are twice as many female smokers as male
smokers in the study population.

Types of Hypotheses
1- Null and Alternative
I- The null hypothesis states that there is no association
between the predictor and outcome variables in the
population
(there is no difference in the frequency of drinking well water
between subjects who develop peptic ulcer disease and those who
do not).

II- It is the formal basis for testing statistical significance.
Statistical tests help to estimate the probability that an
association observed in a study is due to chance.
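
To make the role of the null hypothesis concrete, here is a minimal sketch of one common test, the two-proportion z-test, applied to the well water example. The test choice, the function name `two_proportion_p_value`, and all counts are assumptions for illustration; none of them come from the slides.

```python
from statistics import NormalDist

def two_proportion_p_value(exposed_a, n_a, exposed_b, n_b):
    """Two-sided P value for H0: the two groups have the same exposure proportion."""
    p_a, p_b = exposed_a / n_a, exposed_b / n_b
    # Under the null hypothesis both groups share one pooled proportion.
    pooled = (exposed_a + exposed_b) / (n_a + n_b)
    standard_error = (pooled * (1 - pooled) * (1 / n_a + 1 / n_b)) ** 0.5
    z = (p_a - p_b) / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Hypothetical counts: well water drinkers among 100 subjects with
# peptic ulcer disease and among 100 subjects without it.
p_value = two_proportion_p_value(30, 100, 15, 100)
print(f"P = {p_value:.3f}")
```

A small P value means that a difference this large would rarely arise by chance alone if the null hypothesis were true.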


Null and Alternative

o The proposition that there is an association is
called the alternate hypothesis.

o The alternative hypothesis cannot be tested
directly; it is accepted by default if the test of
statistical significance rejects the null
hypothesis. “accepted when null is rejected”


2- One and Two-sided alternative Hypothesis
I- A one-sided hypothesis specifies the direction of
the association between the predictor and the
outcome variables.
Drinking well water is more common among subjects who
develop peptic ulcer (one-sided).

II- A two-sided hypothesis states only that an
association exists; does not specify the direction.
The prediction that subjects who develop peptic ulcer
disease have a different frequency of drinking well water
than those who do not (two-sided).

Indications
For one-sided:
• When only one direction for an association is
important or biologically meaningful (a new drug
for hypertension is more likely to cause rashes
than a placebo).
• When there is good evidence from prior studies
that an association is unlikely to occur in one of
the two directions (smoking increases, rather than
decreases, the risk of brain cancer).

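Underlying Statistical Principles
[Figure: from research question to findings.
Research question (truth in the universe): target population, phenomena of interest.
Design leads to the study plan (truth in the study): intended sample, intended variables.
Implementation leads to the actual study (findings in the study): actual subjects, actual measurements.
Random and systematic error enter at both the design and the implementation stage. Findings in the study are used to infer truth in the study, which in turn is used to infer truth in the universe.]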

Underlying Statistical Principles
The analogy between a jury decision and a statistical test:
• Innocence: the defendant did not counterfeit money.
Null hypothesis: there is no association between dietary carotene and incidence of colon cancer.
• Guilt: the defendant did counterfeit money.
Alternative hypothesis: there is an association between dietary carotene and colon cancer incidence.
• Standard for rejecting innocence: beyond a reasonable doubt.
Standard for rejecting the null hypothesis: level of statistical significance (α ≤ 0.05).
• Correct judgment: convict a counterfeiter.
Correct inference: conclude that there is an association when one does exist in the population.
• Correct judgment: acquit an innocent person.
Correct inference: conclude that there is no association between carotene and colon cancer when one does not exist.
• Incorrect judgment: convict an innocent person.
Incorrect inference (Type I error): conclude that there is an association when there actually is none.
• Incorrect judgment: acquit a counterfeiter.
Incorrect inference (Type II error): conclude that there is no association when there actually is one.

Type I and type II error
A type I error (false-positive) occurs if the
investigator rejects a null hypothesis that is actually
true in the population.
A type II error (false-negative) occurs if the
investigator fails to reject a null hypothesis that is
actually not true in the population.


Truth in the population vs. the results in the study
sample (the four possibilities).
• Reject null hypothesis: correct if there is an association between predictor and outcome in the population; Type I error if there is no association.
• Fail to reject null hypothesis: Type II error if there is an association between predictor and outcome in the population; correct if there is no association.

α, β, and Power
• The probability of committing a type I
error (rejecting the null when it is
actually true) is called α (alpha); another
name is the level of statistical
significance.
• An α level of 0.05 sets 5 % as the
maximum chance of incorrectly
rejecting the null hypothesis.
• The probability of making a type II error (failing to reject the
null hypothesis when it is actually false) is called β (beta).
• The quantity (1 - β) is called power, the probability of
rejecting the null hypothesis in the sample if the actual effect
in the population equals the specified effect size.
• If β is set at 0.10, we are willing to accept a 10 % chance of
missing an association of a given effect size. This represents a
power of 90 % (there is a 90 % chance of finding an association
of that size).
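
The relationship between α, effect size, sample size and power can be sketched numerically. The example below assumes a two-sided z-test comparing the means of two equal-sized groups with a known common standard deviation, and uses the usual normal approximation that ignores the far tail. The function name and the numbers are illustrative assumptions, not part of the slides.

```python
from statistics import NormalDist

def approximate_power(effect, sd, n_per_group, alpha=0.05):
    """Approximate power (1 - beta) of a two-sided, two-group z-test."""
    z = NormalDist()
    z_critical = z.inv_cdf(1 - alpha / 2)          # about 1.96 when alpha = 0.05
    standard_error = sd * (2 / n_per_group) ** 0.5  # SE of a difference in means
    return z.cdf(abs(effect) / standard_error - z_critical)

# Hypothetical effect of half a standard deviation.
for n in (20, 50, 85):
    power = approximate_power(effect=0.5, sd=1.0, n_per_group=n)
    print(f"n per group = {n:3d}  power = {power:.2f}  beta = {1 - power:.2f}")
```

Under these assumptions, power rises with sample size for a fixed α and effect size, which is why sample size is planned together with α and β.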


P Value
• A “non significant” result (i.e., one with a P value
greater than α) does not mean that there is no
association in the population; it only means that
the result observed in the sample is small
compared with what could have occurred by chance
alone.
• Example: those with hypertension were twice as likely to
develop prostate cancer compared with normotensive
subjects (P = 0.08).


Sampling


In research, what are we looking for?
• A variable is a condition, quality or trait that
varies from one case to another and so provokes research
in the target population (population of interest).
• To study these variables, we either include the
whole population OR a sample.

Basic Terms and Concepts

Target Population and Sample
o A population is a complete set of units with a specified set of
characteristics, while a sample is a subset of that population.
o The defining characteristics of a population include geographic,
clinical, demographic and temporal characteristics.
o Clinical and demographic characteristics define the target
population, the large set of people throughout the world to
which the results will be generalized
(all teenagers with asthma).
o The study sample is the subset of the target population available
for study
(teenagers with asthma in the investigator's town in 2005).

Steps in designing the protocol for choosing the
study subjects
[Figure: Research question (truth in the universe): target population; specify clinical, demographic, and then geographic and temporal characteristics.
Design leads to the study plan (findings in the study): intended sample; specify the accessible population and the approach to selecting the sample.]

Selection Criteria
• How would you define the population to be studied?
• Through establishing selection criteria that include
inclusion and exclusion criteria.

Example:
What would the selection criteria be for subjects in a study to
evaluate the efficacy of calcium supplements for
preventing osteoporosis?

Designing selection criteria for a clinical trial of calcium
supplements to prevent osteoporosis

Inclusion criteria (be specific)
Considerations: specify the characteristics that define populations that are
relevant to the research question and efficient for study.
Example: a 5-year trial of calcium supplementation for preventing
osteoporosis might specify that the subjects be:
• Demographic characteristics (age, sex, and race): white females 50 to 60 years old.
• Clinical characteristics: in good general health.
• Geographic (administrative) characteristics: patients attending clinic at X Hospital.
• Temporal characteristics: between Jan. 1st and December 31st of next year.

Designing selection criteria for a clinical trial of calcium
supplements to prevent osteoporosis

Exclusion criteria (be parsimonious)
Considerations: specify the subsets of the population that will not be
studied, for the reasons listed below.
Example: the calcium supplementation trial might exclude subjects with
the characteristics given after each reason:
• A high likelihood of being lost to follow-up: alcoholic, or planning to move out of the country or region.
• An inability to provide good data: disoriented or have a language barrier.
• Being at high risk of side effects: sarcoidosis / hypercalcemia.
• Characteristics that make it unethical to withhold the study treatment: taking steroids.
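
The two tables above can be read as a simple eligibility rule: a subject must satisfy every inclusion criterion and none of the exclusion criteria. The sketch below encodes the calcium trial example that way. The `Candidate` record and its field names are hypothetical, and the temporal criterion (enrollment window) is left out for brevity.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    # Hypothetical screening record for one potential subject.
    age: int
    sex: str
    race: str
    good_general_health: bool
    attends_clinic_x: bool
    alcoholic: bool = False
    plans_to_move_away: bool = False
    disoriented_or_language_barrier: bool = False
    sarcoidosis_or_hypercalcemia: bool = False
    taking_steroids: bool = False

def meets_inclusion(c: Candidate) -> bool:
    """All inclusion criteria must hold (be specific)."""
    return (
        c.sex == "female"
        and c.race == "white"
        and 50 <= c.age <= 60
        and c.good_general_health
        and c.attends_clinic_x
    )

def is_excluded(c: Candidate) -> bool:
    """Any single exclusion criterion is enough (be parsimonious)."""
    return any((
        c.alcoholic,
        c.plans_to_move_away,
        c.disoriented_or_language_barrier,
        c.sarcoidosis_or_hypercalcemia,
        c.taking_steroids,
    ))

def is_eligible(c: Candidate) -> bool:
    return meets_inclusion(c) and not is_excluded(c)

eligible = Candidate(55, "female", "white", True, True)
on_steroids = Candidate(55, "female", "white", True, True, taking_steroids=True)
too_young = Candidate(45, "female", "white", True, True)
print(is_eligible(eligible), is_eligible(on_steroids), is_eligible(too_young))
```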

Clinical versus Community populations
• Clinical populations: if the research question involves
patients with a disease, hospitalized or clinic-based
patients are inexpensive and easy
to recruit, but selection factors
that determine who comes to the
hospital or clinic may have an
important effect.
Tertiary clinics tend to
accumulate patients with
serious forms of disease.
• Community populations: samples chosen in the
community to represent a non-clinical
population (population-based samples)
are difficult and
expensive to recruit, but they
are particularly useful for
guiding public health and
clinical practice in the
community.

Studying the whole population
• Resorted to if we are interested in the characteristics of each
individual, particularly with descriptive research questions, and
there is a need for generalizing the findings.
• Probability sampling is the gold standard.
It provides a rigorous basis for estimating the fidelity with
which phenomena observed in the sample represent those in
the population, and for computing statistical significance and
confidence intervals.
A. It is expensive.
B. It is time consuming.
C. It has higher chances of error because of the many persons,
the equipment and the wide geographic area covered.
D. It is carried out in censuses.

Sampling
Resorted to if we are interested in studying the prevalence of a
problem, associations, intervention effects, etc.
A. It is less expensive.
B. It is less time consuming.
C. It has lower chances of error because fewer persons, less
equipment and a smaller geographic area are involved.
D. Only estimates are obtained; the reality is unknown.
E. It allows for continuous study of the population
(“longitudinal studies”).

Study of a sample is carried out in the majority of
biomedical research.

The concept of sampling
[Figure: the study population is made up of sampling units.
1. You select a few sampling units from the study population (the sample).
2. You collect information from these people to find answers to your research questions.
3. You make an estimate (“prediction”) extrapolated to the study population (prevalence, outcomes, etc.).]

Principles of sampling
I. In a majority of cases of sampling there will be a
difference between the sample statistic and the true
population mean, which is attributable to the selection of
the units in the sample (“sampling error”).
II. The greater the sample size, the more accurate will be
the estimate of the true population mean (“reduction in
sampling error”).
III. The greater the difference in the variable
under study in a population (a “heterogeneous variable”),
for a given sample size, the greater will be the
difference between the sample statistic and the true
population mean (“the larger the sampling error”).
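
These principles can be illustrated with a small simulation. The sketch below builds two made-up populations with the same mean but different spread, draws repeated simple random samples, and reports the average absolute difference between the sample mean and the true mean. All names and numbers are assumptions chosen for illustration.

```python
import random
import statistics

def mean_sampling_error(population, sample_size, repeats, rng):
    """Average absolute gap between sample mean and true population mean."""
    true_mean = statistics.fmean(population)
    gaps = []
    for _ in range(repeats):
        sample = rng.sample(population, sample_size)  # without replacement
        gaps.append(abs(statistics.fmean(sample) - true_mean))
    return statistics.fmean(gaps)

rng = random.Random(0)
# Two hypothetical populations of 10,000 values with the same mean.
homogeneous = [rng.gauss(50, 5) for _ in range(10_000)]
heterogeneous = [rng.gauss(50, 20) for _ in range(10_000)]

small_n = mean_sampling_error(heterogeneous, 10, 300, rng)
large_n = mean_sampling_error(heterogeneous, 400, 300, rng)
narrow = mean_sampling_error(homogeneous, 10, 300, rng)

print(f"heterogeneous, n=10:  {small_n:.2f}")
print(f"heterogeneous, n=400: {large_n:.2f}")
print(f"homogeneous,   n=10:  {narrow:.2f}")
```

In runs like this, the error shrinks as the sample grows (principle II) and is larger for the more heterogeneous population at the same sample size (principle III).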

Types of sampling
• Random/probability sampling:
o Simple random sampling
o Stratified random sampling (proportionate or disproportionate)
o Cluster sampling (single stage, double stage, or multi-stage)
• Non-random/non-probability sampling:
o Quota
o Accidental
o Judgmental
o Snowball
• Mixed sampling:
o Systematic sampling

Types of Samples
• Probability samples:
Units are selected according to probability laws, i.e.
every unit in the underlying population has an equal (or a
specified) and independent chance of appearing in
the sample.
• Non-probability (convenience) samples:
Units are selected based on known factors.
In clinical research the study sample is usually made up of
people who meet the inclusion criteria and are easily
accessible to the investigator.


Probability Samples
In order to be able to infer from sample results to the
underlying population, that sample should be a
representative sample.

i.e. it should represent the population from which it is
drawn in every respect.
Because we cannot anticipate all characteristics of
the population that the sample should
represent, we choose a probability (random)
sample.


How to draw a probability sample?
I. Identify the study units
(individuals, villages, houses, etc.).
II. Make a complete list of the study units in the
underlying population. That complete list is
known as the sampling frame.
III. Each of these units is given a number.
IV. Then select the required number of units (sample
size) at random from that frame.


The selection of units can be made by:
1. The lottery method (“fishbowl draw”): the
numbers of the frame units are written on
identical pieces of paper, mixed
thoroughly in a bowl, and the required
number is blindly picked.
2. The use of random number tables.
3. Computer-generated random numbers.

Two systems of drawing a random sample:
Sampling without replacement.
Sampling with replacement.
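
The difference between the two systems can be sketched with Python's standard `random` module. The frame of 20 numbered units and the seed are arbitrary choices for illustration.

```python
import random

rng = random.Random(7)
frame = list(range(1, 21))  # a hypothetical sampling frame of 20 numbered units

# Without replacement: a drawn unit is not returned, so it appears at most once.
without_replacement = rng.sample(frame, 5)

# With replacement: each drawn unit is returned to the frame, so it may repeat.
with_replacement = rng.choices(frame, k=5)

print(without_replacement)
print(with_replacement)
```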

Random Sampling Techniques

1-Simple random sample
2-Stratified random sample
3-Systematic random sample
4-Cluster random sample
5-Multistage random sample


1-Simple random sample
We prepare a complete and up-to-date list of the underlying
population (sampling frame). The specified sample size is drawn
from that frame at random.

Disadvantages:
• Suitable only for a homogeneous population (e.g., single sex).
• A larger sample size is required.
• More expensive, as we have to get the cases from
widely scattered areas.
• Time consuming and more laborious.
• Some groups might not be represented in the sample.
• Extreme values can occur by chance.

Example of a simple random sample using a random digit table.

Draw at random a sample of size 50 from a
population of 10,000.
A. The size of the population is 10,000, i.e. it is formed of 5 digits.
B. Select at random a page from the random numbers table.
C. Select 5 adjacent columns.
D. Proceed from the top down; any value falling between 00001 and
10000 is chosen, and so on until you have completed your 50 cases.
E. Duplicate numbers are left aside.
F. Individuals with those 50 numbers compose our sample.
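
The same draw can be made with computer-generated random numbers instead of a printed table. This is a minimal sketch; the function name and the seed are assumptions, and the numbered frame is represented simply by the integers 1 to 10,000.

```python
import random

def simple_random_sample(population_size, sample_size, seed=None):
    """Draw distinct unit numbers from a frame numbered 1..population_size."""
    rng = random.Random(seed)
    # rng.sample draws without replacement, so duplicates cannot occur
    # and step E (setting duplicates aside) is handled automatically.
    return sorted(rng.sample(range(1, population_size + 1), sample_size))

chosen = simple_random_sample(10_000, 50, seed=2013)
print(len(chosen), "units selected; first ten:", chosen[:10])
```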

The first 15 columns of the first page of a random
numbers table

26804 00010 93445
90720 12805 58563
85027 32242 86468
09362 16212 00128
64590 75362 32348
29273 34703 23763
96215 01556 63708
59207 22211 48522
49674 01534 98685
04104 00047 14986

2-Stratified random sampling
o Based upon the logic of heterogeneity of the
included variables.
o Ensures homogeneity of sub-populations through
grouping them into strata.


2-Stratified random sample
• Ensures representativeness with regard to important
characteristics such as age, sex, educational or socioeconomic level.
• The population is divided into strata (subgroups)
according to the different levels of the important
variable. The population in each stratum is
homogeneous, so sampling accuracy is increased.
• We choose a simple random sample from each
stratum, the size of which is proportionate to the size
of that stratum.
• In other words, the sampling fraction is the same for
each stratum and the total sample:
$\frac{n}{N} = \frac{n_1}{N_1} = \frac{n_2}{N_2} = \frac{n_3}{N_3}$

Example of Stratified random sample
A town with a total population of 12,000 was classified into 4
homogenous socioeconomic strata. The population in each
stratum was 2,000 (class I), 4,000 (class II), 5,000 (class III)
and 1,000 (class IV) respectively. A sample size of 600 is to be
drawn from the town. Calculate the number of individuals to be
drawn at random from each of the 4 strata?

Sampling fraction $= \frac{600}{12{,}000} = \frac{1}{20}$
Stratum 1 sample $= 2000 \times \frac{1}{20} = 100$
Stratum 2 sample $= 4000 \times \frac{1}{20} = 200$
Stratum 3 sample $= 5000 \times \frac{1}{20} = 250$
Stratum 4 sample $= 1000 \times \frac{1}{20} = 50$

3-Systematic random sample
1. The underlying population is classified into intervals:
the size of each interval = the size of the population / the
required sample size.
2. The first case is selected at random from the first
interval, and the others are selected by
systematically adding the size of the interval.
3. Accordingly, we are taking every nth individual, where n is the
size of the interval. If the latter is 10, we take every tenth
observation.


Example of systematic random sample

1000 patients visit King Faisal University outpatient
clinics every day. We need a systematic random
sample of 100 patients. How should we proceed in
selecting the 100 patients that make up our sample?
We divide the patients into 100 intervals and select one patient from
each.
Size of each interval = 1000/100 = 10
Choose at random a number between 1 and 10, say 9: the 9th patient is selected first.
From the second interval choose patient number 19.
From the third interval choose patient number 29.
9 + 1 × 10 = 19th, 9 + 2 × 10 = 29th
OR
9 + 10 = 19th, 19 + 10 = 29th
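The selection rule in this example can be written as a short Python sketch. The function `systematic_sample` and its parameters are hypothetical names chosen for illustration, and the sketch assumes the population size is an exact multiple of the sample size, as in the example.

```python
import random


def systematic_sample(population_size, sample_size, start=None, rng=random):
    """Return the 1-based positions selected by systematic sampling.

    Assumes population_size is an exact multiple of sample_size.
    """
    interval = population_size // sample_size
    if start is None:
        # Random start within the first interval.
        start = rng.randint(1, interval)
    # Add the interval size repeatedly to the starting position.
    return [start + i * interval for i in range(sample_size)]


positions = systematic_sample(1000, 100, start=9)
print(positions[:3])  # first three selected patients
```

Passing `start=9` reproduces the worked example; leaving it out draws the starting point at random.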

4-Cluster random sample
۞ In this method, the sampling units are clusters (groups) of
individuals – (incomplete sampling frame and/or the total
sampling population is large) rather than individuals.
۞ The clusters (schools, houses, villages, …etc.) form the sampling
frame, from which the required number of clusters is selected at
random.
۞ All individuals in a cluster, a specific group, or a random sample
of them are included.
۞ Very useful when the population is widely dispersed, and it is
impractical to list and sample from all its elements.

Example of random cluster sample
In one study, the objective was to estimate the
prevalence of malnutrition among primary school
children in Hofuf. There are 200 primary schools
in Hofuf. The estimated sample size is 20
clusters.
Describe how you would proceed in drawing such a
sample.
A. List all 200 schools.
B. Give each a number.
C. Use a random numbers table to select the
20 schools, whose numbers will fall between 001
and 200.
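As a sketch of step C, a pseudo-random number generator can stand in for the random numbers table. The seed value and variable names below are arbitrary assumptions made only so the draw is repeatable; they are not part of the original example.

```python
import random

rng = random.Random(2013)          # arbitrary seed, only to make the draw repeatable
school_numbers = range(1, 201)     # schools numbered 001 to 200
# Draw 20 distinct school numbers without replacement.
selected_schools = sorted(rng.sample(school_numbers, 20))
print(selected_schools)
```

All children in each selected school, or a random sample of them, would then be examined.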

5-Multistage random sample
We use this method if the target population is spread
over a wide geographic area and the budget or
resources are limited (as in community-based surveys).
In this method, the sample is drawn in many stages.
The area is divided into smaller clusters, the clusters are
divided into smaller clusters and so on. Random
selection is carried out at each level successively.

You were asked to head a research team to
investigate the problem of handicap in
K.S.A. How would you proceed in drawing
your sample?
• List all governorates.
• Select 4 governorates at random.
• List the districts in each of the 4 governorates.
• Select a district from each governorate at random.
• List all villages and urban areas in each district.
• Select a village and an urban centre from each district randomly.
• Study all, or a sub-sample, of the individuals in the selected
  villages and urban centres.
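The stages listed for this example can be sketched as nested random draws. Everything in the sampling frame below (governorate, district, village and urban centre names, and their counts) is made up for illustration; only the order of the stages follows the example.

```python
import random


def multistage_sample(frame, n_governorates, rng):
    """Pick governorates, then one district each, then one village and one urban centre.

    frame: {governorate: {district: {"villages": [...], "urban": [...]}}}
    """
    selected = []
    for gov in rng.sample(sorted(frame), n_governorates):
        district = rng.choice(sorted(frame[gov]))
        areas = frame[gov][district]
        selected.append({
            "governorate": gov,
            "district": district,
            "village": rng.choice(areas["villages"]),
            "urban_centre": rng.choice(areas["urban"]),
        })
    return selected


# Hypothetical frame: 10 governorates, 3 districts each, 5 villages and 3 urban centres.
frame = {
    f"Governorate {g}": {
        f"District {g}.{d}": {
            "villages": [f"Village {g}.{d}.{v}" for v in range(1, 6)],
            "urban": [f"Urban centre {g}.{d}.{u}" for u in range(1, 4)],
        }
        for d in range(1, 4)
    }
    for g in range(1, 11)
}

sample = multistage_sample(frame, 4, random.Random(42))
for unit in sample:
    print(unit)
```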

II-Non-probability (convenience) samples






A convenience sample can minimize volunteerism and
other selection biases by consecutively selecting every
accessible person who meets the inclusion criteria.
A consecutive sample is especially desirable when it
amounts to taking the entire accessible population over
a long enough period to include seasonal variation or
other changes over time that are considered important to
the research question.
Representativeness is a matter of judgment.

Non-probability samples
These designs are used when the number of
elements in a population is either unknown or
cannot be individually identified.
• Quota sampling.
• Accidental sampling.
• Judgmental or purposive sampling.
• Snowball sampling.

Non-probability (convenience) samples
1-Purposive sample:
Chosen according to the investigator's judgement in
such a way that maximizes the chances of proving the
study hypothesis, e.g. “selecting patients with ESRD”.
2-Quota sample:
Involves only a few strata, e.g. men and women >20
years. The enumerators select any individual belonging
to those strata from whom they can get the required
information in an easy, quick and accessible way.

Sample size
How many observations should we include?
The greater the sample size:
I. The more precise are the estimates derived.
II. The more powerful are the tests (probability of
rejecting a false null hypothesis).
III. The larger the degrees of freedom and the smaller the test
statistic required.

Smaller standard error.
Higher costs, more time and effort needed.
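The link between sample size and standard error can be illustrated with the standard error of a mean, SE = s/√n. This is a minimal sketch: the standard deviation of 10 and the three sample sizes are assumed values chosen only to show the trend.

```python
import math


def standard_error(sd, n):
    """Standard error of a sample mean: sd / sqrt(n)."""
    return sd / math.sqrt(n)


# Hypothetical standard deviation of 10; quadrupling n halves the standard error.
for n in (25, 100, 400):
    print(n, standard_error(10.0, n))
```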

Sample size
The size of the sample depends on:
1. Study design,
2. Maximum tolerable sampling error,
3. Homogeneity of the population,
4. Number of variables studied,
5. The extent of breaking down the data in analysis,
6. Cost,
7. Available staff, equipment, time and tools,
8. Statistical tests used.

Data Collection Techniques and
Tools
Dr. Tarek Tawfik


Objective of data collection
techniques:
Allow the investigator to systematically
collect data about the subjects under
study, including the setting in which
they occur.

Methods of data collection
Primary sources
  Observation: participant, non-participant
  Interviewing: structured, unstructured
  Questionnaire: mailed, collective
Secondary sources
  Documents: government publications, earlier research, census,
  personal records, client histories, service records

Observation
• Participant: the researcher participates in the
activities of the group being observed, e.g.
“submitted to clinical examination to observe
practice of physicians”.
• Non-participant: the researcher is not involved in the activities and
remains a passive observer, e.g. “functions carried
out by nurses in a hospital”.

Problems with observation:
Hawthorne effect: change in behavior as a result
of the observation process.
Observer bias.
Inter-observer variation in interpretation.
Incomplete observation and /or recording “keen
observation with missing recording or vice
versa”.


Recording of observation
• Narrative: description of the process in the researcher's
own words; gives “deeper insight in interpretation and
conclusions”.
• Scales: interpreting in the form of ratings, using scales for
measurement. No in-depth interpretation; prone to the error of
central tendency and the halo effect.
• Categorical recording: yes/no, always/sometimes/never.
• Using mechanical devices: videotape; subjects may be “uncomfortable or
behave differently before a camera or cassette recorder”.

Scale “example”
Aggressive behavior of nurses in hospital Z

Positive   5   4   3   2   1   0 (Neutral)   1   2   3   4   5   Negative

Interviewing

Different levels of flexibility and specificity.

Unstructured interviews
- Flexible interview structure.
- Flexible contents.
- Flexibility in questions.
In-depth interviews, focus group discussion, narratives, oral histories.

Structured interviews
- Rigid interview structure.
- Rigid contents.
- Rigidity in questions and their wording.
Interview schedule, questionnaire.

Techniques of data collection
• Using the available information (records and registries).
• Observing and recording using an observation check list.
• Interviewing (face to face).
• Self-administered questionnaire.
• Telephone and net surveys.
• Focus group discussion.
• Measuring scales.
• Others (life histories, essays, case studies, and mapping).

(advantages and disadvantages)

Technique: Records and registries
Advantages:
1. Inexpensive.
2. Permit examination of past trends.
Disadvantages:
1. Not always accessible.
2. Ethical issues.
3. Incomplete and imprecise.

Technique: Observation
Advantages:
A. More detailed information.
B. Facts not mentioned by questioning.
C. Test reliability.
Disadvantages:
A. Ethical issues.
B. Observer bias.
C. Data collector may influence results.
D. Need training.

Technique: Personal interviewing
Advantages:
I. Suitable for illiterates.
II. Permits clarification.
III. High response rate.
Disadvantages:
I. Interviewer may influence results.
II. Less accurate recording than observation.
III. Needs trained personnel.

Technique: Self-administered questionnaire
Advantages:
1. Less expensive.
2. Permits anonymity.
3. Fewer personnel.
4. Eliminates bias.
Disadvantages:
1. Not suitable for illiterates.
2. Low response rate.
3. Problem of misunderstanding.

Technique: Focus group discussion
Advantages:
Collection of in-depth information and exploration.
Disadvantages:
1. Interviewer may influence results.
2. Open-ended questions.
3. Domination.
4. Non-response.

Technique: Measuring scale
Advantages:
o Precision.
o Eliminates bias.
Disadvantages:
o Training.
o Validity and accuracy.

Differentiation between data collection
techniques and tools.
Technique: Tools
Using available data: data compilation sheet.
Observation: check list, eye, watch, scales, microscope, pen and paper.
Interviewing: schedule, agenda, questionnaire, recorder.
Self-administered questionnaire: questionnaire.

Designing Questionnaire and Data
Collection Instruments.
In many instances the validity of the results
depends on the quality of the data collection
instruments.


Choosing between an interview schedule and a
questionnaire.
• Nature of the investigation: respondents may be reluctant to discuss
topics such as “sexuality, drug use”.
• Geographical distribution of the study population.
• The type of study population: “illiterate, young,
handicapped, very old”.

Administration of questionnaire.
Mailed or via other electronic media.
Collective administration “people attending
some function (schooling)”.
Administration in a public place “hospital,
medical center”.


Advantages and disadvantages of questionnaire.
Advantages
1. Less expensive.
2. Offers greater anonymity.

Disadvantages
1. Application is limited.
2. Response rate is low.
3. Self-selecting bias.
4. Opportunity to clarify is lacking.
5. Spontaneous responses are not allowed.
6. Possible to consult others.

Advantages and disadvantages of interview.
Advantages
1. More appropriate for complex situations.
2. Collecting in-depth information.
3. Information can be supplemented.
4. Questions can be explained.
5. Has a wider application “any type of population”.

Disadvantages
1. Time consuming and expensive.
2. Quality of data depends on the quality of interaction.
3. Quality of data depends on the quality of interviewer.
4. Quality of data may vary when many interviewers are used.
5. Interviewer bias.

Designing Good Questions and Instruments

Open-ended and Closed-ended Questions

Open-ended question:
Useful when it is important to hear what respondents
have to say in their own words:
What habits do you believe increase a person’s chance
of having a heart attack?
----------------------------------------------------------------
It leaves the respondent free to answer without limits that may be
imposed by the interviewer.

Designing Questionnaire and Data Collection Instruments.

Open-ended questions:
A. Often used in exploratory phases of question design
because they facilitate understanding a concept as
respondents express it.
B. Phrases and words used by respondents can form the
basis for more structured items in a later phase.

Disadvantage:
They usually require qualitative methods to code and
analyze the responses, which take more time and
subjective judgment than coding closed-ended
questions.

Designing Questionnaire and Data Collection Instruments.

Closed-ended questions:
More commonly used, and form the basis for most standardized measures.
Ask the respondent to choose one or more of the preselected answers:
Which one of the following do you think increases a
person’s chance of having a heart attack the most?
(Check one)
[ ] Smoking
[ ] Being overweight
[ ] Stress

Closed-ended questions:
They are quicker and easier to answer.
The answers are easier to tabulate and analyze.
The list of possible answers often helps to clarify the
meaning of the question.
Disadvantages:
i. They may lead the respondent, and do not allow respondents
to express their own, potentially unique answers.
ii. The potential responses listed may not include the
answer most appropriate for a particular
respondent.

Designing Questionnaire and Data Collection
Instruments.

Whenever there is a chance that the set of answers is
not exhaustive (does not include all the possible
options), include the option “Other (please specify)” or
“None of the above”.
When a single response is desired, the set of possible
responses should be mutually exclusive (the
categories should not overlap) to ensure clarity.
“All that apply” is used when multiple answers are allowed.

The Visual Analog Scale


Used for recording the answers to closed-ended questions using
lines or other drawings.



The participant is asked to mark a line at a spot, along the
continuum from one extreme to another, that best represents
his characteristics.



It is important that the words that anchor each end describe the
most extreme value for the item of interest.



The line is 10 cm long and score is the distance, in cm from the
lowest extreme.

Visual Analog Scale for Rating the
Severity of Pain
4- Please use an X to mark the place on this line that best describes the severity of
your pain in general over the past week.

None |------------------------------| Unbearable

A participant might answer as follows:

None |--------X---------------------| Unbearable

The line is 10 cm long and the mark is 3 cm from the “None” end (30% of the distance from
none to unbearable), so the respondent's pain would be recorded as having a severity of
30%.
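Scoring a visual analog scale is a simple proportion: the distance of the mark from the lowest extreme divided by the length of the line. The helper below is a hypothetical sketch; the function name and the default line length of 10 cm are assumptions taken from the example.

```python
def vas_score_percent(mark_cm, line_cm=10.0):
    """Convert the position of the mark (in cm from the lowest extreme) to a percentage."""
    if not 0 <= mark_cm <= line_cm:
        raise ValueError("the mark must lie on the line")
    return 100.0 * mark_cm / line_cm


print(vas_score_percent(3.0))  # mark placed 3 cm from the "None" end
```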

Formatting of questionnaire


It is customary to describe the purpose of the study and how
the data will be used in a brief statement on the cover (“the covering
letter”), together with the name of the institution, an assurance of
anonymity, a contact number for any questions, a return address, a
deadline date and thanks for participation.

To ensure accurate and standardized responses, all instruments
must have instructions specifying how they should be filled
out.
Sometimes it is helpful to provide an example of how to
complete a question, using a simple question that is easily
answered.



Formatting
• To improve the flow of the instrument, questions concerning
major subject areas should be grouped together and introduced by
headings or short descriptive statements. “Personal data include:
age, sex, educational status, marital status.”
• To warm up the respondent to the process of answering
questions, it is helpful to begin with emotionally neutral
questions such as self-rated health of functioning.
• More sensitive questions can be placed in the middle.
• Questions about personal characteristics such as income or
sexual function are often placed at the end of the instrument.

Formatting
The visual design should make it as easy as possible for the
respondent to complete all questions in the correct sequence.
With too complex a format, the respondent or interviewer may
skip questions, provide wrong answers, or even refuse to
complete the instrument.
An instrument with plenty of space is more attractive and easier
to use than one that is crowded.
When open-ended questions are used, the space for responding
should be big enough to allow respondents with large
handwriting to answer comfortably.

Formatting
People with visual problems, including the elderly, will
appreciate large type (font size 14) and high contrast (black
on white).
Possible answers to closed-ended questions should be lined
up vertically and preceded by boxes or brackets to check, or
by numbers to circle, rather than open blanks:
How many different medicines do you take every day?
(Check one)
[ ] None
[ ] 1-2
[ ] 3-4
[ ] 5-6
[ ] 7 or more

Formatting
The Branched Question:
Sometimes the investigator may wish to follow up certain
answers with more detailed questions.
The respondent's answer to the initial question (the screener) determines
whether they are directed to answer additional questions or to skip
ahead to later questions:
10- Have you ever been told that you have high blood pressure?
[ ] Yes
[ ] No
If yes, how old were you when you were first told that you had high blood pressure?
-------------- years old.
If no, go to question 11.
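In an electronic questionnaire, the same skip pattern could be expressed as a small routing function. This is only a sketch: the function name and the item label "10a" for the follow-up question are hypothetical, since the slide does not number the follow-up.

```python
def next_item(answer_to_q10):
    """Return the label of the next item after screener question 10."""
    answer = answer_to_q10.strip().lower()
    if answer == "yes":
        return "10a"  # hypothetical label for the age-at-diagnosis follow-up
    if answer == "no":
        return "11"   # skip ahead, as instructed on the form
    raise ValueError("answer must be 'yes' or 'no'")


print(next_item("Yes"))
print(next_item("No"))
```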

Wording

Clarity, Simplicity, Neutrality
Every word in a question can influence the validity and
reproducibility of the responses.
• A constructed question should be simple and free of ambiguity.
• It should encourage accurate and honest responses without embarrassing or
offending the respondent.

Clarity
o Questions must be as clear and specific as possible.
o Concrete words are preferred over abstract words:

“How much exercise do you usually get?”
is less clear than
“During a typical week, how many hours do
you spend exercising (e.g., vigorous walking
or sports)?”

Simplicity
Simple and common wording should be used to
convey the idea; avoid technical terms and jargon.
“Drugs you can buy without a doctor's prescription”
is clearer than “over-the-counter medications”.

The sentences should also be simple, using the fewest
words and the simplest grammatical structure.

Neutrality
Avoid loaded words and stereotypes that
suggest that there is a most desirable answer.

“During the last month, how often did you drink
too much alcohol?”
“During the last month, how often did you drink
more than five drinks in one day?”
The second is the less judgmental question.

Neutrality
It is useful to set a tone that permits the respondent to
express behaviors and attitudes that may be
considered undesirable.

“ People sometimes forget to take medications their
doctor prescribed. Do you ever forget to take your
medications?”


Avoid Pitfalls
I. Double-barreled questions.
II. Hidden assumptions.
III. The question and answer options do not match.
IV. Leading questions.

I- Double-Barreled Questions.
Each question should contain only one concept:
using “or” or “and” will lead to unsatisfactory responses.
“How many cups of coffee or tea do you drink
during a day?”
In this case you should ask two questions to assess the two
things.

II- Hidden Assumptions.
“How many cigarettes do you smoke in a day?”

“What contraceptives do you use?”

Each question assumes that the respondent smokes or uses contraceptives.

III-The question and answer options do not match.

“Have you had pain in the last week?”
The options (never, seldom, often, very often) do not fit the
question grammatically. Either the question should change to
“How often have you had pain in the last week?” or the
answer options should change to (yes, no).

The question and answer options do not match.
Question about intensity:

“I am sometimes depressed” (agree) (disagree).
For those who are often depressed, it is unclear how to
respond: disagreeing with this statement could mean that the
person is often depressed or never depressed.
The options should be (never, sometimes, often).

IV-Leading questions
A leading question is one whose contents, wording or
structure leads a respondent to answer in a
certain direction (“judgmental questions”).
“Unemployment is increasing, isn't it?”
“Smoking is bad, isn't it?”

Collecting data using attitudinal scales
Dr Tarek Tawfik


Function of attitudinal scales
Attitudinal scales measure the intensity of
respondents' attitudes towards the various
aspects of a given situation or issue, and provide
a technique to combine the attitudes
towards different aspects into one overall
indicator.

The aim is to develop an overall picture out of various
opinions and perspectives.

Developing a scale
1. Which aspects are to be measured?
2. Which procedures are adopted to combine these aspects
into an indicator for measurement?
3. How valid is the resulting scale?

Types of attitudinal scales
• Summated rating scale “Likert scale”
• Differential scale “Thurstone scale”
• Cumulative scale “Guttman scale”

I-Likert Scale

Basic Research
Designs
Dr. Tarek Tawfik


Definition of a research design

A traditional research design is a blueprint or detailed
plan for how a research study is to be completed:
o Operationalizing variables so they can be measured,
o Selecting a sample of interest to study,
o Collecting data to be used as a basis for testing
hypotheses, and
o Analyzing the results.

(Thyer 1993)

Types of study design (I)
Study designs, by classification base:

Number of contacts
  One: cross-sectional studies
  Two: before-and-after studies
  Three or more: longitudinal studies

Reference period
  Retrospective
  Prospective
  Retrospective-prospective

Nature of investigation
  Experimental
  Non-experimental
  Semi-experimental

Research designs (II)
Did the investigator assign exposure “intervention”?
  Yes: Experimental study. Random allocation?
    Yes: Randomized controlled trial (RCT)
    No: Non-randomized controlled trial
  No: Observational study. Comparison group?
    Yes: Analytical study. Direction?
      Exposure → outcome: Cohort study
      Exposure ← outcome: Case-control study
      Exposure and outcome at the same time: Cross-sectional study
    No: Descriptive study
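The decision questions in this flowchart can be read as a small classification routine. The sketch below is one possible encoding; the function name, parameter names and the three direction labels are assumptions introduced here for illustration.

```python
def classify_design(assigned_exposure, random_allocation=None,
                    comparison_group=None, direction=None):
    """Walk the flowchart questions and return the name of the study design."""
    if assigned_exposure:
        # Experimental branch: the only remaining question is random allocation.
        if random_allocation:
            return "Randomized controlled trial"
        return "Non-randomized controlled trial"
    # Observational branch: is there a comparison group?
    if not comparison_group:
        return "Descriptive study"
    designs = {
        "exposure_to_outcome": "Cohort study",
        "outcome_to_exposure": "Case-control study",
        "same_time": "Cross-sectional study",
    }
    if direction not in designs:
        raise ValueError("direction must be one of: " + ", ".join(designs))
    return designs[direction]


print(classify_design(False, comparison_group=True, direction="outcome_to_exposure"))
```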

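Phases and indications of basic study designs

Type of study: Cross-sectional
Timing: Cross-sectional
Form: Observational
Action in past time: -
Action in present time: Collect all information
Action in future time: -
Typical uses: Prevalence estimates; reference range; current health status

Type of study: Repeated cross-sectional
Timing: Cross-sectional
Form: Observational
Action in past time: Collect all information
Action in present time: Collect all information
Action in future time: Collect all information
Typical uses: Changes over time

Type of study: Cohort
Timing: Longitudinal (prospective)
Form: Observational
Action in past time: -
Action in present time: Define cohort and assess risk factors
Action in future time: Follow; observe outcome
Typical uses: Prognosis and natural history; etiology

Type of study: Case-control
Timing: Longitudinal (retrospective)
Form: Observational
Action in past time: Assess risk factors (traced back)
Action in present time: Define cases and controls (outcome)
Action in future time: -
Typical uses: Etiology, particularly for rare diseases

Type of study: C.T
Timing: Longitudinal (prospective)
Form: Experimental
Action in past time: -
Action in present time: Apply intervention
Action in future time: Follow; observe outcome
Typical uses: Clinical trials to assess therapy; trials to assess
preventive measures; lab. experiments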

Descriptive Studies
The Descriptive Pentad

Descriptive studies are “the first toe in the water”.

They are concerned with, and designed only to
describe, the existing distribution of variables
without regard to causal or other hypotheses.
A good descriptive study should answer five basic
“Ws”.

The Five Ws
Ws: Components
Who has the disease? Age, sex, and other characteristics.
What is the condition or disease being studied? A clear, specific, and
measurable case definition is essential.
Why did the condition or disease arise? Descriptive studies often provide
clues about cause that can be pursued with more sophisticated research designs.
When is the condition common or rare? Time provides important clues about
health events.
Where does or does not the disease or condition arise? Geography has a huge
effect on health.
So what? The implicit W relates to the public health effect.

Descriptive Studies

Deal with individuals:
• Case report
• Case-series report
• Cross-sectional (prevalence) studies
• Surveillance

Relate to the population:
• Ecological correlational studies

I- Case Report
The least publishable unit in the medical literature.
o An observant clinician reports an unusual disease or
association, which prompts further investigations with
a more rigorous study design.
o Example: benign hepatocellular adenoma and high-dose
contraceptive pills.
o Not all case reports deal with serious health threats,
however; some simply enliven the generally drab
medical literature.

What is the most probable diagnosis?

II-Case-series report
A case-series report aggregates individual cases in one
report.

Sometimes, the appearance of several similar cases
heralds an epidemic.
Example: a cluster of homosexual men in Los Angeles
with a similar syndrome alerted the medical
community to the HIV/AIDS epidemic in North America.

A case-series report is a stronger trigger for further
investigation than a single case report.
It can constitute the case group for a case-control study.

III- Cross-sectional (prevalence) Studies.

Prevalence studies describe the health of populations.
Examples: Health and Nutrition Examination Survey
(HNES), and Censuses.
These studies provide a snapshot of the population at a
particular time.

Both exposure and outcome are identified at one
point in time.
Particularly useful for estimating the point prevalence
of a condition in the population:

$$\text{Point prevalence} = \frac{\text{Number with the disease at a single time point}}{\text{Total number studied at the same time point}}$$

Design of a Cross-Sectional Study
Begin with: a defined population.
Gather data on exposure and disease.
End with four possible groups:
• Exposed: have disease
• Exposed: do not have disease
• Not exposed: have disease
• Not exposed: do not have disease
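Once the four groups have been counted, point prevalence can be computed overall and within each exposure group. The counts below are invented for illustration, and the function name is an assumption; the calculation itself is the point-prevalence ratio (number with the disease divided by number studied at the same time point).

```python
def point_prevalence(with_disease, total_studied):
    """Number with the disease divided by the total number studied at the same time."""
    if total_studied <= 0:
        raise ValueError("total_studied must be positive")
    return with_disease / total_studied


# Hypothetical counts for the four groups of a cross-sectional study.
exposed_disease, exposed_healthy = 30, 170
unexposed_disease, unexposed_healthy = 20, 380

total = exposed_disease + exposed_healthy + unexposed_disease + unexposed_healthy
overall = point_prevalence(exposed_disease + unexposed_disease, total)
in_exposed = point_prevalence(exposed_disease, exposed_disease + exposed_healthy)
in_unexposed = point_prevalence(unexposed_disease, unexposed_disease + unexposed_healthy)

print(f"overall: {overall:.3f}, exposed: {in_exposed:.3f}, not exposed: {in_unexposed:.3f}")
```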

Cross-sectional (prevalence) Studies.
Advantages
• Low cost.
• No follow-up is required.
• Quick.

Disadvantages
• Only association can be inferred, “not causation”.
• Temporal sequence is difficult to ascertain
“exposure-outcome sequence”.
• Incidence cannot be estimated
“occurrence of new cases over time”.
• Trend over time cannot be identified
“change of magnitude/pattern over time”.

Assignment:
o The New Valley Governorate is located in the Western Desert
of Egypt. Several reports have described grade II goiter among
primary school children, but little is known about the prevalence and
socio-demographic characteristics of the condition.
o Some clinicians have reported observing a large number of
cases of renal failure in the Manzala region on the northern
coast of the Nile delta; its prevalence and distribution
are unknown.
o Little is known about the magnitude of extrapulmonary
tuberculosis in Egypt.
According to the data given above, what is the most
appropriate study design?

IV-Repeated cross-sectional studies
“Longitudinal study”
• Studies that may be carried out at different time points to assess
trends over time.
• These studies involve different groups of individuals at each
time point.
• It can be difficult to assess whether apparent changes over
time simply reflect differences in the groups included in
the study rather than in the condition itself.

Longitudinal study design.

Study population (data collection) → interval → study population (data collection)
→ interval → study population (data collection) → interval → study population
(data collection)

Disadvantages:
1. Maturation effect: “maturation of responses in young subjects”.
2. Reactive effect: “instrument educates the respondents”.
3. Regression towards the mean: “shift of extreme attitudes and behavior towards the
average”.
4. Conditioning effect: “repeated contacting with same persons”.

V- Surveillance
The ongoing systematic collection, analysis, and
interpretation of health data essential to the
planning, implementation, and evaluation of
public health practices, closely integrated with
timely dissemination of these data to those
who need to know.

Passive: data gathered through the traditional channels, e.g.,
death certificates.

Active: searching for and reporting cases.

VI-Ecological Correlational Studies
• Look for associations between exposures and outcomes
in the population rather than in individuals.
• Can be a convenient initial search for hypotheses, as the
data are already collected.
• Use the correlation coefficient r, which indicates how linear
the relation between exposure and outcome is.

Examples:
• The mortality of coronary heart disease correlates with
per capita sales of cigarettes.
• Inverse correlation between access to safe abortion and
maternal mortality rate.

Ecological study: consumption of dietary fat and fast food
in a certain community → high mortality from coronary heart disease
(high incidence of MI).
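For reference, the correlation coefficient r (Pearson's r) can be computed from population-level figures as shown below. This is a minimal sketch: the per capita sales and mortality values are invented purely for illustration, and the function name is an assumption.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one variable does not vary")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical population-level data for five regions.
cigarettes_per_capita = [1200, 1800, 2300, 2900, 3400]
chd_deaths_per_100k = [120, 150, 210, 230, 300]

r = pearson_r(cigarettes_per_capita, chd_deaths_per_100k)
print(f"r = {r:.2f}")
```

A value of r close to +1 or -1 indicates a nearly linear relation at the population level; it says nothing about exposure and outcome in any individual.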

Ecological Correlational Studies
The two major limitations of this type of study are:
• The inability to link exposure to outcome in
individuals.
• The inability to control for confounders.

Death rates from coronary heart disease are positively
correlated with the number of color television sets per
capita????

VII- Before-and-after study design.
“pre-test/post-test design”
• The most appropriate design for measuring the impact
or effectiveness of a program.
• Described as two sets of cross-sectional data
collection on the same population to find out the
change in the phenomenon or variables between two
points in time.
• The change is measured by the difference between the
observations before and after the intervention.
• It can be experimental or non-experimental.
• Commonly used in evaluation studies.

Study population → program/intervention → study population (over time)
Before/pre observation: data collection, actual or recall.
After/post observation: data collection.

Disadvantages
• Two sets of data collection: more expensive and
more difficult to implement.
• Time lapse may cause attrition of participants.
• It only measures total change without ruling
out the role of other variables “confounders”.
• Maturation of the response of young
participants “maturation effect”.
• Reactive effect.
• Regression effect.

Uses of Descriptive Studies
Trend analysis: monitoring the health of the population, provided by ongoing
surveillance: epidemic syphilis in USSR, international
epidemic of multiple births and prematurity caused by assisted
reproductive technologies.
Planning: health services: laparoscopy, introduction of anti-HIV/AIDS
therapy.
Clues about cause: development of hypotheses: retrolental hyperplasia, and
painted radium dial watches.

Descriptive Studies.
Overstepping the data:
Post hoc inference: a temporal association is
incorrectly inferred to be a causal one.

Intake of 6 cups of coffee/day is associated with
lower risk of colonic cancer!!!!

• The role of the media,
• The damage to control efforts,
• Damage to public health.

Research design in relation to time
[Figure: timelines, relative to “Now”, of exposure and outcome for
concurrent, retrospective, and prospective designs.]

Finding Your Way in the Terminology Jungle
Case-control study = Retrospective study
Cohort study = Longitudinal study = Prospective study
Concurrent cohort study = Prospective cohort = Concurrent prospective
Retrospective cohort study = Historical cohort = Non-concurrent prospective
Randomized trial = Experimental study
Cross-sectional study = Prevalence study

Experimental or Observational Study
• Experimental studies involve the investigator intervening in
some way to affect the outcome.
• A clinical trial is an example of an experimental study in which
the investigator introduces some form of “treatment, vaccine,
new surgical procedure, change in the health policy or
introduction of behavioral interventions”.
• Other examples include animal studies or laboratory studies
that are carried out under experimental conditions.
• These studies provide the most convincing evidence for any
hypothesis because they make it possible to control confounders.

Experimental or Observational Study
• Observational studies (cohort or case-control studies)
are those in which the investigator does nothing to
affect the outcome, but simply observes what
happens.
• These studies provide poorer information than
experimental studies because it is often impossible to
control for all factors that may affect the outcome
(confounders).
• Epidemiological studies, which assess the relationship
between factors of interest and disease in the
population, are observational.

Observational (Analytical)
Studies.

Bias and Causal Associations in Observational
Research.
I-Validity and Reliability

Definitions: Validity

* Internal validity: the ability of the tool/test to measure what it sets
out to measure.
The inference from participants in a study should be accurate,
avoiding systematic errors and bias. Wrong extrapolation to the
general population is potentially dangerous.

** External validity: can results from study participants be
extrapolated to the reader's patients?
This concerns incorporating the results into clinical practice.

II-Bias
Bias in research denotes deviation from the truth: a systematic
difference between the results of a study and the truth.

All observational studies, and badly done randomized controlled
trials, have built-in bias.
The most commonly used classification of bias includes:
I. Selection bias,
II. Information bias,
III. Confounding.


I- Selection Bias
Are the groups similar in all important respects?
Selection bias stems from absence of comparability
between groups being studied.

In a cohort study, are participants in the exposed and
unexposed groups similar in all important respects except
for exposure?

In a case-control study, are cases and controls similar in all
respects except for the disease in question?


Selection Bias
Biases accompanying case-control studies:

Berkson bias (admission-rate bias): knowledge of the
exposure of interest might lead to an increased rate of
admission to hospital (admission preference for the disease of
interest).

Neyman bias (incidence-prevalence bias): arises when a
gap in time occurs between exposure and selection of study
subjects. This bias crops up in studies of diseases that are
quickly fatal, transient, or sub-clinical.
Example: myocardial infarction and its relation to snow shoveling.

Selection Bias
• Unmasking bias:
An exposure may provoke symptoms that prompt a search for the
outcome, so the outcome is detected more often among the exposed.
Example: estrogen replacement therapy and symptomless endometrial
cancer.
• Non-respondent bias:
In observational studies, non-respondents differ from
respondents.
Example: smokers are less likely to return questionnaires than are
non-smokers or pipe and cigar smokers.


II- Information Bias
Has the information been gathered in the same way?

Also known as observation, classification, or measurement bias;
it results from incorrect determination of exposure, outcome,
or both.

Information should be gathered in the same way for all groups
in any comparative study.

II- Information Bias (cont.)
Sources:
• Differences in information gathering between groups
(e.g., bedside interviews for cases but telephone interviews for controls).
• Diagnostic suspicion bias
(e.g., a more intensive search for HIV in drug addicts).
• Family history bias:
medical information flows differently to affected and non-affected
family members (e.g., rheumatoid arthritis).


Information Bias
Recall bias: cases are more motivated than healthy people
to search their memories in order to identify the cause
of their illness.
Observer bias: one observer consistently under-reports or
over-reports a particular variable, for example by
observing the exposed more meticulously than the
non-exposed.


Information Bias control
Possible maneuvers to lower information bias:
• Blinding the observer and the data gatherer.
• Using standardized instruments for data collection.
• Proper selection of the subjects.


III- Confounding.
Is an extraneous factor blurring the effect?

A confounding variable is associated with the exposure and it
affects the outcome, but it is not an intermediate link in the
chain of causation between exposure and outcome.
Examples (exposure and outcome, with the confounder):
Oral contraceptive use and myocardial infarction, confounded by smoking.
IUD insertion and salpingitis, confounded by STDs.

Confounding “Control”
• Restriction (exclusion or specification):
Enrollment with restricted selection criteria, e.g., including only non-smokers.
• Matching:
Pairwise matching (for every case who smokes, a control who
smokes is found).
• Stratification:
Used after completion of the study. Results are stratified by
the levels of the confounding factor.
• Multivariate analysis techniques:
Logistic regression, proportional hazards regression, and others.
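
Stratification can be illustrated with a small sketch. The counts below are hypothetical (they do not come from any study in these slides), and the Mantel-Haenszel summary odds ratio is used as one common way of pooling strata; the slides themselves do not name a pooling method.

```python
# Hypothetical case-control counts, one 2x2 table per stratum of the
# confounder (smoking). Each tuple is (a, b, c, d):
#   a = exposed cases,    b = unexposed cases,
#   c = exposed controls, d = unexposed controls.
strata = {
    "smokers": (40, 20, 20, 10),
    "non-smokers": (5, 25, 20, 100),
}


def odds_ratio(a, b, c, d):
    """Cross-product odds ratio for a single 2x2 table."""
    return (a * d) / (b * c)


def crude_odds_ratio(tables):
    """Odds ratio after collapsing all strata into one table."""
    a, b, c, d = (sum(column) for column in zip(*tables))
    return odds_ratio(a, b, c, d)


def mantel_haenszel_or(tables):
    """Mantel-Haenszel summary odds ratio across strata."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return numerator / denominator


tables = list(strata.values())
crude = crude_odds_ratio(tables)       # ignores the confounder
adjusted = mantel_haenszel_or(tables)  # stratified by the confounder

for name, table in strata.items():
    print(f"{name}: stratum OR = {odds_ratio(*table):.2f}")
print(f"crude OR = {crude:.2f}, Mantel-Haenszel OR = {adjusted:.2f}")
```

In this made-up data set the exposure appears harmful in the collapsed table, yet within each smoking stratum there is no association, which is the pattern expected when smoking confounds the comparison.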

Judgment of Associations
Bogus, indirect, or real?
Statistical associations do not imply causal associations.
Types of associations:
 Bogus or spurious associations:
Results of selection, information bias and chance.
 Indirect association:
Stems from confounding.
 Real associations.


Hill's Criteria for Real Associations
Temporal sequence:
Did exposure precede outcome? The cause must antedate the
outcome.

Strength of association:
How strong is the effect, measured as relative risk (>3 ) or odds
ratio (> 1)?

Consistency of association:
Has effect been seen by others? In different populations with
different study designs.


Biological gradient (dose-response relationship):
Does increased exposure result in more of the outcome?
Lung cancer and years of cigarette smoking.

Specificity of association:
Does exposure lead only to this outcome?
“Weak criterion: few exposures lead to only one outcome.”

Biological plausibility:
Does the association make sense?
“Weak criterion: limited by our lack of knowledge.”



Coherence with existing knowledge:
Is the association consistent with available evidence?
The effect of cigarette smoke on the bronchial epithelium of
animals is coherent with an increased risk of cancer in humans.

Experimental evidence:
Has a randomized controlled study been done?

Analogy:
Is the association similar to others?

Case-control Design
Research in Reverse
Dr. Tarek Tawfik


Examples of Topics Investigated with Case-control Studies

Exposure                          Outcome
Cat ownership in childhood        Schizophrenia, schizoaffective disorder, or bipolar disorder
Body mass index                   Pancreatic cancer
Physical disability               Earthquake mortality
Hiatus hernia                     Reflux oesophagitis
Hair dyes                         Connective tissue disorders
History of shingles               Systemic lupus erythematosus
Pig farming                       Nipah virus infection
Ghee applied to umbilical cord    Neonatal tetanus
Pickled vegetables                Esophageal cancer
Digital rectal examination        Metastatic prostate cancer
Statins for lipid lowering        Dementia
Paracetamol use                   Ovarian cancer
Phyto-estrogens                   Breast cancer prevention
Male condom use                   Genital warts
Physical activity                 Ovarian cancer
Sigmoidoscopy screening           Colon cancer
Influenza vaccination             Recurrent myocardial infarction prevention

Case-Control Studies
Structure
• A case-control study compares the characteristics of a
group of patients with a particular disease outcome
(the cases) to a group of individuals without the disease
outcome (the controls), to see whether any factors
occurred more or less frequently in the cases than in the
controls.
• Such retrospective studies do not provide information
on the prevalence or incidence of disease but may
give clues as to which factors elevate or reduce the
risk of disease.

Basic structure of case-control design

[Figure: the starting point is the present time. A sample of diseased
people (cases) and a sample of disease-free people (controls) are drawn
from the population and traced back in time to classify each person as
exposed or unexposed to the factor.]

Diseased (cases): exposed to factor (a), unexposed to factor (b).
Disease-free (controls): exposed to factor (c), unexposed to factor (d).

The odds (“chance”) of exposure are calculated for both groups, and the
difference in odds is compared for each exposure included.

Selection of Cases

Incident cases: patients who are recruited at the time of diagnosis.
1. Less recall bias
2. Less altered behavior
3. But we have to wait for cases to be diagnosed

Prevalent cases: patients who were already diagnosed before entering the study.
1. Recall bias
2. Altered behavior
3. Risk factors may be related more to survival than to
development of the disease
Selection of Cases (cont.)

Sources of cases: hospital patients, patients in physicians'
practices, clinic patients.

Problems:
* Single or multiple hospitals:
certain risk factors may be concentrated in some hospitals
more than in others.

* Tertiary health care facility:
there is a tendency to select severely ill cases, so any risk
factors identified may apply only to these severe forms of
the disease.

Selection of Controls

Non-hospitalized persons:
• Community-based probability sample: school rosters, selective
service lists, insurance company lists.
• Neighborhood controls: door-to-door approach or random digit
dialing (similar socio-economic and cultural background).
• Best-friend controls: similarity in demographic characteristics
(lifestyle pattern).
• Spouse or sibling controls: sibling controls may provide some
control over genetic differences between cases and controls.

Hospitalized persons:
• A captive population: they represent a sample of an ill population,
and hospital patients differ from people in the community.
• Should controls be a sample of all other patients admitted, or
should specific other diagnoses be selected?

Problems in Controls Selection



When a difference in exposure is observed between
cases and controls, we must ask whether the level of
exposure observed in the controls is really the level
expected in the population in which the study was
carried out.

Because of the manner of selection, the controls may have
a particularly high or low level of exposure that is not
representative of the level in that population.

Distribution of Cases (cancer of the pancreas) and Controls by Coffee-drinking Habits and Estimates of Risk Ratios

                            Coffee consumption (cups/day)
Sex      Category           0      1-2       3-4       >5        Total
Male     No. of cases       9      94        53        60        216
         No. of controls    32     119       74        82        307
         Adjusted RR        1.0    2.6       2.3       2.6       2.6
         95% CI             -      1.2-5.5   1.0-5.3   1.2-5.8   1.2-5.4

Female   No. of cases       11     59        53        28        151
         No. of controls    56     152       80        48        336
         Adjusted RR        1.0    1.6       3.3       3.1       2.3
         95% CI             -      0.8-3.4   1.6-7.0   1.4-7.0   1.2-4.6
Estimates of Relative Risk of Cancer of the Pancreas Associated
with Use of Coffee and Cigarettes

                            Coffee drinking (cups/day)
Cigarette smoking status    0      1-2             >5              Total
Never smoked                1.0    2.1             3.1             1.0
Ex-smokers                  1.3    4.0             3.0             1.3
Current smokers             1.2    2.2             4.6             1.2 (0.9-1.8)
Total “RR/95% CI”           1.0    1.8 (1.0-3.0)   2.7 (1.6-4.7)

Matching


Matching is the process of selecting controls so that they are
similar to the cases in certain characteristics, such as age, race,
sex, socioeconomic status, and occupation.

Its aim is to nullify differences in characteristics or exposures
other than the one that has been targeted for study.


Types of Matching

Group matching (frequency matching):
Controls are selected so that the proportion of controls
with a certain characteristic is identical to the proportion of
cases with that characteristic; if 25% of cases are
married, then 25% of controls are married.
All cases should be selected first, and the proportions
are then calculated.

Individual matching (matched pairs):
For every case included, a control matched on the chosen
characteristics should be selected; for a 45-year-old white
female case, we seek a 45-year-old white female control.
Used in hospital-based case-control studies.

Problems with Matching

Practical problems:
When too many characteristics are matched, it becomes very
difficult or impossible to identify an appropriate control.
Example: a 48-year-old black female, married, with 4
children, who lives in zip code 21209 and works in a
photo-processing plant. Find her control?

Conceptual problems:
Once we have matched controls to cases on a given
characteristic, we cannot study that characteristic.
Example: marital status and breast cancer. If cases and
controls are matched on marital status, we can no longer
study marital status as a risk factor. Why?
Matching ensures the same prevalence of that characteristic
in both cases and controls.

Uses of Multiple Controls
In case-control studies we usually use more than
one control per case to increase the power of
the study.


1-Multiple controls of the same type.
The power of the study increases as more controls are included
for each case, up to about 4 controls per case.
Why not keep the ratio of controls to cases at 1:1 and just increase
the number of cases?
1. For many rare diseases (cancer, connective tissue disorders), the
number of cases available for study is limited.
2. The limited time frame of the study may not allow more cases
to be included.
3. In the absence of multi-center collaboration, the remaining
option is to increase the number of controls.
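
Why the gain levels off at around 4 controls per case can be sketched with a commonly quoted rule of thumb. This is an assumption brought in for illustration, not something stated in the slides: with r controls per case, a study retains roughly r/(r+1) of the statistical efficiency it would have with an unlimited number of controls. The function name is hypothetical.

```python
def relative_efficiency(controls_per_case: int) -> float:
    """Approximate efficiency relative to an unlimited number of controls.

    Rule-of-thumb approximation r / (r + 1); an illustrative assumption,
    most accurate when the true association is weak.
    """
    return controls_per_case / (controls_per_case + 1)


for r in (1, 2, 3, 4, 5, 10):
    print(f"{r} control(s) per case: efficiency ~ {relative_efficiency(r):.2f}")
```

Under this approximation, going from 1 to 4 controls per case raises efficiency from 0.50 to 0.80, while a fifth control adds only about 0.03 more.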

2-Multiple Controls of Different Types

The use of both hospital and neighborhood controls:

To assess the level of exposure among the different
control groups in relation to the cases.

Cases are compared with hospital controls, then with
neighborhood controls, to assess any discrepancy in the
level of exposure; if one is present, the reason should be
sought.


Nested Case-Control Studies

[Figure: a population (cohort) is followed over time. Initial data
and/or specimens are obtained at baseline. Those who develop disease
become the cases; a subgroup of those who do not develop disease is
selected as controls.]

Advantages of Nested Case-Control Design

Interviews are performed at the beginning of the study
(baseline), so the data are obtained before any disease has developed
and the problem of possible recall bias is eliminated.
If abnormalities in biologic characteristics are found in specimens
obtained years before the development of clinical disease, it is
more likely that these findings represent risk factors or other
pre-morbid characteristics than manifestations of early, sub-clinical disease.

The temporal sequence between exposure and disease is clear, whereas
a temporal association cannot be concluded from the ordinary
case-control design.

More economical to conduct.

Assignments:
• The risk factors for end-stage renal disease are largely
unknown. Describe a study to identify such factors.
• The prevalence of iodine deficiency disorders shows a
geographic discrepancy between Jeddah and Qaseem. Suggest
a design to explore this discrepancy.
• A cross-sectional study reported a difference in dietary fat
intake among obese subjects. How would you confirm this difference?


Cohort Study Design
Dr. Tarek Tawfik


Cohort study
(marching towards outcomes)
• The term cohort has military, not medical, roots.
• A cohort was a 300-600-man unit in the Roman
army; ten cohorts formed a legion.
• A cohort study consists of bands or groups of
persons marching forward in time from an
exposure to one or more outcomes.


Basic Structure of cohort study

[Figure: the starting point is the present time. A disease-free sample
is drawn from the population, classified as exposed or unexposed to the
factor, and followed into the future.]

Exposed to factor: develop disease (a), remain disease-free (b).
Unexposed to factor: develop disease (c), remain disease-free (d).

The incidence of disease in each group is compared, and the relative
risk is calculated for the exposure.

[Figure: incidence of lung cancer in the population and in the cohort.]


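Direction

[Figure: timelines showing the direction of cohort studies relative to
exposure and outcome. Prospective: exposure is identified first and
outcomes are followed forward in time. Retrospective: exposure and
outcome have both already occurred when the study starts. Ambi-directional:
data are collected in both directions (short-term and long-term effects).]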

Design of Cohort
First select the exposed and not exposed groups, then follow to see
whether disease develops.

               Disease     Disease does    Totals    Incidence rate
               develops    not develop               of disease
Exposed        a           b               a+b       a/(a+b)
Not exposed    c           d               c+d       c/(c+d)

Data collection in cohort: forwards and backwards
A cohort study follows up two or more groups from
exposure to outcome.

In the simplest form, it compares the experience of a
group exposed to some factor with that of another group
not exposed to that factor.
The frequency of the outcome in the exposed group, whether
higher or lower than in the unexposed group, gives
evidence of an association between exposure and outcome.

In general, the cohort should always move in the
same direction, although the data gathering might not.

Cohort versus Randomized Trials
Both types compare exposed with non-exposed groups (or a
group with a certain exposure to a group with another exposure).
For ethical and other reasons, we cannot randomize
people to receive a putatively harmful substance
(e.g., a carcinogen), so the exposure in RCTs is usually a
treatment or preventive measure.
In cohort studies investigating etiology, the “exposure” is often
a toxic or carcinogenic agent.

The difference between the two designs is the presence or
absence of randomization, which is critical in interpreting the
study findings.

Selection of Study Population
Comparison of outcomes in an exposed group
and a non-exposed group (or a group with a certain
characteristic and a group without).

The study population can be formed in two ways:
1. Create a study population by selecting groups for inclusion
on the basis of whether or not they were exposed
(e.g., occupationally exposed cohorts).
2. Select a defined population before any of its members become
exposed or before their exposures are identified, selecting by a
factor not related to exposure (e.g., residence); then take
histories or perform tests and separate the population into
exposed and non-exposed.

In both cases we wait for the outcome.

Types of Cohort Studies

Concurrent (concurrent prospective), non-randomized.
Example: smoking and lung cancer, using a defined population of
elementary school children.
2000: the defined population is identified.
2010: members are classified as exposed (smokers) or non-exposed (non-smokers).
2020: disease or no disease is ascertained in each group.
Time frame for a hypothetical concurrent cohort study begun in 2000.

Retrospective (historical), non-randomized.
1980: defined population (an old roster of elementary school children is found).
1990: members were surveyed for smoking habit and classified as
exposed (smokers) or non-exposed (non-smokers).
2000: disease or no disease is ascertained in each group.
Time frame for a hypothetical retrospective cohort study begun in 2000.

Advantages of Cohort Design
I. The best way to ascertain both the incidence and the natural
history of a disease (the temporal sequence between the
putative cause and the outcome is usually clear).
II. Useful in investigating multiple outcomes that might
arise after a single exposure (sometimes misleading).
III. Useful in the study of rare exposures.
IV. Reduces the risk of survival bias (diseases that are rapidly
fatal are difficult to study because of this factor).
V. Allows calculation of incidence rates, relative risks, and
confidence intervals.
VI. Other outcome measures include life table rates, survival
curves, and hazard ratios.

Potential Biases in Cohort Studies
1) Bias in assessment of the outcome (blinding or
masking is used to avoid it).
2) Information bias (particularly in historical or
retrospective cohorts).
3) Bias from non-response and losses to follow-up
(attrition).
4) Analytic bias (blinding is needed).


When Is A Cohort Study Warranted?
A. When good evidence suggests an association of a
disease with a certain exposure (from clinical
observations, case-control studies, or other types of
studies).
B. When we are able to minimize attrition of the study
population.
C. When the interval between exposure and
development of the outcome is relatively short.


What To Look For In Cohort Studies

Who is at risk?
All participants in a cohort study must be at risk of
developing the outcome.

Who is exposed?
A clear, unambiguous definition of exposure is required at the
outset (sometimes quantifying the exposure by degrees, rather
than yes/no).

Who is an appropriate control?
The unexposed should be similar to the exposed in all
aspects except for the exposure. Controls may come from either
internal or external sources. Beware of the healthy worker effect.

Have outcomes been assessed equally?
Outcomes must be defined in advance; they should be
clear, measurable, and specific.

Reporting of Cohort Studies









The first table in a report should provide demographic and
other prognostic factors for both groups, with hypothesis testing
(P values) to show the likelihood that observed differences
could be due to chance.
For dichotomous outcome measures (sick/well), provide raw
data sufficient for the reader to confirm the results.
For cumulative incidence, calculate the proportion who develop
the outcome during the specified study interval.
For incidence rates, the value is expressed per unit of time.
Relative risks and confidence intervals should be provided.

Use of P values should not replace interval
estimation (relative risk with confidence interval).
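
The two incidence measures named above can be sketched as follows. The function names and all counts are hypothetical illustrations, not values or an implementation taken from the slides.

```python
def cumulative_incidence(new_cases: int, at_risk_at_start: int) -> float:
    """Proportion of the initially at-risk group that develops the outcome
    during the specified study interval."""
    return new_cases / at_risk_at_start


def incidence_rate(new_cases: int, person_time: float, per: float = 1000) -> float:
    """New cases per `per` units of person-time (e.g. per 1,000 person-years)."""
    return new_cases / person_time * per


# Hypothetical cohort: 50 new cases among 2,000 people at risk,
# who together contributed 9,500 person-years of follow-up.
ci = cumulative_incidence(50, 2000)
rate = incidence_rate(50, 9500.0)

print(f"Cumulative incidence over the study interval: {ci:.3f}")
print(f"Incidence rate: {rate:.2f} per 1,000 person-years")
```

Cumulative incidence is a proportion tied to a stated interval, whereas the incidence rate has person-time in its denominator, which is why it must be reported per unit of time.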

How to Choose the Study Design?

Study design             Selection of          Information collected   Information collected
                         subjects by status    on exposure             on disease
Cross-sectional          No                    Current                 Current
Case-control             Disease               Past                    Current
Cohort: prospective      Exposure              Current                 Future
Cohort: retrospective    Exposure              Past                    Current

How to Choose the Study Design? (cont.)

Options              Case-control    Concurrent cohort    Retrospective cohort
Study time           Short           Long                 Short
Cost                 Low             High                 Low
Rare diseases        Yes             No                   No
Sample size          Small           Large                Large
Loss to follow up    No              Yes                  Yes
Incidence            No              Yes                  Yes
Relative risk        Approx.         Yes                  Yes

Experimental study
design
Dr Tarek Tawfik


Experimental study designs

[Figure: in experimental studies, the study population is randomized, a
treatment/intervention/program is applied, and the outcome/impact/change
(effect) is measured. In non-experimental studies there is no
randomization, and causes/associations are explored.]

Experimental: starts from the cause and moves to the effect.
Non-experimental: starts from the effects and traces the cause.
Semi (quasi) experimental: a mix of both.

The concept of Randomization
[Figure: each member of the study population is randomized to either
Group A or Group B.]

Any individual or unit of study population has an equal and independent
chance of becoming a part of an experimental or control group, or in the case
of multiple treatment modalities, any treatment has an equal and independent
chance of being assigned to any of the population groups.
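
A minimal sketch of simple random allocation into two groups follows, assuming a list of hypothetical participant identifiers and an equal split. Real trials typically use more elaborate schemes (for example blocked or stratified randomization); this only illustrates the "equal and independent chance" idea.

```python
import random


def randomize(participants, seed=None):
    """Shuffle the participants and split them into two groups.

    Every ordering is equally likely, so each participant has the same
    chance of landing in either group. `seed` only makes the sketch
    reproducible.
    """
    rng = random.Random(seed)
    shuffled = list(participants)  # copy so the input is left untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Hypothetical participant identifiers
participants = [f"P{i:02d}" for i in range(1, 21)]
group_a, group_b = randomize(participants, seed=2013)

print("Group A (experimental):", sorted(group_a))
print("Group B (control):", sorted(group_b))
```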

The control group design
“the control experimental design”

[Figure: the study population is divided into an experimental group
(intervention arm) and a control group (no intervention). Baseline data
are collected in both groups, the independent variable (the intervention)
is applied to the experimental group only, and the dependent variables
(“outcome”) are then measured in both groups.]

The chief objective of the control group is to quantify the impact of extraneous factors
(possible confounders), which helps to isolate the impact of the intervention itself.

The placebo design
• A patient's belief that he or she is receiving treatment can
play an important role in recovery from an
illness, even if the treatment is ineffective
(a psychological effect known as the “placebo effect”).
• The placebo design attempts to determine the
extent of this effect.

The placebo design

[Figure: three groups are compared. In the experimental group the
outcome reflects the treatment, the placebo effect, and confounders; in
the placebo group it reflects the placebo effect and confounders; in the
control group it reflects confounders only. Subtracting one group's
outcome from another's isolates the effect of the treatment or of the
placebo.]

Cross-over comparative design
• Denial of treatment to the control group may be considered
unethical.
• Denial of treatment may be unacceptable to some
individuals in the control group, which could result in
drop-outs.
• The cross-over experimental design makes it
possible to measure the impact of a treatment without
denying treatment to any group.
• The design is based upon the assumption that participants
at different stages are similar in terms of their
characteristics and the problem for which they are
seeking intervention.

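Cross-over experimental design

[Figure: the study population is split into two groups. One group
receives Drug A and the other receives placebo, and the outcome is
measured. After a washout period the groups switch, and the outcome is
measured again.]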

Meta-analysis and systematic
review.


Estimating Risk
Dr. Tarek Tawfik


Absolute Risk
The incidence of a disease in a population is termed absolute risk.

It can indicate the magnitude of the risk in a group of people with
a certain exposure, but:
• It does not take into consideration the risk of disease in the
non-exposed individuals.
• It does not indicate whether the exposure is associated with an
increased risk of disease.

Absolute risk does not stipulate an explicit comparison.
Example: rubella in the 1st trimester: what is the risk that my child
will be malformed? The decision about abortion will be based on this
information.

Determining whether a certain disease is associated
with a certain exposure.
By using case-control and cohort studies we can assess
whether there is an excess risk of disease in persons who have
been exposed.
We have to compare the risks among different groups
to assess the presence of excess risk (by calculating the
incidence rates, or “attack rates”, and the difference in the risks).

So, estimation of the relative risk is vital in determining who
will be at higher risk following the exposure.

Relative Risk (concept)
o Both case-control and cohort studies are designed to
determine whether there is an association between
exposure to a factor and development of a disease.

If an association exists, how strong is it?
o If we carry out a cohort study, we can put the question
another way: what is the ratio of the risk of disease in
exposed individuals to the risk of disease in non-exposed
individuals? This ratio is called the relative risk.

Relative risk = (Risk in exposed) / (Risk in non-exposed)

Interpreting the Relative Risk
(measures the strength of the association)
If RR = 1: risk in exposed equal to risk in non-exposed (no association).
If RR > 1: risk in exposed greater than risk in non-exposed (positive
association; possibly causal).
If RR < 1: risk in exposed less than risk in non-exposed (negative
association; possibly protective).

Calculating the Relative Risk in Cohort
Studies
First select the exposed and not exposed groups, then follow to see
whether disease develops.

               Disease     Disease does    Totals    Incidence rate of disease
               develops    not develop
Exposed        a           b               a+b       a/(a+b) = incidence in exposed
Not exposed    c           d               c+d       c/(c+d) = incidence in non-exposed

Hypothetical Cohort
3,000 smokers and 5,000 non-smokers are followed to investigate the relation
of smoking to the development of coronary heart disease (CHD) over a 1-year period.

                           Develop CHD    Do not develop CHD    Totals    Incidence per 1,000/year
Smoke cigarettes           84             2,916                 3,000     28.0
Do not smoke cigarettes    87             4,913                 5,000     17.4

Incidence among the exposed = 84/3,000 = 28.0 per 1,000
Incidence among the non-exposed = 87/5,000 = 17.4 per 1,000

Relative risk = (Incidence in exposed) / (Incidence in non-exposed) = 28.0/17.4 = 1.61
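
The arithmetic of the hypothetical cohort above can be reproduced with a short sketch. The function names are illustrative choices; the counts are the ones given on the slide.

```python
def incidence_per_1000(cases: int, total: int) -> float:
    """Incidence expressed per 1,000 persons over the follow-up period."""
    return 1000 * cases / total


def relative_risk(a: int, b: int, c: int, d: int) -> float:
    """Relative risk from a cohort 2x2 table.

    a = exposed who develop disease,   b = exposed who do not,
    c = unexposed who develop disease, d = unexposed who do not.
    """
    incidence_exposed = a / (a + b)
    incidence_unexposed = c / (c + d)
    return incidence_exposed / incidence_unexposed


# Counts from the hypothetical smoking and CHD cohort
inc_smokers = incidence_per_1000(84, 3000)
inc_nonsmokers = incidence_per_1000(87, 5000)
rr = relative_risk(a=84, b=2916, c=87, d=4913)

print(f"Incidence in smokers:     {inc_smokers:.1f} per 1,000")
print(f"Incidence in non-smokers: {inc_nonsmokers:.1f} per 1,000")
print(f"Relative risk:            {rr:.2f}")
```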

Example: the British Heart Study
A large cohort study of 7735 men aged 40-59 years
randomly selected from general practices in 24 British
towns, with the aim of identifying risk factors for ischemic
heart disease. At recruitment to the study, the men were
asked about a number of demographic and lifestyle factors,
including cigarette smoking habits.
Of the 7718 men who provided information on smoking
status, 5899 (76.4 %) had smoked at some stage during their
lives (including those who were current smokers and those
who were ex-smokers).
Over subsequent 10 years, 650 of these 7718 men (8.4 %)
had a myocardial infarction (MI).

                               MI in subsequent 10 years
Smoking status at baseline     Yes           No              Total
Ever smoked                    563 (9.5%)    5336 (90.5%)    5899
Never smoked                   87 (4.8%)     1732 (95.2%)    1819
Total                          650 (8.4%)    7068 (91.6%)    7718

The estimated relative risk = (563/5899) / (87/1819) = 2.00
CI = 1.60-2.49 (does not include 1)

A middle-aged man who has ever smoked is twice as likely to suffer
an MI over the next 10-year period as a man who has never smoked.
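
The slide quotes a confidence interval without showing how it was obtained. One common approach, assumed here rather than confirmed by the source, is a 95% interval computed on the log scale using the standard error of ln(RR). With the counts from this study, that method gives limits of about 1.60 and 2.49.

```python
import math


def relative_risk_ci(a: int, n_exposed: int, c: int, n_unexposed: int, z: float = 1.96):
    """Relative risk with a log-scale confidence interval.

    a = events among the exposed,   n_exposed = size of the exposed group,
    c = events among the unexposed, n_unexposed = size of the unexposed group.
    z = 1.96 corresponds to a 95% interval (an assumption of this sketch).
    """
    rr = (a / n_exposed) / (c / n_unexposed)
    # Standard error of ln(RR) for two independent binomial proportions
    se_log_rr = math.sqrt(1 / a - 1 / n_exposed + 1 / c - 1 / n_unexposed)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Ever-smokers: 563 MIs among 5899 men; never-smokers: 87 MIs among 1819 men
rr, lower, upper = relative_risk_ci(563, 5899, 87, 1819)
print(f"RR = {rr:.2f}, 95% CI = {lower:.2f} to {upper:.2f}")
```

Because the whole interval lies above 1, the data are hard to reconcile with "no association", which is the point of the slide's remark that the interval does not include 1.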

The Odds ratio (relative odds)


In order to calculate a relative risk, we must have values for
the incidence in the exposed and the non-exposed, as can be
obtained in a cohort study.

In a case-control study, however, we do not know the
incidence in the exposed population or the incidence in the
non-exposed population, because we start with diseased people
(cases) and non-diseased people (controls).

Hence, we cannot estimate the RR directly in a case-control study,
and we use another measure of association
called the odds ratio.
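
As a sketch of how this measure works in a case-control study, the odds of exposure among the cases can be divided by the odds of exposure among the controls. The cell labels follow the case-control structure slide (cases: exposed a, unexposed b; controls: exposed c, unexposed d). The counts are the male 1-2 cups/day and 0 cups/day columns of the coffee and pancreatic cancer table earlier in these slides; the function name is illustrative.

```python
def exposure_odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Odds ratio for a case-control 2x2 table.

    a = exposed cases,    b = unexposed cases,
    c = exposed controls, d = unexposed controls.
    """
    odds_in_cases = a / b
    odds_in_controls = c / d
    return odds_in_cases / odds_in_controls  # equals (a * d) / (b * c)


# Men drinking 1-2 cups/day (exposed) versus 0 cups/day (unexposed)
crude_or = exposure_odds_ratio(a=94, b=9, c=119, d=32)
print(f"Crude odds ratio: {crude_or:.2f}")
```

This crude value is not identical to the 2.6 shown in the table, presumably because the tabulated figure is adjusted for other factors while this sketch is not.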

Defining the Odds ratio in Cohort and in case-control studies.
Suppose we are betting on a horse named Little Beauty, which has a
60% probability of winning the race (P). Little Beauty therefore has a
40% probability of losing (1-P). What are the odds that the horse
will win the race?
The odds are defined as the ratio of the number of ways the event can
occur to the number of ways the event cannot occur.

Odds = (Probability that Little Beauty will win the race) / (Probability that Little Beauty will lose the race)

Odds = P/(1-P) = 60%/40% = 1.5:1 = 1.5
The probability of winning is 60%, while the odds of winning are 1.5 to 1.
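
The conversion between probability and odds in the Little Beauty example can be written as two small helper functions. The names are illustrative.

```python
def odds_from_probability(p: float) -> float:
    """Odds in favour of an event that has probability p (0 <= p < 1)."""
    return p / (1 - p)


def probability_from_odds(odds: float) -> float:
    """Inverse conversion: probability of an event with the given odds."""
    return odds / (1 + odds)


win_odds = odds_from_probability(0.60)
print(f"Odds of winning: {win_odds:.1f} to 1")
print(f"Back to probability: {probability_from_odds(win_odds):.2f}")
```

A probability is bounded by 1, whereas odds can grow without limit as the probability approaches 1, so the two scales should not be read interchangeably.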