2. Study design
Research question
Target population
Study design
Sampling frame
Data collection tools
Data collection methods
Generalisability of findings
Sampling selection
3. Sampling – what is it?
Selection of a smaller number of units from a larger group
In research, the aim is to enable generalisation to a target population
4. Why do we sample?
1. Not possible to study ALL people in a population
2. Feasible and realistic financially to study a smaller subset of a population
3. Unethical if sample is larger than necessary (overpowered)
Aim to provide an accurate representation of the target population
Allows for generalisation from sample to broader population
Need to minimise sampling error and bias
6. How big should a sample be?
Sample size calculations to determine required size
Based on variables to be measured - expected difference, expected response rates, cluster effect, attrition etc.
Small sample size
Less likely that sample is representative of target population
Limited POWER to detect ‘effect’
Larger sample size
More likely that sample is representative of target population
Increased POWER to detect ‘effect’
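As a rough illustration of such a calculation, the sketch below uses the standard normal-approximation formula for comparing two proportions, then inflates the result for a cluster (design) effect and for expected attrition. The function name, parameter names and example values are hypothetical, and a real study would use the formula that matches its own design and outcome measure.

```python
import math
from statistics import NormalDist


def sample_size_two_proportions(p1, p2, alpha=0.05, power=0.80,
                                cluster_size=1, icc=0.0, attrition=0.0):
    """Approximate participants needed per group to detect p1 vs p2."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance level
    z_beta = NormalDist().inv_cdf(power)           # desired power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    n = (z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2
    design_effect = 1 + (cluster_size - 1) * icc   # inflation for clustering
    n = n * design_effect / (1 - attrition)        # allow for drop-outs
    return math.ceil(n)


# Hypothetical example: expected difference of 30% vs 20%
n_basic = sample_size_two_proportions(0.30, 0.20)
n_adjusted = sample_size_two_proportions(0.30, 0.20, cluster_size=20,
                                         icc=0.02, attrition=0.10)
print(n_basic, n_adjusted)
```

A smaller expected difference, a stronger cluster effect or higher attrition each push the required size upwards.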
8. 1. Simple Random Sampling
Subset of individuals chosen from a list of individuals from the broader population (sampling frame)
Each individual chosen at random
All subjects have equal chance of being selected
Most likely to achieve sample representative of population (least selection bias)
May be difficult to achieve in practice
Not ideal for special interest groups/population minorities
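A minimal sketch of a simple random sample, assuming the sampling frame is available as a Python list. The frame contents, the sample size of 50 and the seed are made up for illustration.

```python
import random

# Hypothetical sampling frame: a list of 1,000 individual IDs
sampling_frame = [f"person_{i:04d}" for i in range(1000)]

rng = random.Random(42)                     # fixed seed so the draw can be repeated
sample = rng.sample(sampling_frame, k=50)   # equal chance for everyone, no repeats

print(len(sample), len(set(sample)))
```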
10. 2. Systematic sampling
Units sampled at regular intervals
Width of intervals randomly determined
Inadequate sampling of rare individuals who may be of interest
Chance that random dispersion is “unlucky” and inadequate
Researcher must ensure sampling does not hide a pattern
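The sketch below shows one common textbook variant of systematic sampling: the interval is derived from the frame size and the desired sample size, and only the starting point is drawn at random. This is an assumption, since the slide does not spell out the procedure. The function name and the frame are hypothetical, and the sketch assumes the requested sample is no larger than the frame.

```python
import random


def systematic_sample(frame, n, rng):
    """Take every k-th unit after a random start within the first interval."""
    k = len(frame) // n          # sampling interval
    start = rng.randrange(k)     # random starting point in [0, k)
    return frame[start::k][:n]


frame = list(range(1, 1001))     # hypothetical ordered list of 1,000 units
sample = systematic_sample(frame, 100, random.Random(1))
print(sample[:5])
```

If the list has a periodic pattern whose period matches the interval, every selected unit falls at the same point of the cycle, which is the risk the researcher has to check for.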
13. 3. Stratified sampling
Population divided into subgroups prior to sampling
To ensure adequate numbers of subjects from subgroups are included
e.g. male and female subgroups
Then simple random sample the individuals among male group and then female group
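The steps above can be sketched as follows, assuming each record in the sampling frame carries the subgroup label. The helper name, the 600/400 split and the 25 per stratum are hypothetical.

```python
import random
from collections import defaultdict


def stratified_sample(frame, stratum_of, n_per_stratum, rng):
    """Simple random sample of n_per_stratum units from each subgroup."""
    strata = defaultdict(list)
    for unit in frame:
        strata[stratum_of(unit)].append(unit)   # divide the frame into subgroups first
    return {name: rng.sample(units, n_per_stratum)
            for name, units in strata.items()}


# Hypothetical frame of (id, sex) records: 600 male, 400 female
frame = [(i, "male" if i < 600 else "female") for i in range(1000)]
sample = stratified_sample(frame, lambda unit: unit[1], 25, random.Random(7))
print({name: len(units) for name, units in sample.items()})
```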
14. Target population – Brisbane households
Sampling frame – electoral roll
Sampling frame – electoral roll MALES
Sampling frame – electoral roll FEMALES
SAMPLE
15. 4. Cluster sampling
Total population broken down into ‘groups’ or ‘clusters’
Number of clusters then randomly selected from all eligible clusters
All individuals in each selected cluster become potential subjects.
16. 4. Cluster sampling
One-stage cluster sampling
Clusters are selected randomly
All individuals within clusters are invited to participate in the study
Two-stage cluster sampling
Clusters are selected randomly
Lists of all elements within clusters are obtained
Random samples drawn from lists
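A small sketch contrasting the two designs, assuming the clusters are schools held in a dictionary. The number of schools, the cluster sizes and the sample sizes are hypothetical.

```python
import random

# Hypothetical clusters: 10 schools with 30 students each
clusters = {f"school_{s}": [f"s{s}_student_{i}" for i in range(30)]
            for s in range(10)}
rng = random.Random(3)

# Stage 1 (both designs): randomly select clusters
chosen = rng.sample(sorted(clusters), k=3)

# One-stage: everyone in the chosen clusters is invited
one_stage = [student for c in chosen for student in clusters[c]]

# Two-stage: draw a random sample from the list of each chosen cluster
two_stage = [student for c in chosen
             for student in rng.sample(clusters[c], k=10)]

print(chosen, len(one_stage), len(two_stage))
```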
17. Cluster sampling - example
All Schools in Brisbane
Stage 1: simple random sampling of schools
School A – all students
School B – all students
Stage 2: random sample of students from School A; random sample of students from School B
18. 5. Multi-stage sampling
Complex form of cluster sampling
Population divided into clusters and sub-clusters
Used when selecting from very large population
19. Nationwide retail chain
Random selection of regions: Region 1, Region 2
Random selection of stores within each region: Store 1, Store 2
Stratified sampling within each store: Male, Female
Random selection: 20 from each Male and Female stratum in each store
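The retail chain example can be sketched as nested random draws. The data layout is an assumption, and the numbers of regions, stores and customers are invented. Only the two regions, two stores per region and 20 per stratum follow the slide.

```python
import random

rng = random.Random(11)

# Hypothetical chain: 5 regions, 6 stores each, 100 male and 100 female customers per store
chain = {
    f"region_{r}": {
        f"store_{s}": {sex: [f"r{r}_s{s}_{sex}_{i}" for i in range(100)]
                       for sex in ("male", "female")}
        for s in range(6)
    }
    for r in range(5)
}

sample = []
for region in rng.sample(sorted(chain), k=2):                 # stage 1: regions
    for store in rng.sample(sorted(chain[region]), k=2):      # stage 2: stores
        for sex, customers in chain[region][store].items():   # strata within each store
            sample.extend(rng.sample(customers, k=20))        # 20 per stratum

print(len(sample))   # 2 regions x 2 stores x 2 strata x 20 = 160
```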
20. Non-probability sampling
Sampling techniques that do not rely on random selection
When a sampling frame cannot be identified (e.g. visitors to a particular internet site)
When sampling populations are difficult to access (e.g. drug users, street-based sex workers)
When very strict inclusion and exclusion criteria are necessary (e.g. in pharmaceutical drug testing)
21. 1. Convenience sampling
Units ‘selected’ based on ease of access
Volunteers
Shoppers in a supermarket
Respondents to advertisements
Clinic attendees
The sample usually differs from the target population
Cannot generalise results to the general population
22. 2. Quota sample
Population divided into defined subgroups
e.g. males; females
Proportions of subgroups in population identified
Convenience sample of each subgroup to make up required numbers
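A minimal sketch of filling quotas from a convenience stream, assuming units arrive one at a time and are accepted until their subgroup's quota is full. The function name, the 50/50 proportions and the stream of shoppers are hypothetical.

```python
def quota_sample(arrivals, group_of, proportions, total):
    """Fill each subgroup's quota with the first units that turn up."""
    quotas = {g: round(total * p) for g, p in proportions.items()}
    sample = []
    for unit in arrivals:                 # convenience order, not random
        g = group_of(unit)
        if quotas.get(g, 0) > 0:
            sample.append(unit)
            quotas[g] -= 1
        if not any(quotas.values()):      # all quotas filled
            break
    return sample


# Hypothetical stream of shoppers, alternating female and male
arrivals = [(i, "female" if i % 2 == 0 else "male") for i in range(200)]
sample = quota_sample(arrivals, lambda u: u[1], {"male": 0.5, "female": 0.5}, 40)
print(len(sample))
```

The subgroup proportions match the population, but within each subgroup the units are still a convenience sample.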
23. 3. Purposive sample
Deliberate selection of individuals by researchers based on predefined criteria (inclusion and exclusion criteria)
Often used in pharmaceutical drug testing
Also called judgmental sampling
24. 4. Snowball sampling
Involves asking subjects to provide names of others who may meet study criteria
Useful for sampling populations difficult to access:
drug users
street-based sex workers
underground networks
Also called networking
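Snowball recruitment can be pictured as following referrals wave by wave from a few initial subjects. The sketch below assumes the referrals are already known as a dictionary, which in a real study would only emerge as interviews proceed. All names are hypothetical.

```python
from collections import deque


def snowball_sample(seeds, referrals, max_size):
    """Recruit seeds, then the people they name, wave by wave."""
    recruited = []
    seen = set(seeds)
    queue = deque(seeds)
    while queue and len(recruited) < max_size:
        person = queue.popleft()
        recruited.append(person)
        for named in referrals.get(person, []):   # names provided by this subject
            if named not in seen:
                seen.add(named)
                queue.append(named)
    return recruited


# Hypothetical referral network
referrals = {"A": ["B", "C"], "B": ["D"], "C": ["D", "E"], "E": ["A"]}
print(snowball_sample(["A"], referrals, max_size=4))   # ['A', 'B', 'C', 'D']
```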
26. Measurement issues
Error - validity
When an estimate (e.g. incidence, prevalence, mortality) or association (RR, OR) deviates from the ‘true’ situation in nature
May be introduced at any point during the study:
Study design (quality)
Sampling
Measurement
Analysis
Types of error: random error, systematic bias
27. Random error
Fluctuations around a true value
Related to poor precision
Sources
individual biological variation (always present)
sampling variation
measurement variation (protocols and training)
Reduced by:
larger sample sizes
standard protocols and equipment
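To illustrate why larger samples reduce random error, the simulation below repeatedly draws samples of two sizes from the same population and compares how much the sample means fluctuate. The true mean, the standard deviation and the sample sizes are hypothetical values chosen for the sketch.

```python
import random
import statistics

rng = random.Random(0)
TRUE_MEAN, SD = 120.0, 15.0    # hypothetical true value and individual variation


def spread_of_sample_means(n, repeats=500):
    """Standard deviation of the sample mean across repeated samples of size n."""
    means = [statistics.fmean(rng.gauss(TRUE_MEAN, SD) for _ in range(n))
             for _ in range(repeats)]
    return statistics.stdev(means)


small = spread_of_sample_means(25)
large = spread_of_sample_means(400)
print(round(small, 2), round(large, 2))   # roughly SD / sqrt(n) in each case
```

The means still centre on the true value in both cases. Only the size of the fluctuation changes, which is what distinguishes random error from systematic bias.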
28. Systematic bias
Any systematic error in the design, conduct or analysis of a study that results in a mistaken estimate of an exposure’s effect on the risk of a disease
Due to causes other than random error
Problem of validity
internal and/or external validity
29. I. Selection bias
Arises when different criteria are used so the study population does not represent the population of interest
For example:
1. Referral Bias (Berkson’s Bias)
2. Surveillance Bias
3. Prevalence-Incidence Bias (Neyman’s Bias)
4. Response Bias
Attrition Bias
Participation Bias
30. Types of bias
Referral bias
Occurs in case-control studies conducted in hospitals
Causes a spurious association between the exposure and the disease, because of the different probabilities of admission to a hospital for those with/without a disease (or with/without the exposure)
Surveillance bias
For example:
When conducting a case-control study to examine the relationship between oral contraceptive (OC) use and diabetes
Women taking OCs are likely to have more doctor visits, so diabetes is more likely to be diagnosed in OC users than in non-OC users
31. 3. Prevalence-incidence bias
Also known as Neyman’s bias
Usually occurs when prevalent cases are used to investigate a disease-exposure association
Prevalent cases represent survivors, who may be atypical with respect to exposure status
Once a person is diagnosed with the disease, they may change their exposure
32. Types of bias
Participation bias
People who participate in research studies are often different from those who do not take part
Differences in demographic, socioeconomic, cultural, lifestyle, and medical characteristics
Self-selection bias (individual consent is essential in research, except for publicly available information)
Attrition bias
Occurs when study participants withdraw before the study is completed; it is often differential
33. II. Information bias
Arises when inaccurate measurement or misclassification of study variables occurs
Can affect exposure or outcome (or even confounders)
Extent of bias depends on the particular variable
Whether non-differential or differential misclassification
34. Non-differential info-bias
Error in measurement does not vary according to other variables (cases vs controls; exposed vs unexposed)
Usually leads to an underestimate of the true association
Any association that is observed is likely to be true
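The underestimate can be seen with a small worked calculation. The sketch below applies the same imperfect exposure measurement to cases and controls in a made-up case-control table and compares the odds ratios. The counts, sensitivity and specificity are hypothetical, and the result shows expected counts for a two-category exposure rather than a general proof.

```python
def odds_ratio(a, b, c, d):
    """a, b = exposed/unexposed cases; c, d = exposed/unexposed controls."""
    return (a * d) / (b * c)


def misclassify(exposed, unexposed, sensitivity, specificity):
    """Expected counts after imperfect exposure measurement."""
    seen_exposed = sensitivity * exposed + (1 - specificity) * unexposed
    return seen_exposed, exposed + unexposed - seen_exposed


# Hypothetical true counts (exposed, unexposed)
cases, controls = (60, 40), (40, 60)
true_or = odds_ratio(*cases, *controls)

# Same sensitivity and specificity in cases and controls = non-differential
obs_cases = misclassify(*cases, sensitivity=0.8, specificity=0.9)
obs_controls = misclassify(*controls, sensitivity=0.8, specificity=0.9)
observed_or = odds_ratio(*obs_cases, *obs_controls)

print(round(true_or, 2), round(observed_or, 2))   # 2.25 1.77
```

The observed odds ratio sits between 1 and the true value, so the association is diluted rather than created.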
36. Types of information bias
1. Recall Bias
Cases and controls recall their exposures differently
It is human nature to look for reasons when something goes wrong
“If you seek, you will find.”
2. Detection Bias
The exposed group is monitored more closely
3. Interviewer/observer Bias
Not blinded
Not properly trained
37. Types of information bias
4. Reporting Bias
“Objectively”
Cases tend to have better information
Individuals who are part of a study may behave differently (Hawthorne effect)
“Subjectively”
Reluctant to report: attitudes, beliefs, perceptions
Wish bias: subjects attempt to answer the question “why me?” and to show that the disease is not their own fault (lifestyle) but the fault of others (work-related exposure)
38. III. Confounding - definition
An association between a given exposure and outcome is influenced by a third variable, the confounding factor.
To be a confounder, a variable must:
1. Be a risk factor for the disease
2. Be associated with the exposure
3. Not be a result of the exposure
Not be an intermediate between the exposure and the outcome (i.e. must not lie on the causal pathway)
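A made-up numerical example may help. In the table below the third variable is more common among the exposed and raises the risk of disease, so the crude odds ratio suggests an association even though there is none within either stratum. All counts are hypothetical. The Mantel-Haenszel pooled odds ratio is used here as one common way to adjust and is not taken from the slides.

```python
def odds_ratio(a, b, c, d):
    """a, b = exposed/unexposed cases; c, d = exposed/unexposed controls."""
    return (a * d) / (b * c)


# Hypothetical counts: (exposed cases, unexposed cases, exposed controls, unexposed controls)
strata = {
    "confounder present": (40, 10, 40, 10),
    "confounder absent": (2, 10, 8, 40),
}

# Crude table: add the strata together, ignoring the confounder
crude = odds_ratio(*(sum(col) for col in zip(*strata.values())))
per_stratum = {name: odds_ratio(*t) for name, t in strata.items()}

# Mantel-Haenszel odds ratio pooled over strata
num = sum(a * d / (a + b + c + d) for a, b, c, d in strata.values())
den = sum(b * c / (a + b + c + d) for a, b, c, d in strata.values())
adjusted = num / den

print(round(crude, 2), per_stratum, round(adjusted, 2))
```

The crude odds ratio is about 2.19, while each stratum and the pooled estimate give 1.0.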
39. Validity
Do the study conclusions reflect the true value/relationship?
External validity (generalisability): can the findings be generalised to other similar samples or the population-at-large?
Internal validity: are the results correct for the particular group you have studied?
40. Reliability
Accuracy -- how close to the true population value is your measurement value?
Assess accuracy by comparing to “gold standard”
Precision -- if you repeat your measurement/sample selection/analysis on numerous occasions, will you get consistent results?
Assess precision by inter-observer and intra-observer comparisons
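The two ideas can be separated numerically. In the sketch below the mean offset from a gold standard stands in for accuracy and the spread of repeated readings stands in for precision. The devices, the readings and the reference value are invented for illustration.

```python
import statistics

GOLD_STANDARD = 100.0   # hypothetical reference value

# Hypothetical repeated measurements from two devices
device_a = [99.0, 101.5, 98.5, 100.5, 100.5]      # scattered around the truth
device_b = [104.9, 105.1, 105.0, 104.8, 105.2]    # tightly grouped but offset

for name, readings in (("A", device_a), ("B", device_b)):
    bias = statistics.fmean(readings) - GOLD_STANDARD   # accuracy: distance from gold standard
    spread = statistics.stdev(readings)                 # precision: consistency of repeats
    print(name, round(bias, 2), round(spread, 2))
```

Device A is accurate but less precise, and device B is precise but inaccurate.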