Experimental and Quasi-Experimental Designs
Chapter 5
Introduction
Experiments are best suited for explanation and evaluation research
Experiments involve:
Taking action
Observing the consequences of that action
Especially suited for hypothesis testing
Often occur in the field
The Classical Experiment
Classical experiment: a specific way of structuring research
Involves three major components:
Independent variable and dependent variable
Pretesting and posttesting
Experimental group and control group
Independent and Dependent Variables
The independent variable takes the form of a dichotomous stimulus that is either present or absent
It varies (i.e., is independent) in our experimental process
The dependent variable is the outcome, the effect we expect to see
Might be physical conditions, social behavior, attitudes, feelings, or beliefs
Pretesting and Posttesting
Subjects are initially measured in terms of the DV prior to association with the IV (pretested)
Then, they are exposed to the IV
Then, they are remeasured in terms of the DV (posttested)
Differences noted between the measurements on the DV are attributed to influence of IV
Experimental and Control Groups
Experimental group: exposed to whatever treatment, policy, initiative we are testing
Control group: very similar to experimental group, except that they are NOT exposed
Can involve more than one experimental or control group
If we see a difference, we want to make sure it is due to the IV, and not to a difference between the two groups
Placebo
We often don’t want people to know if they are receiving treatment or not
We expose our control group to a “dummy” independent variable just so we are treating everyone the same
Medical research: participants don’t know what they are taking
Ensures that changes in DV actually result from IV and are not psychologically based
Double-Blind Experiment
Experimenters may be more likely to “observe” improvements among those who received drug
In a double-blind experiment, neither the subjects nor the experimenters know which is the experimental group and which is the control group
Selecting Subjects
First, must decide on target population – the group to which the results of your experiment will apply
Second, must decide how to select particular members from that group for your experiment
Cardinal rule – ensure that experimental and control groups are as similar as possible
Randomization
Randomization: produces an experimental and control group that are statistically equivalent
Essential feature of experiments
Eliminates systematic bias
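As a rough illustration of random assignment, the Python sketch below shuffles a list of subject IDs and splits it in half. The function name, the subject IDs, and the fixed seed are illustrative assumptions, not part of the chapter.

```python
import random

def randomly_assign(subject_ids, seed=None):
    """Split subjects into experimental and control groups by chance alone."""
    rng = random.Random(seed)      # a seed makes the assignment reproducible
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)          # chance, not the researcher, decides the order
    half = len(shuffled) // 2
    return {"experimental": shuffled[:half], "control": shuffled[half:]}

# Hypothetical pool of 20 eligible subjects, numbered 1 through 20.
groups = randomly_assign(range(1, 21), seed=42)
print(groups["experimental"])
print(groups["control"])
```

Because chance alone decides group membership, any remaining differences between the groups are unsystematic.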
Experiments and Causal Inference
Experimental design ensures:
Cause precedes effect via taking posttest
Empirical correlation exists via comparing pretest to posttest
No spurious 3rd variable influencing correlation via posttest comparison between experimental and control groups, and via randomization
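The comparison logic can be sketched in a few lines of Python. All scores below are made-up numbers for illustration, and the helper name is hypothetical. The sketch compares the pretest-to-posttest change in the experimental group with the change in the control group.

```python
from statistics import mean

def mean_change(pretest, posttest):
    """Average posttest-minus-pretest change on the DV for one group."""
    return mean(post - pre for pre, post in zip(pretest, posttest))

# Hypothetical DV scores (for example, number of offenses) for four subjects per group.
experimental_pre = [10, 12, 11, 13]
experimental_post = [7, 9, 8, 10]
control_pre = [10, 12, 11, 13]
control_post = [9, 11, 10, 12]

experimental_change = mean_change(experimental_pre, experimental_post)
control_change = mean_change(control_pre, control_post)

# Change in the experimental group beyond what the control group also experienced.
estimated_effect = experimental_change - control_change
print(experimental_change, control_change, estimated_effect)
```

The control group's change stands in for everything other than the IV that happened between the two measurements.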
Example of Research Using an Experimental Design
Researchers at the University of Maryland conducted an
evaluation of the Baltimore Drug Court using an experimental
design. For their research, eligible offenders were randomly
assigned to either the drug court or to “treatment as usual”.
The results of the randomization process were given to the
judges as a recommendation. In most cases, the judges, who
agreed to participate in the study beforehand, sentenced
offenders in accordance with the randomization. The results
showed that participants in the drug court were less likely to
recidivate than those in the control group.
For more information see Gottfredson, D.C., Najaka, S.S. &
Kearley, B. (2003). Effectiveness of drug treatment courts:
Evidence from a randomized trial. Criminology & Public Policy, 2(2), 171-196.
Internal Validity Threats
Internal validity threats: the possibility that conclusions
drawn from experimental results may not accurately reflect what
went on in the experiment itself
History: external events may occur during the course of the
experiment and affect the outcome
Maturation: people are constantly growing and changing
Testing: the process of testing and retesting can itself influence responses
Instrumentation: changes in the measurement process
Internal Validity Threats: Continued
Statistical regression: extreme scores tend to regress toward the mean
Selection bias: the way in which subjects are chosen
Experimental mortality: subjects may drop out prior to
completion of experiment
Ambiguous causal time order: the dependent variable may actually
have caused the change in the stimulus
Generalizability and Threats to Validity
Researchers also face problems with generalizing results from
experiments
Generalizability: do the results of an experiment
really tell us what would happen in the real world?
Construct Validity Threats
Construct validity: the correspondence between the empirical
test of a hypothesis and the underlying causal process that the
experiment represents
Link construct and measures to theory
Clearly indicate what constructs are represented by what
measures
Decide how much treatment is required to produce change in
DV
External Validity Threats
External validity: whether the results from experiments in one
setting will be obtained in other settings
Significant for experiments conducted under carefully
controlled conditions rather than more natural conditions
But carefully controlled conditions reduce internal validity threats!
A conundrum!
Internal validity must be established before external validity is
an issue
Statistical Conclusion Validity Threats
Statistical conclusion validity: whether we are able to determine
if two variables are related
Becomes an issue when findings are based on small samples
More cases allow you to reliably detect small differences; fewer
cases result in detection of only large differences
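A minimal Python sketch of why sample size matters is shown below. It uses the standard error of a difference between two independent group means, assuming equal group sizes and a common standard deviation. The function name and the numbers are illustrative assumptions, and the calculation ignores statistical power, so it is only a rough guide rather than a formal power analysis.

```python
import math

def approx_detectable_difference(sd, n_per_group, z=1.96):
    """Rough size of a mean difference distinguishable from zero.

    standard error = sd * sqrt(2 / n) for two equal-sized independent groups;
    multiplying by z = 1.96 corresponds to a two-sided 5% significance level.
    """
    standard_error = sd * math.sqrt(2 / n_per_group)
    return z * standard_error

# Hypothetical DV with a standard deviation of 10 points.
small_sample = approx_detectable_difference(sd=10, n_per_group=25)
large_sample = approx_detectable_difference(sd=10, n_per_group=400)
print(round(small_sample, 2), round(large_sample, 2))
```

With 25 subjects per group, only a difference of several points stands out from chance; with 400 per group, a much smaller difference can be detected.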
Variations in the Classical Experimental Design
Post-test Only design
No pretest measure is used
Used when pretest might bias results
Factorial Design
Two experimental groups are used
Used to determine necessary amount of treatment
Quasi-Experimental Designs
When randomization is not possible
quasi = “to a certain degree”
Quasi-Experiment: an experiment to a certain degree
Do not have as stringent control over internal validity
threats as true experiments
Two categories: non-equivalent-groups designs and time series
designs
Nonequivalent-Groups Designs
When we cannot randomize, we cannot assume equivalency;
hence the name
We take steps to make groups as comparable as possible
Match subjects in E and C groups using important variables
likely related to DV under study
Aggregate matching – comparable average characteristics
Cohort Designs
Cohort – group of subjects who enter or leave an institution at
the same time
Necessary to ensure that two cohorts being examined against
one another are actually comparable
Time-Series Designs
Examine a series of observations over time
Interrupted – observations compared before and after some
intervention
Instrumentation threat to internal validity is likely because
changes in measurements may occur over a long period of time
Often use measures produced by CJ organizations
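A simple interrupted time-series comparison can be sketched in Python as follows. The monthly counts and the intervention point are hypothetical, and the helper name is an assumption made for illustration.

```python
from statistics import mean

def before_after_means(series, intervention_index):
    """Average of the observations before and after the intervention point."""
    before = series[:intervention_index]
    after = series[intervention_index:]
    return mean(before), mean(after)

# Hypothetical monthly counts of reported incidents; the intervention
# begins with the fifth observation (index 4).
monthly_counts = [50, 52, 48, 50, 40, 38, 42, 40]
before_mean, after_mean = before_after_means(monthly_counts, 4)
print(before_mean, after_mean)
```

A drop in the average does not by itself rule out history or instrumentation threats, which is why a comparison series is often added.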
Variations in Time-Series Designs
Interrupted Time Series Design with a Non-Equivalent Comparison Group
Time-Series Design with Switching Replications
Variable-Oriented Research, Case Studies and Scientific
Realism
Case-oriented research: many cases are examined to
understand a small number of variables
Variable-oriented research: a large number of variables are
studied for a small number of cases
Case studies: researcher centers on an in-depth examination of
one or a few cases on many dimensions
In-depth examinations of a few cases
Concepts, Operationalization, and Measurement
Chapter 4
Introduction
We want to move from vague ideas of what we want to study to
actually being able to recognize and measure it in the real world
Otherwise, we will be unable to communicate the relevance of
our idea and findings to an audience
Conceptions and Concepts
Conception: mental image we have about something
Concepts: words, phrases, or symbols in language that are used
to represent these mental images in communication
e.g., serious crime
Example of Concept
According to Gottfredson and Hirschi's General Theory of
Crime, low self-control is the primary cause of crime.
Because self-control is a concept, how to conceptualize and
measure it has been debated extensively among academics.
Furthermore, because self-control cannot be measured directly and
researchers instead measure its symptoms, some academics argue
that the General Theory of Crime is a tautology and is,
therefore, not testable.
Conceptualization
Conceptualization: mental process of making
concepts more precise to specify what we mean
Results in a set of indicators and dimensions of what we have in
mind
Indicates a presence or absence of the concept we are studying
Serious crime = offender uses force (or threatens to use force)
against a victim
Indicators and Dimensions
Dimension – specifiable aspect of a concept
“Crime Seriousness” – can be subdivided into dimensions
e.g., dimension – victim harm
Indicators – physical injury, economic loss, psychological
consequences
Specification leads to deeper understanding
Creating Conceptualization Order
Conceptual definition: working definition specifically assigned
to a term, provides focus to our observations
Gives us a specific working definition so that readers will
understand the concept
E.g., Which dimensions of SES will be included?
Operational definition: spells out precisely how the concept will
be measured
E.g., How will we measure SES?
Progression of Measurement Steps
Conceptualization
Conceptual Definition
Operational Definition
Measurements in the Real World
Operationalization Choices
Operationalization – the process of developing operational
definitions
Moves us closer to measurement
Requires us to determine what might work as a data-collection
method
Measurement as “Scoring”
Measurement – assigning numbers or labels to units of analysis
in order to represent the conceptual properties
Make observations, and assign scores to them
Different measurement can produce different results
E.g., Time frame in which recidivism is measured might
produce different results
Exhaustive and Exclusive Measurement
Every variable should have two important qualities:
Exhaustive – you should be able to classify every observation in
terms of one of the attributes composing the variable
Mutually exclusive – you must be able to classify every
observation in terms of one and only one attribute
Example – Measure for Marijuana Use
Not exclusive or exhaustive
How many times in the last year have you smoked marijuana?
0
1-3
3-6
6-9
Reworded to be exclusive and exhaustive
How many times in the last year have you smoked marijuana?
0
1-2
3-6
7-9
10 or more times
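The two versions of the marijuana question can be checked mechanically. The Python sketch below is only an illustration: the function name and the tuple encoding of each response category are assumptions. It reports which answers fall into more than one category (not exclusive) and which fall into none (not exhaustive).

```python
def check_categories(bins, max_value):
    """Return (overlaps, gaps) for integer answers from 0 to max_value.

    Each bin is an inclusive (low, high) pair; high=None means "or more".
    """
    overlaps, gaps = [], []
    for value in range(max_value + 1):
        matches = [
            (low, high) for low, high in bins
            if low <= value and (high is None or value <= high)
        ]
        if len(matches) > 1:
            overlaps.append(value)   # fits several attributes: not exclusive
        elif not matches:
            gaps.append(value)       # fits no attribute: not exhaustive
    return overlaps, gaps

original = [(0, 0), (1, 3), (3, 6), (6, 9)]
reworded = [(0, 0), (1, 2), (3, 6), (7, 9), (10, None)]

print(check_categories(original, 12))
print(check_categories(reworded, 12))
```

In the original wording, someone who smoked 3 or 6 times fits two categories and someone who smoked 10 or more times fits none; the reworded version has neither problem.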
Levels of Measurement
Nominal: offer names or labels for characteristics (e.g., race,
gender, state of residence)
Ordinal: attributes can be logically rank-ordered (e.g.,
education, opinions, occupational status)
Interval: meaningful distance between attributes (e.g.,
temperature, IQ score from an intelligence test)
Ratio: has a true zero point (e.g., age, number of priors,
sentence length, income)
Implications of Levels of Measurement
Different analytical techniques require certain levels of
measurement
Higher levels can be converted to lower levels
Lower levels cannot be converted to higher levels
Therefore, seek the highest level of measurement possible
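The one-way nature of conversion can be shown with a short Python sketch. The cut points and category labels below are hypothetical choices made for illustration. A ratio-level count of prior offenses can always be collapsed into ordinal categories, but the original counts cannot be recovered from the categories.

```python
def priors_to_ordinal(number_of_priors):
    """Collapse a ratio-level count into an ordinal category."""
    if number_of_priors == 0:
        return "none"
    if number_of_priors <= 2:
        return "few"
    return "many"

# Hypothetical ratio-level data: number of prior offenses per person.
priors = [0, 1, 3, 10]
categories = [priors_to_ordinal(n) for n in priors]
print(categories)

# 3 priors and 10 priors both become "many", so the exact counts are lost.
```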
Criteria for Measurement Quality
Measurements can be made with varying degrees of precision
The more precise, the better
Should not sacrifice accuracy
Reliability
Reliability: whether a particular measurement technique,
repeatedly applied to the same object, would yield the same
result each time
Problem – even if the same result is retrieved, it may be
incorrect every time
Reliability does not ensure accuracy
Observer’s subjectivity might come into play
Methods of Dealing with Reliability Issues
Test-retest method – make the same measurement more than
once – should expect same response both times
Interrater reliability – compare measurements from different
raters; verify initial measurements
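One way to quantify interrater reliability is sketched below in Python. The ratings are made-up data. Cohen's kappa is a commonly used agreement statistic that is not mentioned in the chapter; it is included here as an assumption about how agreement might be summarized once chance agreement is taken into account.

```python
from collections import Counter

def percent_agreement(rater_a, rater_b):
    """Share of cases on which the two raters gave the same code."""
    return sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)

def cohens_kappa(rater_a, rater_b):
    """Agreement corrected for the agreement expected by chance."""
    n = len(rater_a)
    observed = percent_agreement(rater_a, rater_b)
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    categories = set(rater_a) | set(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    if expected == 1:
        return 1.0  # both raters used a single identical category throughout
    return (observed - expected) / (1 - expected)

# Hypothetical codes assigned by two raters to the same ten incidents.
rater_a = ["serious", "serious", "minor", "minor", "serious",
           "minor", "minor", "serious", "minor", "minor"]
rater_b = ["serious", "serious", "minor", "minor", "minor",
           "minor", "minor", "serious", "minor", "serious"]

agreement = percent_agreement(rater_a, rater_b)
kappa = cohens_kappa(rater_a, rater_b)
print(agreement, round(kappa, 3))
```

The same functions could be applied to a test-retest comparison by treating the first and second measurement as the two "raters".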
Validity
The extent to which an empirical measure adequately reflects
the meaning of the concept under consideration
Are you really measuring what you say you are measuring?
Demonstrating validity is more difficult than demonstrating
reliability
Methods of Dealing with Validity Issues
Face validity: on its face, does it seem valid? Does it jibe with
our common agreements and mental images?
Criterion-related validity: compares a measure to some external
criterion
Construct validity: whether your variables relate to each other
in the logically expected direction
Multiple measures: compare measure with alternative measures
of the same concept
Measuring Crime
Crime can be a dependent variable in
exploratory, descriptive, explanatory, and applied studies
Crime can also be an independent variable, as in a study of how crime
affects fear and other attitudes
It can be both: the relationship
between drug use and other offenses
General Issues in Measuring Crime
How do you conceptualize crime?
What units of analysis?
Specific entities about which researchers collect information
Offender, victim, offenses, incidents
What purpose? e.g., monitoring, agency accountability, research
Measures Based on Crimes Known to Police
Most widely used measures of crime are based on police records
Certain types are detected almost exclusively by observation
(traffic and victimless offenses)
Most crimes reported by victim or witnesses
What crimes are not measured well by police records?
Uniform Crime Reports (UCR)
Originally, reporting voluntary, but now very common
Part I offenses (index crimes/offenses): murder, rape, robbery,
larceny, burglary, aggravated assault, motor vehicle theft and
arson (added in 1979)
Part II offenses: a compilation of less serious crimes
Summary-based, group level unit of analysis
Assumptions of UCR
Citizens know an offense has occurred
Citizen reports offense to the police
Officer can verify that the offense occurred
Officer decides the offense deserves to be reported
Agency’s numbers end up being forwarded to FBI on time
Positives of UCR
Can compare agencies
Quick, easy, and efficient
Index offenses are valid indicators of public’s crime concerns
Negatives of UCR
Doesn’t count ALL crimes reported to police
Jurisdictions vary in completeness of crime data they provide to
FBI; voluntary
Can suffer from clerical, data processing, political problems
Hierarchy rule – only most serious crime counted in an incident
Summary-based measure: UCR data include summary crime
counts from reporting agencies
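The effect of the hierarchy rule on summary counts can be illustrated with a short Python sketch. The incidents are invented, the function names are hypothetical, and the seriousness ranking is an assumption made for this example rather than an official specification.

```python
from collections import Counter

# Illustrative ranking from most to least serious (an assumption for this sketch).
SERIOUSNESS_ORDER = [
    "murder", "rape", "robbery", "aggravated assault",
    "burglary", "larceny", "motor vehicle theft",
]

def count_with_hierarchy_rule(incidents):
    """Count only the most serious offense in each incident."""
    counts = Counter()
    for offenses in incidents:
        most_serious = min(offenses, key=SERIOUSNESS_ORDER.index)
        counts[most_serious] += 1
    return counts

def count_all_offenses(incidents):
    """Count every offense in every incident."""
    counts = Counter()
    for offenses in incidents:
        counts.update(offenses)
    return counts

# Hypothetical incidents, each listing the offenses that occurred together.
incidents = [
    ["robbery", "aggravated assault"],
    ["burglary", "larceny"],
    ["murder", "robbery"],
    ["larceny"],
]

summary_counts = count_with_hierarchy_rule(incidents)
full_counts = count_all_offenses(incidents)
print(sum(summary_counts.values()), sum(full_counts.values()))
```

In this example the summary count records 4 offenses where 7 actually occurred, which is the kind of undercount the hierarchy rule produces.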
Incident-Based Police Records
Incident-based measures: the individual crime incident is the unit of
analysis
Supplementary Homicide Reports (SHR)
Police agencies submit detailed info about individual homicide
incidents
Can conduct a variety of studies that examine individual events
National Incident-Based Reporting System (NIBRS)
Joint effort by FBI and BJS to convert UCR to NIBRS
NIBRS reports each crime incident rather than the total number
of certain crimes for each LE agency
Many features are reported individually about each incident,
offenses, offenders, victims
UCR – 7 Part I offenses, NIBRS – 46 Group A offenses
Other Revisions with NIBRS
Hierarchy rule dropped
Victim type (individual, business, government, society/public)
Attempted/completed.
Computer-based submission
Drug-related offenses
Computers and crime
Quality control; states require certification
National Crime Victimization Survey (NCVS)
Victimization survey: asks people whether they have been the
victim of a crime
Conducted since 1972 by the Census Bureau
Sought to illuminate the “dark figure of crime”
Longitudinal panel study: households agree to participate for 3
years (7 interviews; one every 6 months) and are then replaced
Does not measure all crime
Respondents are asked screening questions
Positives of NCVS
Measures both reported and unreported crime
Independent of changes in reporting
More information about how crime impacted victim than UCR
Provides more victim characteristics than UCR
Negatives of NCVS
Telescoping incident dates
Faulty memory
Little information on offenders
No information on CJS response if reported
Excludes crimes against commercial establishments
Only includes residents of US
Surveys of Offending
Self-report surveys: ask people about crimes they may have
committed
Useful in measuring crimes that are poorly measured by other
techniques (prostitution, drug abuse, public order, delinquency)
Useful in measuring crimes rarely reported to police
(shoplifting, drunk driving)
Two ongoing self-report studies – NSDUH & MTF
National Survey on Drug Use and Health (NSDUH)
Based on a national sample of households
Conducted since 1971; 2004 sample had n=68,000
Includes questions to distinguish between lifetime use, current
use, and heavy use
Encourages candid responses via procedures
Includes residents of college dorms, rooming houses, and
homeless shelters
Monitoring the Future (MTF)
Conducted since 1975 by the National Institute on Drug Abuse
Includes several samples of high school students and others,
totaling about 50,000 respondents each year
Questions concern self-reported use of alcohol, tobacco, illegal
drugs, delinquency, other acts
A subset of 2,400 MTF respondents receive follow-up
questionnaire
Composite Measures
Allows us to combine individual measures to produce more
valid and reliable indicators
Typology: produced by the intersection of two or more variables
to create a set of categories or types
e.g., Typology of Delinquent/Criminal Acts (Time 1 and 2)
None, Minor (theft of items worth less than $5, vandalism, fare
evasion), Moderate (theft over $5, gang fighting, carrying
weapons), Serious (car theft, breaking and entering, forced sex,
selling drugs)
Nondelinquent, Starter, Desistor, Stable, Deescalator, Escalator
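A typology like this can be built by crossing the two variables. The Python sketch below is one plausible reading of the six types; the classification rules are an assumption, since the slide lists only the type names and not how each is defined.

```python
# Offense levels in increasing order of seriousness.
LEVELS = {"none": 0, "minor": 1, "moderate": 2, "serious": 3}

def classify(time_1, time_2):
    """Assign a type from offense levels at Time 1 and Time 2 (assumed rules)."""
    first, second = LEVELS[time_1], LEVELS[time_2]
    if first == 0 and second == 0:
        return "Nondelinquent"
    if first == 0:
        return "Starter"        # no offending at Time 1, some at Time 2
    if second == 0:
        return "Desistor"       # some offending at Time 1, none at Time 2
    if first == second:
        return "Stable"
    return "Escalator" if second > first else "Deescalator"

print(classify("none", "minor"))
print(classify("serious", "moderate"))
```

Crossing four levels at two time points gives sixteen combinations, which the typology collapses into six categories.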
Index of Disorder
What is disorder? (Skogan, 1990)
Distinguish between physical presence and social perception
Physical disorder: abandoned buildings, garbage and litter,
graffiti, junk in vacant lots
Social disorder: groups of loiterers, drug use and sales,
vandalism, gang activity, public drinking, street harassment
Index created by averaging scores for each measure
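Averaging item scores into an index can be sketched in Python as follows. The item scores and the 1-to-3 rating scale are hypothetical, and the function name is an assumption made for illustration.

```python
from statistics import mean

def disorder_index(item_scores):
    """Average the scores of the individual measures into one index."""
    return mean(item_scores.values())

# Hypothetical ratings for one neighborhood (1 = not a problem, 3 = big problem).
physical_disorder = {
    "abandoned buildings": 2,
    "garbage and litter": 3,
    "graffiti": 1,
    "junk in vacant lots": 2,
}
social_disorder = {
    "groups of loiterers": 1,
    "drug use and sales": 1,
    "vandalism": 2,
    "gang activity": 1,
    "public drinking": 1,
    "street harassment": 3,
}

physical_index = disorder_index(physical_disorder)
social_index = disorder_index(social_disorder)
print(physical_index, social_index)
```

Combining several items this way means that no single measure dominates the index.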