This document provides an overview and outline of a four-day impact evaluation training curriculum. Day 1 covers introductions and the value of impact evaluation: why evaluate, monitoring versus evaluation, and what impact evaluation is. It then discusses how to implement an impact evaluation by estimating a counterfactual and addressing selection bias. Day 2 covers evaluation design, including causal inference, choosing a method, and the impact evaluation toolbox. The remaining days cover sample design, data collection, and indicator and questionnaire design. The document emphasizes that the choice of evaluation design depends on how the program is implemented and on its rules of operation.
Impact Evaluation Training Curriculum - Activity 267
1. Chris Nicoletti
Activity #267: Analysing the socio-economic
impact of the Water Hibah on beneficiary
households and communities (Stage 1)
Impact Evaluation
Training Curriculum
Session 1
April 16, 2013
2. This material constitutes supporting material for the "Impact Evaluation in Practice" book. This additional material is made freely available, but please acknowledge
its use as follows: Gertler, P. J., Martinez, S., Premand, P., Rawlings, L. B., and Vermeersch, C. M. J., 2010, Impact Evaluation in Practice: Ancillary
Material, The World Bank, Washington DC (www.worldbank.org/ieinpractice). The content of this presentation reflects the views of the authors and not
necessarily those of the World Bank.
MEASURING IMPACT
Impact Evaluation Methods for Policy
Makers
3. 3
• My name is Chris Nicoletti
• From NORC
• Senior Impact Evaluation Analyst.
• Worked in Zambia, Ghana, Cape Verde, Philippines,
Indonesia, Colombia, Burkina Faso, etc.
• Live in Colorado
– I like to ski, hike, climb, bike, etc.
– Married and do not have any children
• What is your name?
• Let’s go around the room and do introductions…
Introduction…
4. 4
Tuesday - Session 1
INTRODUCTION AND OVERVIEW
1) Introduction
2) Why is evaluation valuable?
3) What makes a good evaluation?
4) How to implement an evaluation?
Wednesday - Session 2
EVALUATION DESIGN
5) Causal Inference
6) Choosing your IE method/design
7) Impact Evaluation Toolbox
Thursday - Session 3
SAMPLE DESIGN AND DATA COLLECTION
9) Sample Designs
10) Types of Error and Biases
11) Data Collection Plans
12) Data Collection Management
Friday - Session 4
INDICATORS & QUESTIONNAIRE DESIGN
1) Results chain/logic models
2) SMART indicators
3) Questionnaire Design
Outline: topics being covered
5. 5
Today, we will answer these questions…
1) Why is evaluation valuable?
2) What makes a good impact evaluation?
3) How to implement an impact evaluation?
7. 7
Why Evaluate?
1) Need evidence on what works
– A limited budget and bad policies could hurt
– Results agenda and aid effectiveness
2) Information is key to sustainability
– Budget negotiations
– Informing beliefs and the press
3) Improve program/policy implementation
– Design (eligibility, benefits)
– Operations (efficiency & targeting)
8. 8
Results-Based Management
is a global trend
Establishing links between monitoring and
evaluation, policy formulation, and budgets
Managers are judged by their programs’
performance, not their control of inputs:
A shift in focus from inputs to outcomes.
Critical to effective public sector management
What is new about results?
9. 9
Monitoring vs. Evaluation
• Frequency: monitoring is regular and continuous; evaluation is periodic.
• Coverage: monitoring covers all programs; evaluation covers selected programs and aspects.
• Data: monitoring uses universal data; evaluation is sample-based.
• Depth of information: monitoring tracks implementation (the WHAT); evaluation is tailored, often to performance and impact (the WHY).
• Cost: monitoring costs are spread out; evaluation costs can be high.
• Utility: monitoring supports continuous program improvement and management; evaluation informs major program decisions.
10. 10
Monitoring
A continuous process of collecting and analyzing
information,
to compare how well a project, program or policy is
performing against expected results, and
to inform implementation and program management.
11. 11
Impact Evaluation Answers
What was the effect of the program on
outcomes?
How much better off are the beneficiaries
because of the program/policy?
How would outcomes change if the
program design changes?
Is the program cost-effective?
12. 12
Evaluation
A systematic, objective assessment of an on-going
or completed project, program, or policy, its design,
implementation and/or results,
to determine the relevance and fulfillment of objectives,
development efficiency, effectiveness, impact and
sustainability, and
to generate lessons learned to inform the decision making
process,
tailored to key questions.
13. 13
Impact Evaluation
An assessment of the causal effect of a project, program, or policy on beneficiaries. It uses a counterfactual…
to estimate what the state of the beneficiaries would have been in the absence of the program (the control or comparison group), compared to the observed state of beneficiaries (the treatment group), and
to determine intermediate or final outcomes attributable to the intervention.
14. 14
Impact Evaluation Answers
What is the effect of a household (hh) water connection on hh water expenditure?
Does contracting out primary health care
lead to an increase in access?
Does replacing dirt floors with cement
reduce parasites & improve child health?
Do improved roads increase access to
labor markets & raise income?
15. 15
Answer these questions
1) Why is evaluation valuable?
2) What makes a good impact evaluation?
3) How to implement an impact evaluation?
16. 16
How to assess impact
e.g. How much does an education program improve test scores (learning)?
What is a beneficiary's test score with the program compared to without the program?
Ideally, we would compare the same individual with and without the program at the same point in time.
Formally, program impact is:
α = (Y | P=1) - (Y | P=0)
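In potential-outcomes notation (the notation below is added for clarity, not taken from the slide), the same formula and the problem it raises can be written as:

```latex
% Potential-outcomes restatement of the slide's formula (assumed notation).
% Y_i(1) is unit i's outcome with the program, Y_i(0) its outcome without it.
\alpha_i = Y_i(1) - Y_i(0)
% Only one of the two is ever observed for a given unit, so in practice we target an
% average effect and estimate the missing term with a comparison group:
\alpha_{\mathrm{ATT}} = E\left[\,Y(1)\mid P=1\,\right] - E\left[\,Y(0)\mid P=1\,\right]
```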
17. 17
Solving the evaluation problem
We never observe the same individual with and without the program at the same point in time.
Counterfactual: what would have happened without the program.
The estimated impact is the difference between the treated observation and the counterfactual.
We therefore need to estimate the counterfactual; the counterfactual is key to impact evaluation.
18. 18
Counterfactual Criteria
The treated and counterfactual groups:
(1) have identical characteristics,
(2) except for benefiting from the intervention.
There should be no other reason for differences in the outcomes of the treated and counterfactual groups; the only reason for the difference in outcomes is the intervention.
19. 19
2 Counterfeit Counterfactuals
1) Before and after: the same individual before the treatment.
2) Those not enrolled: those who choose not to enroll in the program, or those who were not offered the program.
20. 20
1. Before and After: Example
A before-and-after comparison does not take into account things that are changing over the intervention period.
Agricultural assistance program: financial assistance to purchase inputs.
Compare rice yields before and after.
Before is a year of normal rainfall, but after is a drought.
We find a fall in rice yield. Did the program fail?
We could not separate (identify) the effect of the financial assistance program from the effect of rainfall.
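This pitfall can be made concrete with a tiny simulation. The numbers below are invented for illustration; they are not from the Water Hibah or any real program:

```python
# Hypothetical simulation: why a before/after comparison misleads when
# conditions (here, rainfall) change over the intervention period.
import numpy as np

rng = np.random.default_rng(0)
n = 1_000

yield_before = 4.0 + rng.normal(0, 0.5, n)   # tons/ha in a normal-rainfall year
program_effect = 0.3                          # true gain from the input subsidy
drought_shock = -0.8                          # rainfall shock in the "after" year
yield_after = yield_before + program_effect + drought_shock + rng.normal(0, 0.5, n)

before_after_estimate = yield_after.mean() - yield_before.mean()
print(f"True program effect:     {program_effect:+.2f}")
print(f"Before/after 'estimate': {before_after_estimate:+.2f}")  # confounded with the drought
```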
21. 21
2. Those not enrolled: Example 1
A job training program is offered. Compare the employment and earnings of those who sign up to those who did not.
Who signs up? Those who are most likely to benefit (i.e., those with more ability), who would have had higher earnings than non-participants even without the job training.
This is a poor estimate of the counterfactual.
22. 22
What's wrong?
Selection bias:
1) People choose to participate for specific reasons.
2) Many times these reasons are related to the outcome of interest.
– Job training: ability and earnings
– Health insurance: health status and medical expenditures
3) We cannot separately identify the impact of the program from these other factors/reasons.
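A similar illustrative simulation (invented numbers, hypothetical variable names) shows how selection on ability inflates a naive enrolled-vs-not-enrolled comparison:

```python
# Hypothetical simulation: selection bias when comparing participants to
# non-participants who self-selected out of a job training program.
import numpy as np

rng = np.random.default_rng(1)
n = 10_000

ability = rng.normal(0, 1, n)                    # unobserved; drives enrollment and earnings
enrolled = (ability + rng.normal(0, 1, n)) > 0   # higher-ability people sign up more often
true_effect = 2.0                                 # true earnings gain from training
earnings = 20 + 3 * ability + true_effect * enrolled + rng.normal(0, 2, n)

naive = earnings[enrolled].mean() - earnings[~enrolled].mean()
print(f"True effect:           {true_effect:.2f}")
print(f"Naive enrolled-vs-not: {naive:.2f}")      # overstated, via selection on ability
```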
23. 23
Possible Solutions???
We need to guarantee the comparability of the treatment and control groups, so that the ONLY remaining difference is the intervention.
In this training we will consider:
• Experimental designs
• Quasi-experiments (regression discontinuity, double differences)
• Non-experimental designs (instrumental variables)
EXPERIMENTAL DESIGN!!!
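Continuing the illustrative simulation above, a sketch of why random assignment yields a valid counterfactual: assignment by lottery breaks the link between ability and participation, so a simple difference in means recovers the true effect. Again, these are assumed numbers, not a real design:

```python
# Illustrative sketch: under random assignment, treatment is independent of
# ability, so the difference in means is an unbiased estimate of the effect.
import numpy as np

rng = np.random.default_rng(2)
n = 10_000

ability = rng.normal(0, 1, n)
treated = rng.random(n) < 0.5            # lottery assignment, independent of ability
true_effect = 2.0
earnings = 20 + 3 * ability + true_effect * treated + rng.normal(0, 2, n)

estimate = earnings[treated].mean() - earnings[~treated].mean()
print(f"True effect: {true_effect:.2f}, randomized estimate: {estimate:.2f}")
```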
24. 2424
Answer these questions
Impact Evaluation Training Curriculum - Activity 267
Why is evaluation valuable?
How to implement an impact
evaluation?
What makes a good impact
evaluation?
1
2
3
25. 25
When to use Impact
Evaluation?
Evaluate impact when project is:
Innovative
Replicable/scalable
Strategically relevant for reducing
poverty
Evaluation will fill knowledge gap
Substantial policy impact
Use evaluation within a program to test
alternatives and improve programs
26. 26
Choosing what to evaluate
Criteria
Large budget share
Affects many people
Little existing evidence of impact for
target population (IndII Examples?)
No need to evaluate everything
Spend evaluation resources wisely
27. 27
IE for ongoing program
Development
Are there potential program
adjustments that would benefit from a
causal impact evaluation?
Implementing parties have specific
questions they are concerned with.
Are there parts of a program that may
not be working?
28. 28
How to make evaluation
impact policy focused
Address policy-relevant questions:
• What policy questions need to be answered?
• What outcomes answer those questions?
• What indicators measure those outcomes?
• How much of a change in the outcomes would determine success?
Example: Scale up the pilot? (i.e., the Water Hibah)
Criteria: need at least an X% average increase in the beneficiary outcome over a given period.
29. 29
Policy impact of evaluation
What is the policy purpose?
Provide evidence for pressing decisions
Design evaluation with policy makers
IndII Examples???
30. 30
Policy impact of evaluation
Cultural shift:
• From retrospective evaluation: look back and judge.
• To prospective evaluation: decide what we need to learn, experiment with alternatives, measure and inform, and adopt better alternatives over time.
Change in incentives:
• Rewards for changing programs.
• Rewards for generating knowledge.
• Separating job performance from knowledge generation.
31. 31
• Choosing what to evaluate is something that
should take time and careful consideration.
• Impact evaluation is more expensive and often
requires third party consultation.
• The questions that require an IE to answer should
be evident in your logic models and M&E plans
from the beginning.
• Remember, IE is an assessment of the causal effect of
a project, program or policy on beneficiaries.
Choice should come from existing
logic models and M&E plans.
33. 33
Retrospective Analysis
Retrospective Analysis is necessary when we
have to work with a pre-assigned program
(expanding an existing program) and existing data
(baseline?)
Examples:
Regression Discontinuity: Education Project (Ghana)
Difference in Differences: RPI (Zambia)
Instrumental variables: Piso firme (México)
34. 34
• Use whatever is available – the data was not collected for
the purposes at hand.
• The researcher gets to choose what variables to test, based on
previous knowledge and theory.
• Subject to misspecification bias.
• Theory is used instrumentally, as a way to provide a
structure justifying the identifying assumptions.
• Less money on data collection (sometimes), more money
on analysis.
• Does not really require “buy in” from implementers or field
staff.
Retrospective Designs
35. 35
Prospective Analysis
In Prospective Analysis, the evaluation is
designed in parallel with the assignment of the
program, and the baseline data can be gathered.
Example: Progresa/Oportunidades (México)
CDSG (Colombia)
36. 36
• Intentionally collect data for the purposes of the impact
evaluation.
• The variables collected in a prospective evaluation are
collected because they were considered potential
outcome variables.
• You should report on all of your outcome variables.
• The evaluation itself may be a form of treatment.
• It is the experimental design that is instrumental - gives
more power both to test the theory and to challenge it.
• More money on data collection, less money on analysis.
• Requires “buy in” from implementers and field staff.
Prospective Designs
37. 37
Prospective Designs
Use opportunities to generate good controls
The majority of programs cannot assign benefits to the entire eligible population;
not all of the eligible receive the program.
Budget limitations:
Eligible beneficiaries that receive benefits are potential treatments
Eligible beneficiaries that do not receive benefits are potential
controls
Logistical limitations:
Those that go first are potential treatments
Those that go later are potential controls
38. 38
• The decision to conduct an impact evaluation was
made after the program began, and ex post
control households were identified.
• We are now trying to use health data from
Puskesmas to “fill in the gaps” of the baseline.
• This would be a retrospective design, because
there was not an experimental design in place for
the roll out of the program.
An example: Socio-econ
impact of Endline Water Hibah
41. 41
How to choose?
Identification strategy depends on
the implementation of the program
Evaluation strategy depends on the
rules of operations
42. 42
Who gets the program?
Eligibility criteria
Are benefits targeted?
How are they targeted?
Can we rank eligibles by priority?
Are measures good enough for fine rankings?
Roll out
Equal chance to go first, second, third?
43. 43
Ethical Considerations
Rollout based on budget/administrative constraints:
• Equally deserving beneficiaries deserve an equal chance of going first.
• Give everyone eligible an equal chance.
• If ranking is based on some criteria, the criteria should be quantitative and public.
• Equity.
• Transparent & accountable method.
• Do not delay benefits.
44. 44
The method depends on the rules of operation
• In stages, without cut-off: Targeted → Randomization; Universal → Randomized Rollout
• In stages, with cut-off: Targeted → RD/DiD or Matching/DiD; Universal → RD/DiD or Matching/DiD
• Immediately, without cut-off: Targeted → Randomized Promotion; Universal → Randomized Promotion
• Immediately, with cut-off: Targeted → RD/DiD or Matching/DiD; Universal → Randomized Promotion
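For reference, the same decision table can be expressed as a small lookup helper. The method labels are the slide's; the function itself is only an illustrative mnemonic, not a substitute for checking each design's assumptions:

```python
# Mnemonic lookup for the table above: (targeted, rollout, cut-off) -> candidate methods.
def candidate_methods(targeted: bool, in_stages: bool, has_cutoff: bool) -> list[str]:
    if in_stages and not has_cutoff:
        return ["Randomization"] if targeted else ["Randomized Rollout"]
    if in_stages and has_cutoff:
        return ["Regression Discontinuity / DiD", "Matching / DiD"]
    if not in_stages and not has_cutoff:
        return ["Randomized Promotion"]
    # Immediate rollout with a cut-off:
    return (["Regression Discontinuity / DiD", "Matching / DiD"] if targeted
            else ["Randomized Promotion"])

print(candidate_methods(targeted=True, in_stages=True, has_cutoff=False))  # ['Randomization']
```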
45. 45
• Provision of services to villages and households under the Water
Hibah is not determined by randomization, but by assessment and
WTP.
• The dataset design exhibits some characteristics of a controlled
experiment with connected and unconnected, but connection decision
is not determined by randomization.
• Household matching is not an efficient method with the potential
discrepancies we identified in the pilot test, and does not work very
well with the sample design that was chosen.
• Village-level matching is not feasible because there are usually
connected and unconnected in a single village (locality).
The design we have chosen is a pretest-posttest, nonequivalent control group quasi-experimental design that will use regression-adjusted difference-in-differences impact estimators.
An example: Socio-econ
impact of Endline Water Hibah
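The deck does not include estimation code, but a minimal sketch of a regression-adjusted difference-in-differences estimator of the kind described above might look as follows. All column names, the clustering variable, and the file name are assumptions for illustration, not the project's actual data structure:

```python
# Sketch of a regression-adjusted DiD estimate on a household panel (assumed schema).
import pandas as pd
import statsmodels.formula.api as smf

# Assumed columns: water_exp (outcome), connected (1 = treatment group),
# post (1 = endline round), hh_size and floor_material (baseline covariates),
# village_id (cluster identifier).
df = pd.read_csv("hibah_panel.csv")  # hypothetical file name

model = smf.ols("water_exp ~ connected * post + hh_size + C(floor_material)", data=df)
result = model.fit(cov_type="cluster", cov_kwds={"groups": df["village_id"]})

# The coefficient on the interaction term is the DiD impact estimate.
print(result.params["connected:post"])
```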
47. 47
Types of Sample Designs
• Random sampling
• Multi-stage sampling
• Systematic sampling
• Stratified sampling
• Convenience sampling
• Snowball sampling
Plus any combination of them!
48. 48
• It is important to note that sample design can be
extremely complex.
• A good summary is provided by Duflo (2006):
• The power of the design is the probability that, for a given effect size and a given
statistical significance level, we will be able to reject the hypothesis of zero effect.
Sample sizes, as well as other (evaluation & sample) design choices, will affect
the power of an experiment.
• There are lots of things to consider, such as:
• The impact estimator to be used; The test parameters (power level, significance
level); The minimum detectable effect; Characteristics of the sampled (target)
population (population sizes for potential levels of sampling, means, standard
deviations, intra-unit correlation coefficients (if multistage sampling is used)); and
the sample design to be used for the sample survey
A good sample design requires
expert knowledge
49. 49
The basic process is this…
• Level of power
• Level of the hypothesis tests (significance level)
• Correlations in outcomes within groups (ICCs)
• Mean and variance of outcomes & MDES
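As a rough illustration of that process, the sketch below backs out a minimum detectable effect size (MDES) from an assumed power level, significance level, ICC, and cluster sizes. All numbers are placeholders, not the values used for the Water Hibah design:

```python
# Sketch: choose power and significance, assume an ICC and cluster sizes,
# apply a design effect, and solve for the minimum detectable effect size.
from statsmodels.stats.power import TTestIndPower

power = 0.80                     # desired level of power
alpha = 0.05                     # significance level of the hypothesis test
icc = 0.05                       # assumed intra-cluster correlation
households_per_village = 10      # assumed second-stage sample size
n_villages_per_arm = 250         # assumed first-stage sample size per group

# Design effect for clustered (multi-stage) sampling.
deff = 1 + (households_per_village - 1) * icc
effective_n = n_villages_per_arm * households_per_village / deff

# Smallest standardized effect detectable with this design (in standard deviations).
mdes = TTestIndPower().solve_power(nobs1=effective_n, alpha=alpha, power=power,
                                   ratio=1.0, alternative="two-sided")
print(f"Design effect {deff:.2f}, effective n per arm {effective_n:.0f}, MDES ~ {mdes:.3f} SD")
```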
50. 50
The reality is…
• Most times, you do not have all of this information.
– Use existing studies, other data sources, and assumptions.
• You may be working backwards to fit a certain power level.
• You may be working backwards from the expected level of impact that you want to test for.
• You are often working backwards to fit a certain budget!
– Build in marginal costs for each stage of sampling.
• Decide whether or not to pursue the project.
51. 51
An example: Socio-econ impact of Endline Water Hibah
• Outcome indicators: we have simplified versions of them in the baseline, but they have been modified for the endline → use the baseline dataset to calculate ICCs.
• The highest variation in outcome indicators was identified across villages (localities) → the primary sampling unit is the village.
• The number of households in the village was found to improve the efficiency of the design → stratify villages based on the number of households.
• Marginal costs of a village visit vs. a household visit were included.
• The final sample design that was identified is a stratified multi-stage sample with 250 villages and 7-14 households per experimental group = 7,000 hhs.
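A sketch of how such a stratified, multi-stage draw could be implemented is shown below. The file names, column names, and strata definitions are hypothetical; the actual Water Hibah sample was drawn from its own frame and procedures:

```python
# Illustrative two-stage draw: stratify villages by number of households,
# sample villages within strata, then sample households within each village.
import pandas as pd

frame = pd.read_csv("village_frame.csv")      # hypothetical frame: village_id, n_households
frame["stratum"] = pd.qcut(frame["n_households"], q=4, labels=False)  # size-based strata

# Stage 1: sample villages proportionally within each stratum (~250 in total).
villages = (frame.groupby("stratum", group_keys=False)
                 .apply(lambda g: g.sample(frac=250 / len(frame), random_state=0)))

# Stage 2: sample up to 14 households per selected village from a household listing.
households = pd.read_csv("household_listing.csv")  # hypothetical: village_id, household_id
sample = (households[households["village_id"].isin(villages["village_id"])]
          .groupby("village_id", group_keys=False)
          .apply(lambda g: g.sample(n=min(len(g), 14), random_state=0)))
print(len(villages), "villages,", len(sample), "households")
```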
52. What can IndII Do?
Ensure your M&E systems are relevant
and reliable…
53. 53
Data: Coordinate IE & Monitoring Systems
Projects/programs regularly collect data for management purposes.
Typical content:
• Lists of beneficiaries
• Distribution of benefits
• Expenditures
• Outcomes
• Ongoing process evaluation
This information is needed for the impact evaluation.
54. 54
Manage M&E for results
Prospective evaluations are easier and better with reliable M&E:
• Tailor policy questions
• Precise, unbiased estimates
• Use your resources wisely
• Better methods
• Cheaper data
• Timely feedback and program changes
• Improve results on the ground
55. 55
Evaluation uses this information to verify:
• who is a beneficiary,
• when they started, and
• what benefits were actually delivered.
A necessary condition for the program to have an impact: benefits need to get to the targeted beneficiaries.
56. 56
Overall Messages
Impact evaluation is useful for:
• Validating program design
• Adjusting program structure
• Communicating to the finance ministry & civil society
A good evaluation design requires estimating the counterfactual:
• What would have happened to beneficiaries if they had not received the program
• We need to know all the reasons why beneficiaries got the program & others did not
57. 57
Other messages
Good M&E is crucial not only to effective project management but can also be a driver of reform
Monitoring and evaluation are separate, complementary functions, but both are
key to results-based management
Have a good M&E plan before you roll out your project and use it to inform the
journey!
Design the timing and content of M&E results to further evidence-based
dialogue
Good monitoring systems & administrative data can improve IE.
Easiest to use prospective designs.
Stakeholder buy-in is very important