3. Origins of RTE
Pivotal conflicts: Persian Gulf Crisis, Rwanda, Kosovo
Early agencies: UNHCR, World Bank, Danida
Humanitarian reform: OCHA, ReliefWeb, IASC, ALNAP
4. Rationale
Humanitarian evaluation “tends to
mirror humanitarian practice – it is often
rushed, heavily dependent on the skills
of its key protagonists, ignores local
capacity, is top-down, and keeps an
eye on the media and public relations
implications of findings”
- Feinstein & Beck (2006)
5. Rationale
“atheoretical and method-driven”– a
less thoughtful and rigorous cousin of
mainstream evaluation.
- Feinstein & Beck (2006)
The 'wild west' of evaluation
- AES conference, Canberra, 2009
6. The centrality of theory
Without a strong theoretical base, “we
are no different from the legions of
others who also market themselves as
evaluators today”
- Shadish (1998)
7. Aid evaluation emergent
At the time of the Rwanda evaluation,
“there were no manuals, guidelines or
good practice notes to follow on
evaluating humanitarian action”
- Beck (2006)
8. The value of research on evaluation
Rigorous and systematic study “can
provide essential information in the
development of an evidence base for
a theoretically rooted evaluation
practice, as well as provide the
evidentiary base for the development
of practice-based theory”
- Miller (2010)
9. Research questions
1. What is the conceptual logic behind real-time evaluation?
2. How is real-time evaluation applied in practice?
3. How can the theory and practice of real-time evaluation be strengthened?
10. Methodology
Drawn from:
Miller and Campbell (2006): multistage sampling approach; examination of fidelity between theory and practice (a scoring sketch follows below).
Hansen et al. (in press): logic modeling from the coding framework.
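To make the fidelity comparison concrete, the sketch below shows one way such coding could work. It is a hypothetical illustration, not the study's instrument: the indicator names are paraphrased from the proxy list in the editor's notes at the end of this transcript, which lists 13 questions, while the reported scores are out of 14.

```python
# Hypothetical sketch of fidelity scoring, not the study's actual instrument.
# Each RTE report is coded against binary proxy indicators; the fidelity
# score is the share of indicators the report satisfies.

INDICATORS = [
    "describes_methods",            # methods used are described
    "five_to_ten_recommendations",  # report has 5-10 recommendations
    "inception_report_included",
    "matrix_of_evidence",
    "matrix_of_recommendations",
    "triangulation_and_validity",
    "timeline_of_events",
    "beneficiary_consultation",
    "group_interviews",
    "one_to_four_evaluators",
    "report_15_to_40_pages",
    "seven_to_21_field_days",
    "results_workshop_in_field",
]

def fidelity_score(report_codes: dict) -> float:
    """Proportion of proxy indicators met by a coded report."""
    met = sum(report_codes.get(name, False) for name in INDICATORS)
    return met / len(INDICATORS)

# Example: a report meeting six indicators scores ~46%, comparable to the
# pre-2009 average of 6.25 out of 14 (~45%) shown on slide 22.
codes = dict.fromkeys(INDICATORS, False)
for name in INDICATORS[:6]:
    codes[name] = True
print(f"{fidelity_score(codes):.0%}")  # -> 46%
```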
12. Espoused theory
Six items of literature
Broughton, WFP (2001)
Jamal and Crisp, UNHCR (2002)
Sandison, UNICEF (2003)
Cosgrave, Ramalingham & Beck (2009)
Walden, Scott & Lakeman (2010)
Brusset, Cosgrave, MacDonald, ALNAP (2010)
13. Logic of theory
Context
• 4-12 weeks into a large, rapid-onset single- or multi-agency program. May be concerns about performance. Timeliness is vital. May utilize a series of visits.
• Internal and external stakeholders, under great pressure. Planning with stakeholders essential. HQ and donors need information.
• Evaluator's role is impartial outsider, advisor and facilitator. Must collate credible information.
• 1-4 highly skilled evaluators with a supportive manner and diverse backgrounds.
Activities
• Data collection via semi-structured interviews and observation, with purposive sampling. Reflection workshops, focus groups, limited document analysis.
• Analysis concurrent with data collection, with input from the country team via workshop.
• Stakeholder participation: country team involvement crucial, beneficiary engagement strongly encouraged. Management responds to findings.
• Multiple methods of report generation. Written reports secondary to briefings and workshops. Rapid dissemination. Use of linking tool.
Consequences / effects
• Realistic design and flexible plans. Evaluation is timely, credible and responsive to information needs.
• Evaluator credibility from effective planning and short-term improvements.
• Country team learning, reflection and improved morale.
• Immediate instrumental use. Stronger understanding of context. Better guidelines and policy. Greater transparency. Stronger organizational capacity, learning and decision-making in situ. Improved M&E systems and institutional learning.
• Improved outcomes for survivors of humanitarian emergencies.
Assumptions
Emergency response is difficult to monitor and evaluate. RTE is more interactive than standard humanitarian evaluation. Utilization is increased with staff engagement. Organizational change is slow.
External factors
Complex and difficult programming environment, subject to rapid changes. Time, logistics and security constraints.
15. Logic of practice
Figure 1: Logic model for real-time evaluation from RTE reports
Context
• Large-scale single- or multi-agency response to sudden-onset or rapidly deteriorating crises. Soon after the establishment phase. Learning opportunities present.
• Information needed for field personnel, external stakeholders and future organizational response.
• Primary stakeholders are agency staff and their partners. Will be under pressure.
• 3-4 internal or external evaluators with sectoral and management expertise and balanced profiles.
Activities
• Planning via a reference group and pre-RTE field mission. Terms of reference set, then evaluators develop the design after initial data collection and consultation.
• Single- or multi-phase design with light footprint. 11-21 field days. 36-111 informants, 40-133 beneficiaries.
• Data collection: semi-structured interviews, primarily with field personnel. Documents, field visits and focus groups. Personnel, beneficiaries and external stakeholders provide data. Findings depend upon recollections of informants, triangulated.
• Findings and recommendations reviewed via reflection workshops with field teams. Formal management response.
• Evaluator provides support, guidance, outside perspective and accountability.
• Multiple reporting methods including oral presentations at field and headquarters. 19-43 page reports.
Consequences / effects
• RTE promotes staff reflection, communication, coordination, learning and accountability.
• Evaluator credibility established through transparency and meta-evaluation.
• Detailed action plans established with the ownership of teams. Broader policy development.
• Organizational capacity enhanced via reflection, communication and chronicling events.
• Improved outcomes for survivors of humanitarian emergencies.
Assumptions
Early deployment leads to influence. Ownership and participation facilitate utilization. RTEs measure opinions but not impact. RTEs will result in important lessons learned.
External factors
Frequent changes, rapid staff turnover and competing field missions. Sensitive political environment, security threats. Compromised infrastructure, living and working conditions. Remote travel required. Visas and recruitment cause delays.
16. Contrasts in logic models
Impetus
• Theory: Concerns about programme performance.
• Practice: Silent on these concerns.
Evaluator
• Theory: Agency knowledge.
• Practice: External evaluators with sectoral expertise and diverse backgrounds.
Planning
• Theory: Field-based planning.
• Practice: Reference groups.
17. Contrasts in logic models
Stakeholders
• Theory: Field and management response.
• Practice: More optimistic picture of beneficiary consultation.
Credibility
• Theory: Effective planning.
• Practice: Relationships, transparency, meta-evaluation.
Organizational capacity
• Theory: Establishing M&E systems.
18. Contrasts in logic models
Process use
• Theory: Learning.
• Practice: Communication and coordination.
Utilization
• Theory: Understanding at headquarters.
• Practice: Field team ownership and action plans.
Constraints
• Practice: Political environment.
19. Contrasts in logic models
Assumptions
• Theory: Modest expectations of organizational change.
• Practice: The importance of lessons learned.
Overall
• Practice has a stronger emphasis on bottom-up influence and approaches.
21. Change in scores
pre- and post-ALNAP guide
Element | Pre-March 2009 | Post-March 2009 | % change
Median no. of beneficiaries | 40 beneficiaries | 136 beneficiaries | 240%
Matrix of recommendations | 9% | 29% | 222%
5 to 10 recommendations | 16% (average 24) | 33% (average 15) | 106%
Inception report included | 9% | 17% | 89%
List of informants | 41% | 63% | 54%
Group interviews | 50% | 67% | 34%
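For readers checking the arithmetic, the % change column is ordinary relative change; using the beneficiaries row as a worked example:

```latex
\%\ \text{change} = \frac{\text{post} - \text{pre}}{\text{pre}} \times 100,
\qquad \frac{136 - 40}{40} \times 100 = 240\%.
```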
22. Change in scores
pre- and post-ALNAP guide
Element | Pre-March 2009 | Post-March 2009 | % change
Workshop in field | 59% | 75% | 27%
Average fidelity score | 45% (6.25 out of 14) | 53% (7.38 out of 14) | 18%
Beneficiary consultation | 78% | 88% | 13%
1 to 4 evaluators | 81% | 88% | 9%
7 to 21 days in field | 50% (median 13 days) | 54% (median 16 days) | 8%
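The average fidelity percentages follow directly from the raw scores over the 14-item maximum:

```latex
\frac{6.25}{14} \approx 44.6\% \approx 45\%,
\qquad \frac{7.38}{14} \approx 52.7\% \approx 53\%.
```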
23. Change in scores
pre- and post-ALNAP guide
Element | Pre-March 2009 | Post-March 2009 | % change
Describes methods | 97% | 92% | -5%
Describes triangulation and validity | 38% | 33% | -13%
Timeline | 38% | 33% | -13%
Report 15 to 40 pages | 59% (average 30 pages) | 46% (average 38 pages) | -22%
Matrix of evidence | 0% | 21% | N/A
Median change (all scores): 18%
24. Highest fidelity scores
Found in Humanitarian
Accountability Project, IASC and
among external evaluators.
Many of these evaluators also
contributed to the literature on RTE.
25. Lowest scores
Found in mixed and internal teams and in multi-country evaluations.
Some reports appear to be labeled RTE simply for the cachet of the term.
26. In theory
Is RTE new?
Though described as innovative, it has many
antecedents outside the humanitarian field.
Is RTE evaluation at all?
'Purists' would argue that it's pseudoevaluation. RTE is part of an increasingly vague distinction between evaluators and organizational development consultants.
28. Utilization-focused
evaluation
Type of use | Potential for real-time evaluation
Instrumental | Influence actions and decisions. Develop action plans. Change policies and programs.
Conceptual | Lessons learned for country teams, headquarters and donors.
Symbolic | Information for donors. Demonstrate transparency, accountability.
Process | Communication, coordination and morale.
29. Developmental
evaluation
Program in a continuous state of
change; operations will never
become fixed or stable.
Patton (2008)
Not to prove
…but to improve
Krueger & Sagmeister (2012), Stufflebeam (2004)
30. Connoisseurship
“There are no algorithms, rules, recipes
or the like to use”
Eisner (2004)
Expert-led, lightweight and agile
design.
Credibility (and supply) of experts a
key limitation.
Stufflebeam & Shinkfield (2007), Miller (2010)
31. Summary
There is a strengthening relationship between theory and practice.
A strong logic is emerging from RTE.
RTE has roots in mainstream evaluation, especially developmental and utilization-focused approaches. Must be wary of the risks of connoisseurship.
32. Suggestions
Humanitarian evaluators: stronger engagement with theory and better training in the available guidance.
Mainstream theorists: attention to the specificities of emergencies, to adapt traditional models.
Further research on evaluation in humanitarian programs.
33. Thank you
Jess Letch
Masters candidate
University of Melbourne, Australia
jessicaletch@gmail.com
Special thanks to supervisor Brad Astbury
Special acknowledgement to Ros Hurworth
34. References
Alkin, M. C., & Christie, C. A. (2004). An evaluation theory tree. In M. C. Alkin (Ed.), Evaluation roots: tracing
theorists' views and influences (pp. 12-65). Thousand Oaks, Calif.: Sage Publications.
Beck, T. (2006). Evaluating humanitarian action using the OECD-DAC criteria: An ALNAP guide for
humanitarian agencies: ALNAP.
Broughton, B. (2001). Proposal outlining a conceptual framework and terms of reference for a pilot real-time evaluation. Canberra: World Food Programme, Office of Evaluation.
Brusset, E., Cosgrave, J., & MacDonald, W. (2010). Real-time evaluation in humanitarian emergencies. New Directions for Evaluation, 126, 9-20.
Cosgrave, J., Ramalingham, B., & Beck, T. (2009). Real-Time Evaluations of Humanitarian Action: An ALNAP
Guide (Pilot Version): ALNAP.
Eisner, E. (2004). The roots of connoisseurship and criticism: A personal journey. In M. C. Alkin (Ed.), Evaluation roots: Tracing theorists' views and influences. Thousand Oaks, Calif.: Sage Publications.
Feinstein, O., & Beck, T. (2006). Evaluation of Development Interventions and Humanitarian Action. In I. F.
Shaw, J. C. Greene & M. M. Mark (Eds.), The Sage Handbook of Evaluation. London: Sage.
Hansen, M., Alkin, M. C., & LeBaron Wallace, T. (in press). Depicting the logic of three evaluation theories.
Evaluation and Program Planning. doi: 10.1016/j.evalprogplan.2012.03.012
Jamal, A., & Crisp, J. (2002). Real-time humanitarian evaluations: Some frequently asked questions. UNHCR, Evaluation and Policy Analysis Unit.
Krueger, S., & Sagmeister, E. (2012). Real-Time Evaluation of Humanitarian Assistance Revisited: Lessons
Learned and the Way Forward. Paper presented at the European Evaluation Society, Helsinki.
Miller, R. L. (2010). Developing standards for empirical examinations of evaluation theory. American Journal of Evaluation, 31(3), 390-399. doi: 10.1177/1098214010371819
35. References
Miller, R. L., & Campbell, R. (2006). Taking stock of empowerment evaluation: An empirical review.
American Journal of Evaluation, 27(3), 296-319.
Owen, J. M., & Rogers, P. J. (1999). Program evaluation: Forms and approaches. Retrieved from SAGE Research Methods database: http://srmo.sagepub.com/view/program-evaluation/SAGE.xml doi:10.4135/9781849209601
Patton, M. Q. (2008). Utilization-Focused Evaluation. Thousand Oaks, California: Sage Publications.
Sandison, P. (2003). Desk Review of Real-Time Evaluation Experience. New York: UNICEF.
Shadish, W. R. (1998). Evaluation theory is who we are. American Journal of Evaluation, 19(1), 1-19.
Shadish, W. R., Cook, T. D., & Leviton, L. C. (1991). Foundations of program evaluation: theories of practice.
Newbury Park, CA: Sage Publications.
Stake, R. E. (2004). Stake and responsive evaluation. In M. C. Alkin (Ed.), Evaluation roots: Tracing theorists' views and influences (pp. 204-216). Thousand Oaks, Calif.: Sage Publications.
Stufflebeam, D. L. (2004). The 21st Century Cipp Model. In M. Alkin (Ed.), Evaluation Roots: Tracing Theorists'
Views and Influences (pp. 245-266). Thousand Oaks, Calif.: Sage Publications.
Stufflebeam, D. L., & Shinkfield, A. J. (2007). Evaluation theory, models, and applications. San Francisco: Jossey-Bass.
Walden, V. M., Scott, I., & Lakeman, J. (2010). Snapshots in time: using real-time evaluations in humanitarian
emergencies. Disaster Prevention and Management, 19(3), 8.
Editor's notes
Criteria:
• Comprehensive description or guidance for the rationale, theory and practice of real-time evaluation.
• Discernible influence upon the subsequent development of real-time evaluation.
• Consolidation of practice wisdom or empirical research.
Of the 56 reports identified, 47 described operations in single countries; nine spanned two or more countries. The countries with the most studied programmes were Pakistan (n=8) and Haiti (n=6); together these countries represented 25% of the sample. All six of the Haiti evaluations (from five different agencies) related to the 2010 earthquake. The Pakistan evaluations, from four different agencies, examined the earthquake, floods and civil conflict. The largest number of evaluations studied related to conflict (n=16) or situations of violence (n=3). 59 per cent (n=33) related to natural disasters: cyclone, drought, earthquake, epidemic, flood or tsunami. Floods (n=8) were the most studied response; five of these reports related to floods in Pakistan.
Overall, yes – a modest upward trend in the similarity between theory and practice according to the proxy indicators – but still a wide distribution and a lot of variance between actors. We would expect this, as theory should inform practice and vice versa. Indeed, many of those writing the guiding texts are also producing the case examples.
Measurable proxy indicators of similarity between theory and practice:
• Does the report describe the methods used to carry out the evaluation?
• Does the report have 5-10 recommendations?
• Is there an inception report included in the report?
• Does the report include a matrix of evidence?
• Does the report include a matrix of recommendations?
• Does the report refer to methods used to triangulate and validate data?
• Is there a timeline of events related to the humanitarian emergency?
• Does the report refer to consultation or data collection with beneficiaries?
• Does the report refer to group interviews with the affected population?
• Does the evaluation team include 1-4 evaluators?
• Is the final report (excluding annexes) 15-40 pages?
• Does the evaluation team spend 7-21 days in the field?
• Does the report mention a results workshop in the field?
Is RTE new? This is an important question, as we have an approach called "real-time evaluation" that has emerged seemingly out of thin air. Many writers claim that RTE is innovative. Is this a new approach, or an existing approach with a new name? Writers on RTE point to formative and developmental evaluation. In some ways RTE is no different from other rushed field reviews; the only difference is the timing.
Is RTE evaluation? It is important to ask this question: RTE doesn't conform to the strict old-fashioned notion of evaluation as a "systematic investigation of the worth or merit of an object".* Here, different schools emerge: those who see evaluation as satisfying a wide range of uses, and others who see this as pseudoevaluation – a guiding hand from a technical expert, an organizational development consultant's role and not much of an evaluation at all.
* Joint Committee on Standards for Educational Evaluation. (1994). The program evaluation standards: How to assess evaluations of educational programs (2nd ed.). Thousand Oaks, CA: Sage.
1st stage: methods and evidence. 2nd stage: relationships and utilization. 3rd stage: long-range learning plus utilization.
Methods: truth and evidence. Valuing: establishing the value of a program approach. Use: instrumental or conceptual use by stakeholders.
But lacks some elements of Quinn Patton's approach:
• Establishing a leadership group among users to determine outcomes.
• Training staff on evaluation approaches and obtaining their buy-in.
• Sharing design decisions with a working team.
• Piloting data collection tools, running mock scenarios of findings.
• Involving staff in data interpretation and analysis.
To be distinguished from formative evaluation: the process of improving and preparing a program for post-hoc, summative evaluation (Patton, 2008), which can avert the difficulties that often plague evaluations (Scriven, 1991). In developmental evaluation, the program never becomes fixed.