A presentation by Prof. Karthik Muralidharan on research on achieving universal quality primary education in India. This was presented at the Commission for Science and Technology (COSTECH) in Dar es Salaam, Tanzania, on June 19, 2014, to an audience of researchers.
Karthik Muralidharan on research on achieving universal quality primary education in India
1. Karthik Muralidharan
UC San Diego, NBER, BREAD, and J-PAL
COSTECH
Dar Es Salaam, 19 July 2014
Achieving universal quality
primary education in India
Lessons from the Andhra Pradesh Randomized
Evaluation Studies (AP RESt)
3. 3
There have been sharp improvements in various measures of school quality in the past decade
[Chart: measures of school quality, 2003 vs 2010, axis 0 to 100]
Source: Kremer et al (2005) for 2003 data; Muralidharan et al (2013) for 2010 data; Enrollment data from World Bank (2003) and ASER (2010)
4. 4
Despite improvements in inputs, learning levels are alarmingly low
Source: ASER 2012
Basic Arithmetic
• Children in class 1 who can't subtract: 96%
• Children aged 6-14 who can't subtract: 59%
Basic Reading
• Children in class 1 who can't read at grade level: 93%
• Children aged 6-14 who cannot read a 2nd-class level paragraph: 62%
6. 6
Broad objectives of AP RESt (Andhra Pradesh Randomized Evaluation Studies)
1. Move the focus of education policy from outlays to outcomes
• Measure and document levels and trajectories of student learning
• Imperative that policy be based on outcomes: very narrow window for ‘demographic dividend’ (10-15 years at most)
2. Focus systematically on teacher motivation and effectiveness
• Strong suggestive evidence that teachers are the main lever of education policy in improving learning outcomes
• In India, over 90% of non-capital spending goes to teacher salaries
3. Improve the empirical orientation of education policy making by:
• Rigorous evaluations of what works and relative effectiveness of different policy options
• Critical in a world of limited resources
• Budgetary increases must translate to improved outcomes
7. 7
AP RESt is a multi-stakeholder partnership
• Government of Andhra Pradesh (GoAP)
- Main client: project initiated at request of Principal Secretary, Education
- All relevant letters of permission and administrative support
- Financial contribution (cost of contract teachers; direct contribution)
• Azim Premji Foundation
- Main counterpart to MoU with GoAP
- Fully responsible for all aspects of project implementation, school communications, test administration, and data collection
- Over 50 full time project staff and 750 part-time evaluators
- Continuous engagement with government
- Financial contribution as well
• World Bank
- Technical support
- Financial support (mainly through DFID)
- Institutional continuity with government (6 secretaries in 6 years!)
• Educational Initiatives
- Test design and scoring, diagnostic and gain reports to schools
9. 9
How do you evaluate the impact of large social sector programs?
Let’s use mid-day meals as our example: What has been the impact of the mid-day meal program?
1: Define outcomes
• Often, even this first step is not undertaken
• Let’s assume it is, and we define some outcomes, e.g. nutrition, attendance and learning
2: Measure outcomes
[Chart: outcome in 2003 vs 2008]
• Is this a valid measure of the impact of the program?
• No, because there are many other things that have changed at the same time
• Need a meaningful comparison group
3: Compare to appropriate control
[Chart: outcome in 2003 vs 2008, Treatment vs Control]
• The control and treatment groups are similar in all other ways except for the program
• The difference in the outcome measure between the two is a measure of the impact of the mid-day meals program
We use a randomised evaluation methodology: the “gold standard” in social science research
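The contrast between a before-after comparison and a treatment-control comparison can be sketched in a few lines of Python. All numbers below are made up for illustration (they are not mid-day meal or AP RESt data), and the variable names are hypothetical. The sketch assumes the control schools experienced the same background changes as the treated schools, which is what random assignment is meant to deliver on average.

```python
# Hypothetical illustration only: none of these numbers come from the study.

def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)

# Outcome (say, an attendance rate in %) for schools that got the program.
treatment_2003 = [60.0, 62.0, 58.0, 64.0]
treatment_2008 = [72.0, 75.0, 70.0, 75.0]

# Comparable schools that did not get the program, over the same period.
control_2003 = [61.0, 59.0, 63.0, 61.0]
control_2008 = [69.0, 66.0, 70.0, 67.0]

# Before-after comparison: mixes the program effect with everything else
# that changed between 2003 and 2008.
before_after = mean(treatment_2008) - mean(treatment_2003)

# Change in the control group: an estimate of "everything else".
background_change = mean(control_2008) - mean(control_2003)

# Treatment vs control at endline: the comparison described in step 3.
treatment_vs_control = mean(treatment_2008) - mean(control_2008)

print(f"before-after estimate:      {before_after:.1f}")
print(f"background change:          {background_change:.1f}")
print(f"treatment-control estimate: {treatment_vs_control:.1f}")
```

In this made-up example the before-after estimate overstates the program's impact, because most of the improvement also shows up in schools that never received the program.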
10. 10
We tested five specific interventions, with a mix of input- and incentive-based policies
• Feedback + monitoring (input only)
- Motivation: One reason learning levels may be low is teachers don’t know how to help students. Can better information help?
- Intervention: Existing teachers provided with detailed feedback on students and subject to low-stakes monitoring
• Contract teachers (mix input-incentive)
- Motivation: Use of contract teachers is widespread, but highly controversial. Are contract teachers effective?
- Intervention: Schools provided with additional teacher (on contract)
• Block grants (input only)
- Motivation: Significant amounts of money committed under RTE. What is the effectiveness of such spending?
- Intervention: Schools provided cash grants for student inputs
• Performance pay ×2 (incentive only)
- Motivation: Teacher salaries are the largest component of education spending in India, but a poor predictor of outcomes. Can linking pay to performance improve outcomes?
- Intervention: Teachers eligible for bonuses based on improved student performance (either in own class or whole school)
11. 11
Location of study
• Andhra Pradesh (AP)
- 5th most populous state in India
- Population of 80 million
- 23 Districts (2-4 million each)
• Close to All-India averages on many measures of human development

Measure                        India   AP
Gross Enrollment (6-11) (%)    95.9    95.3
Literacy (%)                   64.8    60.5
Teacher Absence (%)            25.2    25.3
Infant Mortality (per 1,000)   63      62
12. 12
Randomization was stratified at the
sub-district level
1. First, we chose 5 districts across three distinct ‘regions’ within AP
2. Then, within each district we randomly chose 10 mandals (blocks)
3. Then, within each mandal we randomly chose 12 schools
4. Finally, of these, we assigned 2 to each treatment and 2 to control
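The assignment step (step 4) can be sketched as follows. This is a minimal illustration, not the study's actual procedure: the arm labels, the `(district, mandal, school)` placeholder IDs, and the seed are assumptions, and the random sampling of mandals and schools in steps 2 and 3 is not modelled. Within every mandal, the 12 sampled schools are shuffled into 2 schools for each of the five treatments and 2 for control.

```python
import random
from collections import Counter

# Hypothetical labels for the five treatments plus the control group.
ARMS = [
    "feedback_monitoring",
    "block_grant",
    "contract_teacher",
    "individual_bonus",
    "group_bonus",
    "control",
]
SCHOOLS_PER_ARM_PER_MANDAL = 2


def assign_mandal(school_ids, rng):
    """Randomly assign one mandal's schools: 2 to each arm."""
    labels = [arm for arm in ARMS for _ in range(SCHOOLS_PER_ARM_PER_MANDAL)]
    if len(school_ids) != len(labels):
        raise ValueError("expected exactly 12 schools per mandal")
    rng.shuffle(labels)
    return dict(zip(school_ids, labels))


def assign_all(n_districts=5, mandals_per_district=10, schools_per_mandal=12, seed=0):
    """Stratified assignment: randomize separately inside every mandal."""
    rng = random.Random(seed)  # fixed seed so the assignment is reproducible
    assignment = {}
    for d in range(n_districts):
        for m in range(mandals_per_district):
            schools = [(d, m, s) for s in range(schools_per_mandal)]
            assignment.update(assign_mandal(schools, rng))
    return assignment


assignment = assign_all()
counts = Counter(assignment.values())
print(len(assignment), dict(counts))
```

Because the shuffle happens inside each mandal, every arm ends up with the same number of schools in every mandal, so arms are balanced on anything that varies at the mandal or district level.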
13. 13
Summary of Experimental Design
• Study conducted across a representative sample of 600 primary schools in AP
• Conduct baseline tests in these schools (June/July 05) [process pilots in 04-05]
• Stratified random allocation of 100 schools to each treatment (2 schools in each
mandal to each treatment) (August 05)
• Monitor process variables over the course of the year via unannounced
monthly tracking surveys (Sep 05 – Feb 06)
• Conduct 2 rounds of endline tests to assess the impact of various interventions
on learning outcomes (March/April 06)
• Interview teachers after program but before outcomes are communicated to
them (July 06)
• Continue interventions for measuring 2-year impact (July/August 06)
14. 14
Review of Key Steps
1. Define the research question(s)! Why does it matter? What is the likely mechanism of impact?
2. Identify the evaluation methodology. Internal & external validity. Why did an experiment make sense in this case?
3. Fine tune the details: pilot and refine measurement instruments, power and sample size calculations, get feedback on design
4. Making it happen: Identify sites, implementation partners and structure, permissions, funding, key personnel
5. Conduct baseline (is this always necessary?). Do randomization, implement treatments, monitor process and outcomes
6. Data cleaning & management, analysis, writing papers/reports, presenting for feedback, refine, peer-review, disseminate
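For the power and sample size calculations in step 3, one common back-of-the-envelope tool is the minimum detectable effect (MDE) for a design randomized at the school level. The sketch below uses the standard normal-approximation formula for two equally sized arms. It is not taken from the presentation: the 100 schools per arm matches the design described earlier, but the students-per-school figure and the intra-cluster correlation are purely hypothetical, and the formula ignores stratification and baseline covariates (both of which would typically improve precision).

```python
from math import sqrt
from statistics import NormalDist


def minimum_detectable_effect(clusters_per_arm, students_per_cluster, icc,
                              alpha=0.05, power=0.80):
    """MDE in standard-deviation units for a two-arm cluster-randomized trial.

    Normal approximation, equal arms, outcome variance normalized to 1:
    Var(difference in means) = 2 * (icc + (1 - icc) / n) / J
    """
    z = NormalDist()
    multiplier = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    design_variance = icc + (1 - icc) / students_per_cluster
    return multiplier * sqrt(2 * design_variance / clusters_per_arm)


# 100 schools per arm as in the design; 80 students per school and an
# intra-cluster correlation of 0.2 are hypothetical placeholders.
mde = minimum_detectable_effect(clusters_per_arm=100,
                                students_per_cluster=80,
                                icc=0.2)
print(f"MDE: {mde:.3f} standard deviations")
```

With clustering, adding schools reduces the MDE much faster than adding students within a school, which is one reason designs of this kind spread the sample across many schools.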
16. 16
Teachers in feedback + monitoring schools appeared to perform better on measures of teaching activity
[Chart: Difference between feedback + monitoring and comparison schools on various measures of teaching activity, axis 0% to 16%]
*Statistically significant difference
17. 17
However, there was no difference in test scores between students in treatment and comparison schools
[Chart: Outcomes for treatment schools relative to comparison schools, effect size: Teaching Activity 0.107; Student Learning 0.002]
The lack of impact on test scores, despite enhanced teaching activity, suggests that teachers temporarily changed behavior when observed, but did not actively use the feedback reports in their teaching.
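The effect sizes reported on these slides appear to be differences in normalized test scores, expressed in standard deviations of the comparison-group distribution. A minimal sketch of that kind of calculation follows. The scores are invented, the function name is hypothetical, and the sketch is a raw difference in means; it is a simplification and should not be read as the estimation method used in the underlying papers.

```python
from statistics import mean, stdev


def effect_size(treatment_scores, control_scores):
    """Difference in mean scores after normalizing by the control distribution."""
    mu = mean(control_scores)
    sd = stdev(control_scores)  # sample standard deviation of the control group
    z_treatment = [(x - mu) / sd for x in treatment_scores]
    z_control = [(x - mu) / sd for x in control_scores]
    return mean(z_treatment) - mean(z_control)


# Invented raw test scores, for illustration only.
control = [40.0, 50.0, 60.0]
treatment = [45.0, 55.0, 65.0]

es = effect_size(treatment, control)
print(f"effect size: {es:.2f} control-group standard deviations")
```

Normalizing against the control group puts results from different tests, subjects and years on a common scale, which is what allows the interventions to be compared side by side.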
19. 19
Schools spend most of the grant on non-durables; similar pattern in both years
[Chart: Average school annual grant allocation pattern, in INR (axis 0 to 12,000), Year 1 vs Year 2, by category: Textbooks; Practice books; Classroom materials; Child stationery; Child durable materials; Sports goods + others]
• Nearly half the grant allocation was spent on child stationery (notebooks, slates, chalks)
• Close to another 40% was spent on classroom materials (such as charts, maps and toys) and practice books (such as workbooks, exercise books, etc)
• Small amounts were allocated to durable materials and sports goods
20. 20
Impact of the program is lower after 2 years than after 1 year
[Chart: Change in HH spending in response to school spending, in INR: Y1 (unanticipated) -40; Y2 (anticipated) -138]
Household spending fell significantly when the grant was anticipated
[Chart: Student test scores (normalized), effect size: Y1 0.088; Y2 0.049]
Student learning improved in the first year, but not the second
22. 22
Contract teachers are significantly different to regular teachers

Characteristic                                Regular Teachers (RTs)   Contract Teachers (CTs)
Proportion male                               63.1%                    31.8%
Average age                                   40.35                    25.81
College degree or higher                      84.3%                    45.5%
Formal teacher training degree or certificate 98.3%                    9.1%
Received any training in last twelve months   93.5%                    54.5%
From the same village                         7.2%                     81.8%
Distance to school (km)                       11.9                     1.1
Average salary (Rs./month)                    8,698                    1,250
[Table column "Significantly different?": per-row marks not preserved]

CTs are hired by school committees and typically tend to be young females, with no formal teacher training qualification and from the same village as the school in which they teach. CTs are paid significantly less than RTs.
23. 23
There have been several concerns with respect to contract teachers
1. CTs are exploited as a result of being paid significantly less than RTs
2. Using untrained and less qualified CTs will not improve learning
3. Decentralizing hiring will lead to local elite capture of the teacher post
Two main questions:
1) What is the impact of an extra CT hired in a “business as usual” way?
2) How would reducing PTR with a CT compare with doing so with an RT?
24. 24
Not only did extra CTs enhance student learning, they were found to be no less effective than RTs
[Chart (LHS): Students in extra CT schools significantly outperform students in comparison schools, effect size: One year 0.09; Two years 0.141]
[Chart (RHS): Improving student learning from adding an extra teacher to school, effect size: Extra CT 0.32; Extra RT 0.22]
LHS: effect sizes are statistically significant. RHS: difference is not statistically significant.
27. 27
Performance Pay: Background and Research Questions
Motivation
• Lack of differentiation by performance is a major demotivator for teachers
− Teachers with highest job satisfaction were most absent
• Program was designed to recognise and reward good performance
Key questions addressed
1. Can teacher performance-pay improve test scores?
2. What, if any, are the negative consequences?
3. How do group and individual incentives compare?
4. How does teacher behaviour change in response to the bonuses?
5. Do different types of teachers respond differentially to the bonuses?
6. What is teacher opinion on performance pay?
28. 28
Potential concerns with such a program are addressed pro-actively in the study design
Potential concern: Teaching to the test
How addressed:
• Test design is such that you cannot do well without deeper knowledge / understanding
• Less of a concern given extremely low levels of learning
• Research shows that the process of taking a test can enhance learning
Potential concern: Threshold effects / Neglecting weak kids
How addressed:
• Minimized by making bonus a function of average improvement of all students, so teachers are not incentivized to focus only on students near some target
• Drop outs assigned low scores
Potential concern: Cheating / paper leaks
How addressed:
• Testing done by independent teams from Azim Premji Foundation, with no connection to the school
Potential concern: Reduction of intrinsic motivation
How addressed:
• Recognize that framing matters
• Program framed in terms of recognition and reward for outstanding teaching as opposed to accountability
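The logic behind tying the bonus to average improvement, and scoring drop outs low, can be illustrated with a small sketch. The actual bonus formula is not given on the slides, so nothing below should be read as the program's rule: the function names, the pass threshold of 40, the dropout score of 0 and all student scores are hypothetical. The sketch only compares how two performance metrics respond to different teaching strategies.

```python
def average_gain(baseline, endline, dropout_score=0.0):
    """Mean endline-minus-baseline gain over ALL students tested at baseline.

    An endline value of None marks a dropout, who is scored at dropout_score
    (a hypothetical low score) instead of being excluded.
    """
    gains = [
        (dropout_score if e is None else e) - b
        for b, e in zip(baseline, endline)
    ]
    return sum(gains) / len(gains)


def share_above_threshold(endline, threshold):
    """Share of students who sat the endline test and cleared a pass mark."""
    tested = [e for e in endline if e is not None]
    return sum(e >= threshold for e in tested) / len(tested)


# Hypothetical class of four students with a pass mark of 40.
baseline = [20.0, 38.0, 39.0, 70.0]
near_target_only = [20.0, 41.0, 42.0, 70.0]  # effort focused near the pass mark
broad_gains = [30.0, 41.0, 42.0, 80.0]       # every student improves
with_dropout = [None, 41.0, 42.0, 70.0]      # the weakest student drops out

for name, endline in [("near target only", near_target_only),
                      ("broad gains", broad_gains),
                      ("with dropout", with_dropout)]:
    print(name,
          share_above_threshold(endline, 40.0),
          average_gain(baseline, endline))
```

In this toy example the threshold metric cannot tell the first two classes apart and looks best when the weakest student drops out, while the average-gain metric rewards broad improvement and penalizes the dropout.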
29. 29
Incentive schools perform better across the board
[Chart: Outcomes for bonus schools relative to control schools, effect size: Y1 on Y0 0.153; Y2 on Y0 0.217]
• Students in bonus schools do better for all major subgroups, including: all five grades (1-5); both subjects; all five project districts; and levels of question difficulty
• No significant difference by most student demographic variables, including household literacy, caste, gender, and baseline score
• Lack of differential treatment effects is an indicator of broad-based gains
Overall, almost every child in an incentive school performed significantly better than comparable children in control schools
30. 30
Incentives have broad-based impact
True learning: Bonus students perform better on conceptual, not just mechanical questions

Question type   Y1     Y2
Mechanical      0.14   0.17
Conceptual      0.14   0.18
Effect sizes, normalized by mechanical / conceptual distribution in control schools. All figures statistically significant.

Spillovers: And they also perform better on non-incentive subjects

Subject          Y1     Y2
Science          0.11   0.11
Social studies   0.14   0.18
Effect sizes, normalized endline scores, grades 3-5 only. All figures statistically significant.
31. 31
Individual incentives versus group incentives
In theory…
• The theory on group- versus individual-level incentives is ambiguous
− On the one hand, group incentives may induce less effort due to free-riding
− On the other, if there are gains to cooperation, then it is possible that group incentives might yield better results
Our findings…
• Both group and individual incentive programs had significantly positive impacts on test scores in both years
• In the first year, they were equally effective, but in the second year, the individual incentives did significantly better
• Both were equally cost-effective

Effect size   Y1     Y2
Individual    0.16   0.27
Group         0.15   0.16
32. 32
Teacher absence did not change, but effort intensity went up
Incentive teachers did no better under observation…
[Chart: Absence and Actively teaching, Incentive vs Control, axis 0% to 45%]
… But report undertaking various forms of special preparation

Activity                    Incentive   Control
Extra homework              42%         20%
Extra classwork             47%         23%
Extra classes               16%         5%
Practice tests              30%         14%
Focus on weaker children    20%         7%
33. 33
Teacher opinion on performance pay is overwhelmingly positive
Strong teacher support for performance pay
• Increased motivation as a result of PP: 75%
• Favorable opinion of PP: 85%
• Government should consider implementing PP: 67%
• It is easy to support a program when it only offers rewards and no penalties
• However, teachers also support performance pay under an overall wage-neutral expectation
• Significant positive correlation between teacher performance and the extent of performance pay desired beforehand
− Suggests that effective teachers know who they are and there are likely to be sorting benefits from performance pay
35. 35
Overall, bonuses conditional on performance had a larger impact than unconditional provision of inputs…
• Pure incentives (individual and group bonuses) are most effective
• The mixed input-incentive program (contract teachers) is next most effective
• Pure inputs (block grants and diagnostic feedback) are least effective

Combined impact (Maths and Telugu), effect size
Intervention          Y1 on Y0   Y2 on Y0
Individual bonuses    0.16       0.27
Group bonuses         0.15       0.16
Contract teacher      0.09       0.14
Block grant           0.09       0.05
Diagnostic feedback   0.00       not shown
37. 37
There are four key policy messages from our study
1. The education system has to focus on learning outcomes
- You get what you measure, and if you want learning you have to measure it
2. Provide high-quality remedial instruction in early schooling years
- Students start school at different levels and unless you set different bars or extend number of school years, need remedial education
3. Focus on teacher performance measurement and management
- Teachers are the highest potential lever at the policymaker’s disposal
- System has to have a meaningful career ladder based on performance
4. Use contract teachers to focus on remedial education
- Plenty of evidence to support the effectiveness of such programs
- Provide credit for performance/service as a CT during RT selection
38. 38
Bibliography
• Abhijit Banerjee et al: “Remedying Education: Evidence from Two Randomized Experiments in India”
• Michael Kremer, Karthik Muralidharan, Nazmul Chaudhury, Jeffrey Hammer, F. Halsey Rogers: “Teacher Absence in India: A Snapshot”
• Karthik Muralidharan, Michael Kremer: “Private Schools in Rural India: Some Facts”
• Eric Hanushek and Ludger Woessmann: “The Role of Education Quality for Economic Growth”
• Jishnu Das and Tristan Zajonc: “India Shining and Bharat Drowning”
• Jishnu Das, Stefan Dercon, James Habyarimana, Pramila Krishnan, Karthik Muralidharan and Venkatesh Sundararaman: “School Inputs, Household Substitution, and Test Scores”
• Karthik Muralidharan and Venkatesh Sundararaman: “The Impact of Diagnostic Feedback to Teachers on Student Learning: Experimental Evidence from India”
• Karthik Muralidharan and Venkatesh Sundararaman: “Contract Teachers: Experimental Evidence from India”
• Karthik Muralidharan and Venkatesh Sundararaman: “Teacher Performance Pay: Experimental Evidence from India”
• Karthik Muralidharan and Venkatesh Sundararaman: “Teacher Opinions on Performance Pay: Evidence from India”
Editor's Notes
For DF need to explain that control was also subject to monitoring, but much less
For PP need to touch on diff between group and individual and why we tested both
For all need to touch on motivation in Indian context
Not only are pure inputs least effective, they are mostly insignificant, except for block grants when unanticipated.