2. Topics/Purposes of Presentation
1. Give overview and policy history
2. Explain what went wrong and why it
went wrong
3. Present results of re-analyses that
mitigate issues and correct impact
estimates
4. Discuss next steps and invitation for
more analyses
3. Clarification of What Presentation is
Not
Not a critique of random assignment: recognize the power of the
method and hope this critique will improve its
application
Not a general critique of Mathematica Policy Research's
work: believe the conclusions and reports of "no impact"
estimates in their Upward Bound (UB) reports are
seriously flawed; very critical of Mathematica's refusal to
acknowledge more robust positive impact estimates and
their misleading masking of key issues with the study in
reports, but respect the hard work and determination of
completing this study
Not an act of advocacy for the program: am acting as a
researcher concerned with meeting research standards
4. Personal Involvement
Disclosure
Employed as Contractor for over 25 years:
Westat for 16 years; served as Project Director (PD) for the National
Evaluation of Student Support Services (SSS).
Mathematica for 6 years; served as PD for the National Evaluation of
Talent Search. While employed at Mathematica also served as Survey
Director for UB Third and start of Fourth follow-up data collection
RTI for 3 years; served as NSOPF PD
UB study began in 1992--Controversial Study over entire history—random
assignment combined with probability national sample—very rare.
Mathematica published 4 reports (two most recent 2004 & 2009)
I joined US Department of Education (ED), Policy and Planning Studies
Services (PPSS) in late 2004 ---Team Leader for Secondary Postsecondary
Cross-Cutting (SPCC) Team---UB study was under my team.
Developed concerns—Involved in long painful internal debate-- 2006-2011;
Retired from ED in 2011
Currently Co-Principal Investigator for ED i3-grant—Using Data to Inform
College Access Programming at Pell Institute for Study of Higher Education at
Council for Opportunity in Education (COE)
5. Basic Problem
As final ED COR/Technical Monitor, found that impact
estimates published in 2004 and again in 2009 were
seriously flawed, such that the conclusions of "no
detectable impact" for the UB program were
erroneous
Re-analyses correcting for these errors using
standard statistical procedures found strong positive
results for the UB program on major outcomes
Report is not transparent in revealing these issues or
the findings of positive results when these issues are
addressed
6. Upward Bound (UB)
Program Overview
UB begun in 1965 as part of civil rights movement and
Great Society; 1991: Upward Bound Math Science
(UBMS) initiative begun
Goal: increase college access and preparation for eligible
high school students (low-income (150 percent of poverty)
and first-generation college (no parent has BA degree))
Academic focus—6-to 8 week program on college campus
in summer and academic year follow-up sessions
Most intensive of TRIO programs--$4900 per year per
student served; Average program serves 50 students per
year
Grants made to postsecondary institutions to run
programs—often students enroll in institutions---
currently over 1000 programs across nation
7. Percentage of high school students who had at least one parent
with a four-year college degree by race/ethnicity: 1972, 1980, 1990
and 2002: NCES High School Longitudinal Studies
[Line chart. Vertical axis: percent (0 to 60); horizontal axis: year (1970 to 2005). Series: White; Hispanic or Latino; Black or African American; Asian; American Indian or Alaska Native; All.]
Note large increase since program began in percent of parents having BA degree
The Pell Institute
8. UB Evaluation: Study
History
Second national evaluation and
first random assignment study of
UB: Begun in 1992 –last follow-up
in 2003-04
Under 3 contracts Mathematica has
authored 4 reports published by ED
1996, 1999, 2004, 2009; Fourth
follow up report unpublished
9. UB Study Basic Design
Unique combination
Multi-stage complex nationally representative probability sampling
procedures –inverse probability of selection weighted to national estimates
Experimental random assignment design
Multi-stage sample design
67 projects from 46 strata designed to represent different types of projects
(4-year and 2-year, public and private, small, medium, large, rural, non-rural, race/ethnicity
of participants)
339 end stage strata for 1500 treatment and 1380 control applicants
Projects required to recruit at least twice number of openings so can do
random assignment
Study sought to change as little as possible about the program except
recruitment
Accommodations: allowed "must serves", which were removed from analyses
Did not control actual offering of treatment or participation of those
assigned
Multi-grade—multi-year cohort—grades 7 to 10 at baseline
10. Flawed reports authored by Mathematica Policy
Research have driven ED Policy with regard to UB
program for more than a Decade
Third Follow up--- reported no average overall effects; but large
effects for students at-risk academically and with lower
educational expectations defined as expecting less than a BA at
baseline
The Program Assessment Rating Tool (PART) was developed to
assess and improve program performance so that the Federal
government can achieve better results. UB given OMB PART
rating of "ineffective"
Based on study findings --ED began new UB Initiative to serve
more academically at risk students
Budget ---Bush budget zero funding of all federal pre-college
programs (UB, UBMS, Talent Search and Gear Up) in FY05 and
FY06—Justified by UB study results--dropped in FY07 and FY08
11. Policy History (cont)
UB 2006 Absolute Priority to serve 1/3 at-risk and 9th
grade students;
New random assignment study to evaluate it begun in 2006
Congress blocked in 2007 and cancelled by ED in 2008
HEOA 2008
Mandates rigorous evaluations
Prohibits over-recruitment to the program solely for the
purposes of random assignment evaluation; does not
prohibit all random assignment studies, only those involving
deliberate denial of services
Absolute Priority cancelled
12. Impact Estimates Reported by
Mathematica and on ED Website
have:
Inadequately controlled for bias in favor of control group
Serious representational issues for largest 4-year public
stratum
Severe unequal weighting with one project given 26 percent
of weight
Lack of standardization of outcome measures to expected
high school graduation year for sample that spanned 5
years of expected high school graduation year
Inappropriate use of National Student Clearinghouse (NSC)
data when coverage was too low to meet standards or non-
existent and there is evidence of bias
13. Other Researchers Have
Confirmed Issues
Initial concern came in 2005 from Mathematica itself, when a then-new staff person (no
longer employed there) who was lead analyst for the Fourth Follow-up sent ED tables
showing that results were sensitive to a single project. This revealed for the first time that one
project had 26 percent of the weight and seemingly large negative impacts: overall
impacts were positive when it was excluded and not significant when it was included
PPSS consultation with RTI statistical experts: James Chromy, Fellow of the
American Statistical Association, was sent the file in 2007; he advised on how to handle
project 69 (treat as ineligible) and replicated statistical tabulations using
SUDAAN. He asked for the sample frame; Mathematica delayed in sending it
David Goodwin, Division Director who was the original COR for the UB study and who
originally defended the impact estimates, eventually came to see the problems and
believe that analyses without project 69 were more credible
IES external reviews confirmed the basic issues; stated that results with project 69 were not
robust
When presented with this information, academic discussants and audiences are incredulous and do
not understand why ED would continue to publish these impacts
14. Guidance from three intersecting
traditions
Experimental design work examining the threats to
validity (for example, Shadish, Cook, and Campbell; Heckman)
Survey methods research on sampling and non-sampling
error (for example, Groves et al., 2004)
Statistical and program evaluation standards (for
example, the Program Evaluation Standards, NCES Standards,
AERA Standards ).
15. What is Sampling and Non-
Sampling Error?
Sampling error is the error caused by observing a
sample instead of the whole population. Sample to
sample variation estimated by observing variation
among the sample members or sub-dividing the
sample
Non-sampling error is a catch all term for deviations
from true value of estimates or study error that is not
caused by sampling (examples non-response bias,
lack of understanding of questions, lack of recall)—
harder to measure statistically
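The two ways of estimating sampling error named above (variation among sample members, and sub-dividing the sample) can be sketched as follows. This is a minimal illustration on synthetic data that assumes simple random sampling; it does not reproduce the complex multi-stage design of the UB study, and the 60 percent outcome rate and sample size are made-up values.

```python
import math
import random
import statistics

random.seed(0)
# Hypothetical 0/1 outcome with a true rate of about 60 percent (illustrative only).
sample = [1 if random.random() < 0.6 else 0 for _ in range(2000)]

n = len(sample)
p_hat = statistics.mean(sample)

# (1) Variation among sample members: textbook standard error of a
#     proportion under simple random sampling.
se_formula = math.sqrt(p_hat * (1 - p_hat) / n)

# (2) Sub-dividing the sample: split into k interleaved groups and use the
#     spread of the group means (the "random groups" approach).
k = 10
group_means = [statistics.mean(sample[i::k]) for i in range(k)]
se_groups = statistics.stdev(group_means) / math.sqrt(k)

print(f"estimate = {p_hat:.3f}")
print(f"SE from sample members = {se_formula:.4f}")
print(f"SE from {k} random groups = {se_groups:.4f}")
```

Neither calculation says anything about non-sampling error such as non-response bias, which is why that component is harder to measure statistically.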
16. Basic Assumptions of Random
Assignment Studies
1. Sample representative of population to which wish
to generalize
2. Treatment and control group are equivalent
3. Treatment and control group treated equally except
for the treatment
4. Treatment and control group are mutually
exclusive with regard to the treatment
17. Request for Correction
Covers
Major Focus on the Technical Standards Violations in report
Also covers
Transparency issues in the report (does not provide
information needed to judge and also masks some of the
issues)
Review process issues—In politically directed process the
report was published over the objections of unit responsible for
the study (the PPSS Team Leader and Technical Reviewers)
and over the Office of Postsecondary Education (OPE) formal
disapproval in last week of Bush Administration
Note: It was published with the reported acquiescence of IES even
though an IES external reviewer had specifically stated that the
"impact estimates were not robust"
18. REPORTS HAVE 6 MAJOR
STANDARDS VIOLATIONS
1. Seriously flawed sample design: one project of 67 carrying 26 percent of weight; only a
single project selected from the largest study-defined stratum (some cases weighted up to 200
times the weights of other students)
2. Serious representational issues for project with 26 percent of weight: was atypical for its
4-year stratum in that it had mostly 2-year and less than 2-year certificate programs
3. Treatment and control groups were seriously non-equivalent, with bias in favor of the
control group
4. Outcome variables were not standardized to expected high school graduation year
(EHSGY) for sample that spanned 5 years of graduation dates
5. Improper use of National Student Clearinghouse data for non-responders to surveys when
coverage was too low or non-existent and there was evidence of bias
6. Lack of transparency in acknowledging issues and masking of some issues; biased
reporting of findings; lack of acknowledgement of alternative credible positive findings
for Upward Bound
19. 1. Sample Design Issues
Sample highly stratified: 46 strata for 67 projects
Unequal weighting: one project carries 26 percent, 3
projects 35 percent, and 8 projects 50 percent of weight
Project-level stratification: 339 end-stage strata, unequal
within projects
Basic design flaw: only one project selected for the largest stratum
Treatment-control non-equivalency introduced by
outlier 26 percent project
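One way to quantify how much unequal weighting reduces the information in a sample is Kish's approximation for the effective number of units, (sum of weights)^2 / (sum of squared weights). The sketch below applies it at the project level. The 26.38 percent share for one project comes from the slides; spreading the remaining weight equally over the other 66 projects is an assumption made only for illustration (the actual weights were more unequal, so the true effective count would be lower still).

```python
def weight_shares(weights):
    """Return each unit's share of the total weight."""
    total = sum(weights)
    return [w / total for w in weights]


def effective_count(weights):
    """Kish's approximation: (sum w)^2 / sum(w^2)."""
    return sum(weights) ** 2 / sum(w * w for w in weights)


n_projects = 67
outlier_share = 0.2638  # share of weight carried by the one project (from the slides)

# ASSUMPTION: remaining weight split equally across the other 66 projects.
other_share = (1 - outlier_share) / (n_projects - 1)
project_weights = [outlier_share] + [other_share] * (n_projects - 1)

shares = weight_shares(project_weights)
n_eff = effective_count(project_weights)

print(f"largest share of weight: {max(shares):.1%}")
print(f"effective number of projects: {n_eff:.1f} of {n_projects}")
```

Under this favorable assumption the 67 projects carry roughly the information of about 13 equally weighted projects.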
20. Project that should have been declared
ineligible to represent its 4-year stratum carried
26 percent of the weight
Extreme unequal weighting and serious representation issues
Figure 5. Percent of sum of the weights by project of the 67 projects making up the
Upward Bound national evaluation sample: study conducted 1992-93 to 2003-04
[Bar chart. Vertical axis: percent of weight (0 to 30); horizontal axis: project. Largest bar labeled 26.38.]
One project of 67 in sample carried 26 percent of weight (known as 69) and was sole
representative of the largest 4-year public stratum, but was a former 2-year school with
largely less than 2-year programs
Project partnered with job training program
Inadequate representation of 4-year stratum
NOTE: Of the 67 projects making up the UB sample just over half (54 percent) have less than 1 percent of the weights each and one
project (69) accounts for 26.4 percent of the weights.
SOURCE: Data tabulated (December 2007) by Policy and Program Studies Service (PPSS) of Office of Planning, Evaluation and Policy
Development (OPEPD) US Department of Education (ED) using national evaluation of Upward Bound data files: study conducted
1992-93 to 2003-04.
21. 2. Treatment–Control Non-
Equivalency
Sample well matched without project 69
Project 69 introduces bias into the overall sample
in favor of the controls
Project 69 has large differences (examples)
Education expectations: 56 percent of controls expect an
advanced degree, compared with 15 percent of the treatment group
9th grade academics: 8 percent of controls are at risk; 33
percent of the treatment group are at risk
Expected HS graduation in 1997 (younger group): 60
percent of treatment and 42 percent of controls
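The size of these project 69 imbalances can be put on a common scale with a standardized difference in proportions, a widely used balance diagnostic. The sketch below uses the three percentage pairs quoted above. The 0.1 flagging threshold is a common rule of thumb and an assumption here, not a standard cited in the presentation.

```python
import math


def standardized_difference(p_treatment, p_control):
    """Standardized difference between two proportions (each between 0 and 1)."""
    pooled_var = (p_treatment * (1 - p_treatment) + p_control * (1 - p_control)) / 2
    return (p_treatment - p_control) / math.sqrt(pooled_var)


# Percentages for project 69 as quoted on the slide: (treatment, control).
project_69 = {
    "expects advanced degree": (0.15, 0.56),
    "academically at risk in 9th grade": (0.33, 0.08),
    "expected HS graduation in 1997": (0.60, 0.42),
}

FLAG_THRESHOLD = 0.1  # ASSUMPTION: conventional rule of thumb for imbalance

results = {}
for label, (p_t, p_c) in project_69.items():
    d = standardized_difference(p_t, p_c)
    results[label] = d
    flag = "imbalanced" if abs(d) > FLAG_THRESHOLD else "balanced"
    print(f"{label}: {d:+.2f} ({flag})")
```

A negative value means the characteristic is more common among controls; all three pairs fall well outside the rule-of-thumb range.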
22. Project 69 had seriously non-equivalent treatment and
control group
[Bar chart. Vertical axis: percent (0 to 100). Series: No69Treatment; No69Control; 69Treatment; 69Control. Categories: Male; Expect MA or higher; Base grade 8 or below; Algebra in 9th; High academic risk; GPA below 2.5; White.]
23. Bias in 69 and balance in rest of sample
taken together
[Two stacked bar charts: percent treatment and control within each characteristic]
Project 69
High academic risk: Treatment 80, Control 20
In 9th (younger) grade in 1993-94: Treatment 77, Control 23
Expect advanced degree: Treatment 21, Control 79
66 other projects in sample
High academic risk: Treatment 51, Control 49
In 9th (younger) grade in 1993-94: Treatment 51, Control 49
Expect advanced degree: Treatment 49, Control 51
24. [Stacked bar chart: percent treatment and control within each characteristic]
High academic risk: Treatment 58, Control 42
In 9th (younger) grade in 1993-94: Treatment 56, Control 44
Expect advanced degree: Treatment 42, Control 58
25. 3. Lack of Outcome Standardization to Expected
High School Graduation Year (EHSGY)
Multi-grade study cohort spanned 5 years of expected
high school graduation
At the time of the last (5th) follow-up 10 percent had 6
years, 30 percent had 7 years; 34 percent had 8 years; 19
percent had 9 years; and 5 percent had 10 years since
high school graduation
Imbalance between treatment and control: control group
has larger percentage of older 10th grade students at
time of randomization
Mathematica never standardized outcome measures
based on EHSGY; ED staff derived these variables for
re-analysis
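Standardizing to EHSGY means measuring each outcome over the same window relative to each student's own expected graduation year, rather than at a single calendar date for a cohort that spans several grades. The sketch below shows the idea. The function names, the grade-to-year rule, and the two example students are illustrative assumptions, not the variables ED staff actually derived.

```python
def expected_hs_grad_year(school_year_end, grade):
    """EHSGY for a student in `grade` during the school year ending in `school_year_end`.

    ASSUMPTION: normal progression, one grade per year, graduation after grade 12.
    """
    return school_year_end + (12 - grade)


def entered_within(ehsgy, first_enrollment_year, window_years):
    """True if postsecondary entrance occurred within `window_years` after EHSGY."""
    if first_enrollment_year is None:
        return False
    return first_enrollment_year <= ehsgy + window_years


# Two hypothetical students who both first enroll in 1998.
younger = expected_hs_grad_year(1994, 9)   # 9th grade in 1993-94
older = expected_hs_grad_year(1994, 10)    # 10th grade in 1993-94

print(younger, entered_within(younger, 1998, 1))  # 1997 True
print(older, entered_within(older, 1998, 1))      # 1996 False
```

A fixed calendar cutoff would treat these two students identically even though one had a year longer since expected graduation, which matters when one group contains more of the older students.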
26. 4. Survey Attrition and Non-
Response and Non-Coverage Bias
Concern in longitudinal studies
UB response rates very high for early follow-ups but at 74 percent by
the end; control group response rate 4-5 percent lower on
Third and Fourth follow-ups
Sample members with positive outcomes more likely to respond
Use federal aid files to observe and impute outcomes
Improper use of National Student Clearinghouse data for
non-respondents when enrollment coverage was too low and
biased due to clustering, and when coverage of 2-year and less than
2-year institutions was non-existent in the most applicable period
27. Figure 4. Percent of total UB study participants found on the federal financial aid files as applicants
and as Pell recipients, classified by fourth follow-up survey response status: study
conducted 1992-93 to 2003-04
[Horizontal bar chart. Horizontal axis: percent (0 to 90). Series: Responder; Non-responder. Bar values: Applied for aid, 62 and 79; Pell recipient, 47 and 63.]
NOTE: Unweighted data based on 2845 Upward Bound sample members from both treatment and control groups
SOURCE: Data tabulated (October 2006) by Policy and Program Studies Service (PPSS) of Office of Planning, Evaluation and Policy
Development (OPEPD) US Department of Education (ED) using national evaluation of Upward Bound data files and Federal Applicant
and Award Files 1994-95 to 2003-04
28. 5. Service Participation and
non-Participation Issues
Waiting list drop-outs: 26 percent of
treatment group coded as waiting list file drop-outs;
kept in treatment sample
First Follow-up survey: 18 percent of treatment group
participated in neither UB nor UBMS
Survey data: 12-14 percent of controls show evidence of
UB or UBMS participation
60 percent of controls and 92 percent of treatment
group reported some pre-college supplemental
service participation
29. 6. Masking of Issues in
Final Report
Failure to report on project 69's representational issues
Failure to acknowledge large impacts without project 69
and stating that exclusion of project 69 does not make a
difference in conclusions
Failure to acknowledge NSC coverage and bias issues
Failure to acknowledge standardization of outcomes
results and misleading statements concerning results
Failure to acknowledge the extent of academic risk bias in
favor of the control group in estimates
30. Alternative Re-Analyses
Experimental Analyses
Intent to treat (ITT)—UB opportunity--original
random assignment groups—Logistic regression
Treatment on Treated (TOT) -UB/UBMS
participation—Instrumental Variables Regression
Quasi-experimental--Observational
UB/UBMS compared to non-UB/non-UBMS service
Any service compared to no service
Selected subgroup (academic risk-and
educational expectations)
31. Instrumental Variables Regression used in
TOT/CACE and Observational analyses
Two stage regression—mitigate
selection bias
First stage models factors
related to participation
Second stage --uses results as
additional control in the model
estimating outcomes
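The two-stage idea can be sketched with textbook two-stage least squares (2SLS), using random assignment as the instrument for actual participation. This is the standard form, in which the first-stage prediction replaces observed participation in the second stage; it may differ in detail from the models used in the re-analysis. All data below are synthetic, and the variable names and the true effect of 0.10 are assumptions made for illustration.

```python
import random


def ols(x, y):
    """Simple one-regressor least squares; returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def two_stage_least_squares(z, d, y):
    """Stage 1: predict participation d from assignment z.
    Stage 2: regress outcome y on predicted participation."""
    a1, b1 = ols(z, d)
    d_hat = [a1 + b1 * zi for zi in z]
    _, effect = ols(d_hat, y)
    return effect


random.seed(1)
TRUE_EFFECT = 0.10  # synthetic effect of participation on the outcome
z, d, y = [], [], []
for _ in range(20000):
    assigned = random.randint(0, 1)          # random assignment
    motivation = random.gauss(0, 1)          # unobserved; drives selection AND outcome
    threshold = -1.0 if assigned else 1.5    # assignment makes participation easier
    participated = 1 if motivation > threshold else 0
    outcome = 0.5 + TRUE_EFFECT * participated + 0.2 * motivation + random.gauss(0, 0.1)
    z.append(assigned)
    d.append(participated)
    y.append(outcome)

_, naive = ols(d, y)                 # participants vs non-participants: selection-biased
tot = two_stage_least_squares(z, d, y)
print(f"naive comparison: {naive:.3f}")
print(f"2SLS estimate:    {tot:.3f} (true effect {TRUE_EFFECT})")
```

With a single binary instrument and no covariates, this estimate equals the ITT difference in outcomes divided by the difference in participation rates between the assigned groups.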
32. What is the same as
Mathematica's Analyses?
Use same statistical methods (logistic and
instrumental variables regression)
Statistical programs that take into account the
complex multi-stage sample design in estimating
standard errors--STATA
Same ITT opportunity grouping: TOT participation
grouping recognizes UBMS as form of UB
Similar model baseline controls: both omit 9th grade
academic risk indicators; include additional control
for grade at baseline
Same weights--Mathematica
33. What is Different from
Mathematica's analyses
Standardize outcomes by expected high school
graduation year
Avoid using early NSC data when coverage too low; use
only for BA degree as supplement for non-responders to
surveys
Use all applicable follow-up surveys (3 to 5) not just one
round at a time; used federal aid files
Present data with and without project 69 and weighted
and unweighted;
View impact estimates without project 69 as reasonably
robust for 74 percent of applicants; view estimates with
project 69 as non-robust and use should be avoided
especially for estimates of BA impact
34. Re-analyses Findings for
Enrollment and Financial aid
Standardizing for Expected High
School Graduation Year (and not
using NSC data for enrollment)
found significant and substantial
positive ITT and TOT findings
weighted and unweighted and
with and without project 69
35. Overall Results
Significant and substantial positive ITT and
TOT findings weighted and unweighted and
with and without project 69 for:
Evidence of postsecondary entrance in +18 months
and for +4 years
Application for financial aid in +18 months and for
+4 years
Evidence of award of any postsecondary degree or
credential by fourth follow up (4 to 6 years after
EHSGY)
36. Figure 1. Estimated rates of postsecondary entrance within +1 (about 18 months) of expected high
school graduation year (EHSGY) for Upward Bound Opportunity (ITT) and Upward
Bound/Upward Bound Math Science Participation (TOT/CACE): study conducted
1992-93 to 2003-04
[Horizontal bar chart; horizontal axis: percent (40 to 80)]
ITT evidence of postsecondary within +1 of EHSGY (includes outlier): Control 66, Treatment 72.9, Difference 6.9****
TOT/CACE evidence of postsecondary within +1 of EHSGY (includes outlier): Control 62.5, Treatment 73.5, Difference 10.9****
ITT evidence of postsecondary within +1 of EHSGY (excludes outlier): Control 64.3, Treatment 73.3, Difference 9.1***
TOT/CACE evidence of postsecondary within +1 of EHSGY (excludes outlier): Control 60.4, Treatment 74.6, Difference 14.2****
*/**/***/**** Significant at 0.10/0.05/.01/00 level; UB = regular Upward Bound; UBMS = Upward Bound Math Science; ITT = Intent
to Treat; TOT = Treatment on Treated; CACE = Complier Average Causal Effect.
NOTE: Estimated rates from STATA logistic and instrumental variables regression taking into account the complex sample design.
Weighted estimates use poststratified weights. See table 4 in body of the report for detailed note.
SOURCE: Data tabulated (January 2008) by Policy and Program Studies Service (PPSS) of Office of Planning, Evaluation and Policy
Development (OPEPD) US Department of Education (ED) using national evaluation of Upward Bound data files: study conducted
1992-93 to 2003-04; and Federal Aid Application and Pell Award Files 1994-95 to 2003-04.
37.
38. Figure 2. Estimated rates of application for federal financial aid within +4 of expected high school
graduation year (EHSGY) for Upward Bound Opportunity (ITT) and Upward
Bound/Upward Bound Math Science Participation (TOT/CACE): study conducted
1992-93 to 2003-04
[Horizontal bar chart; horizontal axis: percent (40 to 80)]
ITT applied for federal financial aid within +4 of EHSGY (includes outlier): Control 58.7, Treatment 65.4, Difference 6.7****
TOT/CACE applied for federal financial aid within +4 of EHSGY (includes outlier): Control 56.1, Treatment 66.7, Difference 10.6****
ITT applied for federal financial aid within +4 of EHSGY (excludes outlier): Control 60.4, Treatment 67.7, Difference 7.3***
TOT/CACE applied for federal financial aid within +4 of EHSGY (excludes outlier): Control 57.1, Treatment 69.1, Difference 11.9****
*/**/***/**** Significant at 0.10/0.05/.01/00 level; UB = regular Upward Bound; UBMS = Upward Bound Math Science; ITT = Intent
to Treat; TOT = Treatment on Treated; CACE = Complier Average Causal Effect.
NOTE: Estimated rates from STATA logistic and instrumental variables regression taking into account the complex sample design.
Weighted data use poststratified weights. See table 6 and table 4 in body of the report for detailed notes.
SOURCE: Data tabulated (January 2008) by Policy and Program Studies Service (PPSS) of Office of Planning, Evaluation and Policy
Development (OPEPD) US Department of Education (ED) using national evaluation of Upward Bound data files: study conducted
1992-93 to 2003-04; and Federal Aid Application and Pell Award Files 1994-95 to 2003-04.
39. Re-Analyses--Awarded a BA in
+6 years of EHSGY
Weighted with project 69: not significant. Unweighted: significant.
For the 74 percent of sample not represented by
project 69:
28 percent increase in BA award for
ITT UB opportunity (13.3 increased
to 17.0)
50 percent increase in BA award for
TOT UB participation analyses (14.1
increased to 21.1)
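The percent increases quoted here are relative changes in the BA attainment rate, not percentage-point differences. A short check of the arithmetic, using only the rates given on the slide (the function name is illustrative):

```python
def relative_increase(control_rate, treatment_rate):
    """Percent increase of the treatment rate over the control rate."""
    return (treatment_rate - control_rate) / control_rate * 100


itt = relative_increase(13.3, 17.0)   # ITT: UB opportunity
tot = relative_increase(14.1, 21.1)   # TOT: UB participation

print(f"ITT: {17.0 - 13.3:.1f} points, {itt:.1f} percent increase")
print(f"TOT: {21.1 - 14.1:.1f} points, {tot:.1f} percent increase")
```

Rounded to whole numbers these give the 28 and 50 percent figures, corresponding to differences of 3.7 and 7.0 percentage points.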
40. Impact of Upward Bound (UB) on
Bachelor’s (BA) degree attainment
NOTE: Instrumental variables regression models
for Treatment on the Treated (TOT) estimates
based on 66 of 67 projects in UB sample: National
Evaluation of Upward Bound, study conducted
1992-93 to 2003-04.
EHSGY = Expected High School Graduation Year;
NSC = National Student Clearinghouse; SFA =
Student Financial Aid. All estimates significant at
the .01 level or higher. Estimates based on 66 of 67
projects in sample representing 74 percent of UB at
the time of the study. One project removed due to
introducing bias into estimates and representational
issues. We use a 2-stage instrumental variables
regression procedure to control for selection effects
for the Treatment on the Treated (TOT) impact
estimates.
SOURCE: Data tabulated January 2010 using:
National Evaluation of Upward Bound data files,
study sponsored by the Policy and Program Studies
Services (PPSS), of the Office of Planning,
Evaluation and Policy Development (OPEPD), U.S.
Department of Education; study conducted 1992-93
to 2003-04.
41. UB/UBMS Participation Compared
with Other non-UB/UBMS Services
Participation
Quasi-experimental: uses 2-stage instrumental
variables regression, which controls for selection
bias but does not eliminate it
Found statistically significant and substantive
positive results for UB/UBMS participation for:
Evidence of postsecondary entrance +1 and +4
Application for financial aid +1 and +4
Award of BA in +6 unweighted overall and
unweighted and weighted without project 69
42. Table 5. Evidence of postsecondary entrance within +1 (18 months) and within +4 of expected high
school graduation year (EHSGY) for observational models comparing types of service receipt:
National Evaluation of Upward Bound, study conducted 1992-93 to 2003-2004
Comparison (a): participated in UB/UBMS compared with participated in other non-UB/non-UBMS pre-college support or supplemental services only (observational, instrumental variables regression)
Comparison (b): any pre-college support or supplemental services reported compared with no services reported (observational, instrumental variables regression)
Each comparison is shown for all sampling strata and with one outlier project removed (remainder represents 74 percent of Horizons waiting list)

Evidence of postsecondary entrance within +1 of EHSGY
All sampling strata, (a): xb T = 74.4, xb C = 65.3, Difference = 9.1*** (xb T = 76.2, xb C = 66.8, Difference = 9.3****)
All sampling strata, (b): xb T = 73.5, xb C = 48.6, Difference = 25.0**** (xb T = 75.8, xb C = 51.7, Difference = 24.1****)
Outlier removed, (a): xb T = 75.0, xb C = 61.7, Difference = 13.3**** (xb T = 76.3, xb C = 66.3, Difference = 10.1****)
Outlier removed, (b): xb T = 74.3, xb C = 44.6, Difference = 29.8**** (xb T = 75.9, xb C = 51.1, Difference = 24.7****)

Evidence of postsecondary entrance within +4 of EHSGY
All sampling strata, (a): xb T = 75.6, xb C = 67.5, Difference = 8.2*** (xb T = 78.2, xb C = 68.7, Difference = 9.5****)
All sampling strata, (b): xb T = 74.8, xb C = 51.4, Difference = 23.5*** (xb T = 77.7, xb C = 54.1, Difference = 23.6****)
Outlier removed, (a): xb T = 76.5, xb C = 64.4, Difference = 12.1**** (xb T = 78.4, xb C = 68.2, Difference = 10.2****)
Outlier removed, (b): xb T = 75.9, xb C = 47.8, Difference = 28.1**** (xb T = 77.8, xb C = 53.7, Difference = 24.1****)

*/**/***/**** Significant at 0.10/0.05/.01/00 level
UB = regular Upward Bound; UBMS = Upward Bound Math Science; T = Treatment; C = Control or comparison; xb = linear prediction from STATA
ivreg instrumental variables regression. Odds ratio = prT(1-prC)/prC(1-prT).
NOTE: Unweighted data given in parentheses. Please see table 4 for detailed notes.
SOURCE: Data tabulated (January 2008) by Policy and Planning Studies Services (PPSS) using data from the National Evaluation of Upward Bound
study files, baseline through 4th follow-up, and Federal Aid Application and Pell Award Files 1994-95 to 2003-04.
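The odds ratio formula in the table note, prT(1-prC)/prC(1-prT), can be applied directly to any pair of predicted rates. The sketch below does so for the first weighted comparison in the table (T = 74.4, C = 65.3), treating the xb values as percentages converted to probabilities. That reading is an assumption based on the note's formula; the table itself does not list odds ratios.

```python
def odds_ratio(pr_t, pr_c):
    """Odds ratio = prT(1 - prC) / (prC(1 - prT)), with rates given as probabilities."""
    return (pr_t * (1 - pr_c)) / (pr_c * (1 - pr_t))


# First weighted cell of the table: UB/UBMS vs other services only, all strata, +1 of EHSGY.
pr_t = 74.4 / 100
pr_c = 65.3 / 100

ratio = odds_ratio(pr_t, pr_c)
print(f"difference: {100 * (pr_t - pr_c):.1f} percentage points")
print(f"odds ratio: {ratio:.2f}")
```

An odds ratio above 1 means higher odds of postsecondary entrance for the treatment group than for the comparison group.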
43. Sub-Group Analyses
Bottom 20 percent on academic indicators
Large positive significant effects for:
Postsecondary entrance
Application for financial aid
Award of any postsecondary degree
Not for BA degree: too few achieved it to compare
treatment and control
Top 80 percent on academic indicators
Moderate positive significant effects for:
Postsecondary entrance
Application for financial aid
Award of any postsecondary degree
For BA degree in +6
44.
45. Impact Estimates from Two Stage Instrumental Variables Regression for
Percent Obtaining a BA in +6 years based on UB Random Assignment
Evaluation
[Horizontal bar chart; horizontal axis: percent (0.0% to 25.0%); series: Comparison, Treatment]
UB/UBMS participation: Treatment on the Treated (TOT/CACE) (outlier removed): Comparison 14.1%, Treatment 21.1%, Difference 7.0****, 50% increase
UB/UBMS compared with other non-UB/UBMS service only (outlier removed): Comparison 15.2%, Treatment 21.0%, Difference 5.8***, 39% increase
Any pre-college with academic component compared with no pre-college service reported (outlier removed): Comparison 6.5%, Treatment 20.9%, Difference 14.4***, 223% increase
Note: All estimates significant at the .01 level or higher. Estimates based on 66 of 67 projects in
sample representing 74 percent of UB at the time of the study. One project removed due to
introducing bias into estimates and representational issues.
46. Random Assignment National Evaluation of Upward Bound (UB) Data
on Estimated increase in life-time taxes paid compared to program cost
per participant—taxes are 4.9 to 5.9 times the cost of participation
Sources and Assumptions:
*UB Evaluation Data. Estimates based on estimated
differences in educational attainment between the
treatment and control group from random assignment
study that followed sample for 6 to 10 years after
expected high school graduation. $41,495 figure based on
impact estimates from the final Fifth Follow-up Survey
using outcome variables derived by Mathematica Policy
Research with weights adjusted for survey non-response.
$36,493 estimate based on outcome variables
for longitudinal file standardized by expected high school
graduation date. Treatment on the Treated (TOT) estimates
based on instrumental variables regression modeling for
66 of the 67 projects representing 74 percent of the
sample. One project of 67 in the sample excluded because
it was found to be ineligible to represent its
stratum and also had large imbalances between treatment
and control group that, due to its extreme weight,
introduced bias into previously published overall
estimates.
*Lifetime earnings and taxes data from US Census
Bureau, The Big Payoff: Educational Attainment and
Synthetic Estimates of Work-Life Earnings, July
2002, Current Population Reports, Jennifer Day and Eric
Newburger; College Board, Education Pays: The Benefits of
Higher Education for Individuals and Society, 2007.
**Cost of UB program per participant: US Department of
Education data on average cost of UB for one year: $4900.
Assumes average participant uses about 1.5 times this
level of resources.
47. Support for Timely Review
Correction Request will be needed
Ways to Support Request for Correction
Public statement of the fact that the request is being submitted and the reasons
Statement requesting timely review by ED signed by
stakeholders and evaluators
Holding panels discussing the issues at major
education and evaluation associations (wider issues of
evaluation methods and use and transparency)
Accountability of the evaluator contractors and ed.
issues
48. How could problems have been avoided in
first place? Follow existing standards!
Caution about trying to do too much---Chose a difficult and atypical design
combining probability sampling with experimental design---led to serious
issues—made worse by mistakes made and general lack of awareness of
sampling and non-sampling study errors and role in impact estimation
Sample design flawed from start with serious unequal weighting—follow
established standards for sample design
Representation issues—contractor did not adequately check representation
of stratum and did not fully reveal issues when discovered
Lack of care in analysis in outcome measures that were not standardized to
expected high school graduation which spanned 5 years
Lack of checking treatment and control group balance--equivalency on key
attributes—faith in random assignment to ensure
Failure to respect stakeholder concerns about control group contamination
and other issues and technical monitor legitimate concerns about the
representation and treatment-control group non-balance bias issues ----
repeatedly dismissed as non-objective advocates
49. Serious Problems with Doing
Nothing about Report
1. ED continues to officially misrepresent the impact of UB
2. The UB program reputation continues to be hurt by the
evaluation and stakeholders have officially objected;
could have serious consequences in Congress
3. Missed opportunity to build on the program's successes
and find ways to strengthen and adapt program to
achieve nation's goals of increased postsecondary access
and completion
4. Evaluation research as a whole suffers from not
correcting mistakes made and learning from them
50. How to Correct Report?
It is correctable and can
provide useful information
Not try to represent entire population of interest
with study (remove project 69 and represent 74
percent)—IES reviewer stated that estimates are
robust for other 66 projects taken together
Standardize outcomes to expected high school
graduation year
Use NSC data only for BA degree and not for less
than BA and not for postsecondary entrance
51. Next Steps in Evaluation
Partnership model among stakeholders
Use more innovative evaluation methods
(collaborative, participatory, empowerment, utilization,
systems analysis, culturally responsive
evaluation)
Utilize resources and leverage academic institutional
research offices of grantees
Focus on program improvement rather than up or
down
Open and transparent sharing
Build capacity for self evaluation and accountability
Utilization of standards for statistical research and
program evaluation
52. Invitation to Research & Further
Additional Information
The full text of the COE Request for Correction can be
found at http://www.coenet.us/files/spotlight-COE_Request_for_Correction_of_Mathematica_Report_011812.pdf
Statement of concern by leading researchers in field:
http://www.coenet.us/files/spotlight-Statement_of_Concern_011812.pdf
Results of the re-analysis detailing study error issues can
be found at: http://www.coenet.us/files/files-Do_the_Conclusions_Change_2009.pdf
Information on obtaining the restricted use UB data files
for additional research can be obtained by contacting:
Sandra.Furey@ed.gov
53. Contact Information
Margaret.Cahalan@pellinstitute.org
202-347-7430 ex 212
301-642-4851