Errors Found in National Evaluation of Upward Bound: Positive Re-Analysis Results

Presentation to Council for Opportunity in Education (COE) documents errors in National Evaluation of Upward Bound reports. Eight major errors are identified. Results summarized from re-analysis correcting for sampling and non-sampling errors that found strong positive impacts for the federal TRIO program.

Published in: Education

  1. What Went Wrong with the Random Assignment National Evaluation of Upward Bound? David Goodwin, retired, US Department of Education (20 minutes). Findings from the ED-PPSS Staff Re-analysis and a New Cost-Benefit Analysis of the National Evaluation of Upward Bound Data, Maggie Cahalan, The Pell Institute (25 minutes). Discussion of Lessons Learned for the Next Generation of Evaluation Studies: questions and discussion by attendees (20 minutes).
  2. Before we start, we would like to state what our presentation is not. It is not a critique of random assignment: we recognize the power of the method and hope this critique will improve its application. It is not an act of advocacy for the program: we are acting as researchers concerned with meeting professional research standards. It is not a dismissal of the UB study as a whole: when corrected, it can give useful information.
  3. Who we are and why we are speaking out again. We are former COTRs whose job was to be responsible for the technical monitoring of the study. The reports had large policy influence: they resulted in an OMB PART rating of "ineffective" and in zero-funding requests in the FY2005 and FY2006 President's budgets for all federal college access programs (UB, UBMS, Talent Search, GEAR UP). We made our concerns well known in the Department in 2008; the report was published over the objections of PPSS technical staff, and the final report was ordered published by departing political staff in January 2009. The flawed reports continue to be cited and used to the detriment of the program (Whitehurst Congressional testimony, 2011; Haskins and Rouse, 2013; Decker, 2013).
  4. UB Evaluation: Study History. Second national evaluation and first random assignment study of UB. Begun in 1992; ran for 16 years under 3 contracts. Four Mathematica Policy Research (Mathematica) contractor reports published by ED: 1996, 1999, 2004, 2009. Large influence on policy.
  5. UB Study Basic Design: a unique, overambitious combination. Multi-stage, complex, nationally representative probability sampling procedures. Random assignment design for selection; the study could not control treatment and control group service participation. Multi-stage sample design: 67 projects, with 1500 treatment and 1380 control student "applicants" (baseline survey completers interested in the UB program). Multi-grade, multi-year cohort: grades 7 to 11 at baseline.
  6. Basic Finding of QA Analyses. As US-ED study monitors, during the last of the three contracts with PPSS, we gradually found that the contractor's estimates of no overall impact were seriously flawed. We did a re-analysis correcting for these errors and found strong positive results for the UB program on major outcomes. The contractor reports are not transparent in revealing these issues or the findings of positive results.
  7. 8 Major Errors Found in PPSS QA Review of Contractor Reports. (1) Flawed sample design. (2) Severe lack of sample representation for the 4-year public stratum. (3) Lack of equivalent treatment and control groups, with systematic statistical bias in favor of the control group. (4) Lack of common outcome measures: use of unstandardized outcome measures for a sample that spanned 5 years of expected high school graduation years. (5) Biased and improper imputation of survey non-respondents' outcome measures from data with insufficient coverage at the time: improper use of National Student Clearinghouse data. (6) False attribution: attributing negative impacts in project 69 to below-average performance when the negative impacts were demonstrated to be related to treatment-control group non-equivalency. (7) Failure to address equivalent-services issues and control group contamination issues. (8) Lack of transparency in the reports in acknowledging positive impacts detected when issues were addressed, such as by standardizing outcomes to expected high school graduation year.
  8. Flawed Sample Design: extreme unequal weighting and serious representation issues. The project with 26 percent of the weight (known as 69) was the sole representative of the 4-year public stratum of grantees, but it was a former 2-year school with a historical emphasis on certificates and was an atypical program for its stratum. The project partnered with a job training program. Inadequate representation of 4-year. Figure 1. Percentage of the sum of the weights by project for the 67 projects making up the study sample: National Evaluation of Upward Bound, study conducted 1992-93 to 2003-04. NOTE: Of the 67 projects making up the UB sample, just over half (54 percent) have less than 1 percent of the weights each, and one project (69) accounts for 26.4 percent of the weights. SOURCE: Data tabulated December 2007 using National Evaluation of Upward Bound data files; study sponsored by the Policy and Program Studies Service (PPSS) of the Office of Planning, Evaluation and Policy Development (OPEPD), US Department of Education; study conducted 1992-93 to 2003-04.
  9. Severe non-equivalency in project 69 in favor of the control group: random assignment is suspected to have broken down, which explains the observed negative results from project 69.
  10. In project 69, the treatment group was more likely to be on track for certificates; the control group was on track for advanced degrees and UBMS.
  11. Uncorrected Bias in Favor of the Control Group in All of Mathematica's Impact Estimates: project 69's non-equivalent treatment and control groups, combined with its large weight, led to a lack of balance in the overall UB sample.
  12. Among the other 66 projects taken together, there is the balance one expects in a random assignment study.
  13. Re-analysis to Mitigate Problems and Present a More Robust Analysis that Reduced Identified Sources of Bias. I (Cahalan) was personally influenced by work as a contractor: experimental design work examining the threats to validity; survey methods research (NCES and NSF required survey evaluation studies looking at sampling and non-sampling error); and statistical and program evaluation standards. The reason we are here is that we, as the technical monitors whose job it was to ensure technical quality, reached very different conclusions about the UB program than those Mathematica Policy Research published in 2004 and 2009. There is also the issue of stakeholder rights to fair and transparent evaluation.
  14. What is the same as Mathematica's analyses? Same statistical methods. Statistical programs that take into account the complex multi-stage sample design in estimating standard errors (Stata). Same ITT opportunity grouping; the TOT participation grouping recognizes UBMS as a form of UB. Similar model baseline controls. Same weights (Mathematica's).
  15. What is different from Mathematica's analyses? Standardize survey data outcomes and 10 years of federal aid data outcome measures by expected high school graduation year. Avoid using early National Student Clearinghouse (NSC) data when coverage was too low or nonexistent; NSC data were used only for BA degree, as a supplement for survey non-responders. Use all applicable follow-up surveys (3 to 5), not just one round at a time. Present data with and without project 69, and weighted and unweighted. View impact estimates without project 69 as reasonably robust for 74 percent of applicants; view estimates with project 69 as non-robust, to be avoided especially for estimates of BA impact.
  16. Impact on postsecondary enrollment when outcome measures are standardized to expected high school graduation year and NSC data are not used.
  17. Impact on Award of Any Postsecondary Degree or Credential by End of Study Period: Fifth Follow-up Data, Mathematica and Cahalan results (67 of 67 projects).
  18. Impact on BA degree for the 66 of the 67 projects that did not have the representation issue and the severe lack of balance between treatment and control groups on academics and expectations at baseline.
  19. Control Group Alternative Service and Treatment Waiting List Drop-outs. Waiting list drop-outs: 26 percent of the treatment group, kept in ITT. First follow-up survey: 20 percent of the ITT treatment group participated in neither UB nor UBMS. Survey data: 14 percent of controls showed evidence of UB or UBMS participation. 60 percent of controls and 92 percent of the treatment group reported some pre-college supplemental service participation.
  20. Instrumental Variables Regression Used in TOT/CACE and Observational Analyses. Two-stage regression to mitigate selection bias. The first stage models factors related to participation. The second stage uses the results as an additional control in the model estimating outcomes.
  21. Two-stage instrumental variables regression impact results on entry into postsecondary in the year after expected high school graduation: levels of service impact.
  22. Two-stage instrumental variables regression impact results on BA receipt in the 6 years after expected high school graduation: levels of service impact.
  23. Cost of UB and Estimated Impact on Lifetime Taxes Paid, Based on National Evaluation of Upward Bound.
  24. Conclusions. Mathematica contractor conclusions of "no detectable impact" are not robust and are seriously flawed, and the reports are not transparent. A credible re-analysis, conducted by the US Department of Education staff assigned to monitor the contract, corrected for identified sources of study error using NCES statistical standards and US Department of Education Information Quality Guidelines and detected strong positive impacts for Upward Bound. Cost-benefit analysis using Census Bureau estimates of lifetime taxes paid shows a large impact of UB participation relative to the cost of the program. The contractor reports continue to do serious harm to the reputation of the Upward Bound program and are in need of withdrawal or correction by Mathematica and the US Department of Education.
  25. Additional Information. The full text of the COE Request for Correction can be found at http://www.coenet.us/files/spotlight-COE_Request_for_Correction_of_Mathematica_ The statement of concern by leading researchers in the field can be found at http://www.coenet.us/files/spotlight-Statement_of_Concern_011812.pdf Results of the re-analysis detailing study error issues can be found at http://www.coenet.us/files/files-Do_the_Conclusions_Change_2009.. The materials that the authors of this report (Cahalan and Goodwin 2014) submitted to the What Works Clearinghouse (WWC) in the "Request to Rescind the WWC Rating" are available at http://www.coenet.us/WWC_request_to_rescind
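
Illustration for slide 8: a minimal Python sketch of how the concentration of sample weight in a single project might be quantified. The project labels and the even split of the remaining 73.6 percent across the other 66 projects are hypothetical placeholders, since the slides report only that project 69 holds 26.4 percent of the weight. The Kish design-effect approximation is a standard textbook diagnostic applied here at the project level for illustration; it is not a calculation the presenters report.

```python
def weight_shares(project_weights):
    """Return each project's share of the total sample weight."""
    total = sum(project_weights.values())
    return {name: w / total for name, w in project_weights.items()}


def kish_design_effect(weights):
    """Kish approximation of the design effect from unequal weighting:
    deff = n * sum(w^2) / (sum(w))^2. Equal weights give deff = 1."""
    n = len(weights)
    total = sum(weights)
    return n * sum(w * w for w in weights) / (total * total)


# Hypothetical weights (assumption): 66 projects split 73.6 evenly,
# and one project, labeled P69 here, holds 26.4.
weights = {f"P{i:02d}": 73.6 / 66 for i in range(1, 67)}
weights["P69"] = 26.4

shares = weight_shares(weights)
deff = kish_design_effect(list(weights.values()))
effective_projects = len(weights) / deff

print(f"share of P69: {shares['P69']:.3f}")
print(f"design effect from unequal weighting: {deff:.2f}")
print(f"effective number of projects: {effective_projects:.1f}")
```

Under these placeholder weights the effective number of projects is far below 67, which is one way to express why a single heavily weighted project can dominate the overall estimates.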
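
Illustration for slides 19 and 20: a minimal Python sketch of how an intent-to-treat (ITT) difference can be rescaled into a complier average causal effect (CACE) with the textbook Wald instrumental-variables estimator. It assumes random assignment is the instrument, which the slides do not state, and it is not the presenters' two-stage model. The participation rates (80 percent of the treatment group, 14 percent of controls) follow slide 19; every outcome count below is hypothetical.

```python
def make_group(assigned, participated, n, n_success):
    """Build n records (assigned, participated, outcome) with n_success outcomes of 1."""
    return [(assigned, participated, 1 if i < n_success else 0) for i in range(n)]


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def wald_cace(records):
    """Wald estimator: ITT effect on the outcome divided by the
    treatment-control difference in participation rates."""
    treat = [r for r in records if r[0] == 1]
    ctrl = [r for r in records if r[0] == 0]
    itt = mean(r[2] for r in treat) - mean(r[2] for r in ctrl)
    uptake = mean(r[1] for r in treat) - mean(r[1] for r in ctrl)
    if uptake == 0:
        raise ValueError("assignment does not change participation")
    return itt, uptake, itt / uptake


# Hypothetical sample of 100 per arm. Participation rates follow slide 19;
# the outcome counts are invented for illustration only.
records = (
    make_group(1, 1, 80, 60)    # assigned to UB, participated
    + make_group(1, 0, 20, 10)  # assigned to UB, did not participate
    + make_group(0, 1, 14, 10)  # control, participated in UB or UBMS anyway
    + make_group(0, 0, 86, 50)  # control, did not participate
)

itt, uptake, cace = wald_cace(records)
print(f"ITT = {itt:.3f}, uptake difference = {uptake:.2f}, CACE = {cace:.3f}")
```

Because the participation gap between the two arms is 0.66 rather than 1, any ITT estimate understates the effect on those who actually took up the program by a factor of roughly 1/0.66 under the usual instrumental-variables assumptions.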
