This document discusses opportunities for collaboration between researchers working in systematic reviews and electronic discovery (e-discovery). It notes similarities in the challenges both fields face, including the need for high recall at bounded cost and reliance on multi-stage review pipelines. The document proposes that technologies developed for semi-automated citation screening and crowdsourcing could help address current limitations. It concludes by encouraging information retrieval researchers to investigate open problems in systematic reviews as opportunities to advance retrieval technologies beyond existing tasks, and to bring interested parties together through forums like the TREC Total Recall track.
Systematic Review is e-Discovery in Doctor’s Clothing
1. Systematic Review is e-Discovery in Doctor’s Clothing
Matt Lease
ir.ischool.utexas.edu
slideshare.net/mattlease
@mattlease
ml@utexas.edu
Joint work with:
Gordon V. Cormack (U. Waterloo), An Thanh Nguyen (U. Texas),
Thomas A. Trikalinos (Brown U.), Byron C. Wallace (U. Texas)
2. “The place where people & technology meet”
~ Wobbrock et al., 2009
www.ischools.org
3. • Systematic Reviews
• Electronic Discovery (e-Discovery)
• Toward a Joint Research Agenda
Matt Lease <ml@utexas.edu>
Roadmap
5. Evidence-Based Medicine n.
The conscientious, explicit, and judicious use of current best evidence in making decisions about the care of individual patients
8. On average, 75 articles describing results from clinical trials are published every day.
Bastian, PLoS Med, 2010
The median time to complete a single review: 1,110 person-hours.
Allen & Olkin, JAMA, 1998
16. Manual Review does not Scale
Paul, George L., and Jason R. Baron. “Information Inflation: Can the Legal System Adapt?” Richmond Journal of Law & Technology 13 (2007).
17. IR Research in e-Discovery
• NIST TREC Legal Track: 2006–2011
• Oard & Webber, FnTIR book, 2013
• A variety of published work at SIGIR and related venues
– e.g., Cormack & Grossman, SIGIR 2016
18. • Systematic Reviews
• Electronic Discovery (e-Discovery)
• Toward a Joint Research Agenda
Roadmap
19. Commonalities
• Need high recall with bounded cost
• Both follow a 3-stage pipeline today
– Boolean query
– Screening (traditionally manual, by experts)
– Final review & use
• The pipeline approach is useful but limits improvement
– overall framing & unrecoverable errors
• Limiting reliance on experts
– traditionally assumed to be infallible
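The three-stage pipeline above can be sketched as follows. This is a minimal, hypothetical illustration (the documents, query terms, and screening rule are invented for the example, not taken from the talk); it also makes concrete why an error at an early stage is unrecoverable downstream.

```python
# Sketch of the 3-stage pipeline: Boolean query -> screening -> final review.
# All document text and query/screening terms below are hypothetical.

def boolean_query(docs, must_have):
    """Stage 1: keep only documents containing every required term."""
    return [d for d in docs if all(t in d["text"].lower() for t in must_have)]

def screen(docs, exclude_terms):
    """Stage 2: cheap screening pass; drop clearly off-topic documents."""
    return [d for d in docs if not any(t in d["text"].lower() for t in exclude_terms)]

def recall(retrieved, relevant_ids):
    """High recall is the target metric in both SR and e-discovery."""
    found = {d["id"] for d in retrieved} & relevant_ids
    return len(found) / len(relevant_ids)

docs = [
    {"id": 1, "text": "Randomized trial of statin therapy"},
    {"id": 2, "text": "Statin trial protocol in mice"},
    {"id": 3, "text": "Editorial on statin pricing"},
]
hits = boolean_query(docs, ["statin", "trial"])   # drops doc 3
screened = screen(hits, ["mice"])                 # drops doc 2
print(recall(screened, {1}))                      # → 1.0
```

Note that a relevant document excluded by the Boolean query in stage 1 never reaches stage 2 or 3, which is the "unrecoverable errors" limitation of the pipeline framing.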
20. Can we crowdsource screening?
Michael Mortenson, Byron C. Wallace, Gaelen Adam, Tom Trikalinos and Tim Kraska.
Crowdsourcing Citation Screening for Systematic Reviews. (Under review).
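One simple baseline for aggregating redundant crowd screening labels, in the spirit of the crowdsourced citation screening work cited above, is majority voting. The identifiers, label values, and tie-breaking rule here (ties favor inclusion, to protect recall) are illustrative assumptions, not details from the paper.

```python
# Hedged sketch: majority-vote aggregation of crowd screening labels.
from collections import Counter

def majority_include(labels):
    """True (include the citation) if at least half the workers voted
    'include'; ties favor inclusion to protect recall."""
    counts = Counter(labels)
    return counts["include"] >= counts["exclude"]

votes = {
    "pmid-101": ["include", "include", "exclude"],
    "pmid-102": ["exclude", "exclude", "include"],
    "pmid-103": ["include", "exclude"],  # tie -> include
}
kept = [pid for pid, labels in votes.items() if majority_include(labels)]
print(kept)  # → ['pmid-101', 'pmid-103']
```

More sophisticated aggregation (e.g., modeling per-worker reliability) is possible, but even this baseline shows how redundant cheap labels can substitute for some expert screening effort.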
24. Conclusion
• Systematic Review & e-Discovery have much in common, but SR has received relatively little attention in IR
– Open problems & current assumptions give IR researchers fertile opportunities for research beyond other IR tasks
– Public test collections available for both
• github.com/bwallace/crowd-sourced-ebm
• Aaron Cohen’s: http://skynet.ohsu.edu/~cohenaa/systematic-drug-class-review-data.html
– Reading list: https://github.com/bwallace/automating-ebm-resources/wiki/Papers
• TREC Total Recall Track (trec-total-recall.org) offers a great forum for bringing together those interested