2. IHME Background: Global institute dedicated to providing independent, rigorous, and scientific measurements and evaluations to accelerate progress on global health. Part of the Department of Global Health at the University of Washington. Funded by the Bill & Melinda Gates Foundation and the State of Washington (‘core funding’), and by other funders through specific research grants. Created in 2007. 70 researchers, 30 staff.
3. IHME Mission: Our goal is to improve the health of the world’s populations by providing the best information on population health.
5. Health Data: innovation; patient engagement; open data; health apps.
6. Key Health Data Challenges. Stages: find & access data; use data; disseminate data.
7. Key Health Data Challenges. Lack of transparency; timeliness of data; lack of documentation; access vs. privacy. Stages: find & access data; use data; disseminate data.
8. Key Health Data Challenges. Sheer quantity of data files (30TB, 20K+ source datasets, 40M files); diverse source data types and formats (pdf, csv, SPSS, CSPro, …); data quality issues. Stages: find & access data; use data; disseminate data.
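One way to get a first view of the format diversity described above is to count files by extension. The following is only a minimal sketch: the function name `count_by_extension` is hypothetical, and nothing on the slide says the institute inventories its files this way.

```python
from collections import Counter
from pathlib import Path


def count_by_extension(root):
    """Count regular files under `root`, grouped by lowercase extension.

    Files without an extension are counted under the empty string.
    Hypothetical helper for illustration; not taken from the slides.
    """
    counts = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file():
            counts[path.suffix.lower()] += 1
    return counts
```

Calling `count_by_extension("/some/data/root").most_common()` would list the most frequent formats first. For a collection of tens of millions of files, a single-process walk like this is likely too slow, so treat it as a starting point only.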
9. Key Health Data Challenges. Make results data engaging; accountability: share results, code, source data; accommodate diverse audiences (expertise, geographies). Stages: find & access data; use data; disseminate data.
10. Example: Global Burden of Disease. Mortality & causes of death: sources are census, surveys, vital registration, verbal autopsy; estimates use covariate models, spatial-temporal regressions, and a weighted combination of models. Morbidity: sources are literature reviews, surveys, registries, hospital data; disease modeling uses a compartmental Bayesian model. Health severity weights. Burden of disease: DALYnator. Scope: 300 diseases, 40 risk factors, 21 regions; 1990, 2005, 2010.
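The slide does not spell out how the mortality and morbidity streams are combined. The sketch below uses the standard textbook definition of the disability-adjusted life year (DALY): years of life lost (YLL) plus years lived with disability (YLD). It assumes a prevalence-based YLD and a severity weight between 0 and 1. All function and parameter names are hypothetical, and this is not the DALYnator's implementation.

```python
def years_of_life_lost(deaths, remaining_life_expectancy):
    """YLL: deaths times the standard remaining life expectancy at age of death."""
    return deaths * remaining_life_expectancy


def years_lived_with_disability(prevalent_cases, severity_weight):
    """YLD (prevalence-based): cases times a severity weight in [0, 1]."""
    if not 0.0 <= severity_weight <= 1.0:
        raise ValueError("severity_weight must be between 0 and 1")
    return prevalent_cases * severity_weight


def dalys(deaths, remaining_life_expectancy, prevalent_cases, severity_weight):
    """DALY = YLL + YLD for one cause, age group, and year."""
    return (years_of_life_lost(deaths, remaining_life_expectancy)
            + years_lived_with_disability(prevalent_cases, severity_weight))


# Made-up inputs for illustration only: 100 deaths with 30 years of remaining
# life expectancy, plus 1000 prevalent cases at severity weight 0.25.
example = dalys(100, 30.0, 1000, 0.25)  # 3000.0 + 250.0 = 3250.0
```

In practice each input would itself be a modeled estimate with uncertainty, which this sketch ignores.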
13. Solutions: Computing Infrastructure. Analysis with statistical packages: projects with 100K+ lines of code. File system: 60TB disk space, redundant backup. Cluster with 63 nodes (+300% in 2011), ~2000 cores: runs 24x7, very little downtime. Virtual environments to test new applications, serve them to collaborators, etc.
14. Solutions: Global Health Data Exchange. Objectives => approach: transparency => data catalog; access => data repository; information => data community (future). Implementation: one record per dataset; standardized metadata; internal users (10K records): files on file server; external users (5K records): files for download; CMS: Drupal; search: Solr.
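The "one record per dataset, standardized metadata" idea can be sketched as a small typed record plus a filter. The field names (`record_id`, `title`, `country`, `year`, `data_type`, `external`) are assumptions chosen for illustration, not the catalog's actual schema. The in-memory `search` function is a stand-in for the Solr search named on the slide and does not use Solr.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetRecord:
    """One catalog record per dataset (hypothetical field names)."""
    record_id: int
    title: str
    country: str
    year: int
    data_type: str   # e.g. "survey", "census", "vital registration"
    external: bool   # True if files are offered for download to external users


def search(records, keyword, external_only=False):
    """Return records whose title contains `keyword` (case-insensitive).

    With external_only=True, records whose files are internal-only are hidden.
    """
    needle = keyword.lower()
    return [
        r for r in records
        if needle in r.title.lower() and (r.external or not external_only)
    ]


# Made-up records for illustration only.
catalog = [
    DatasetRecord(1, "Example Household Survey", "Country A", 2005, "survey", True),
    DatasetRecord(2, "Example Census Microdata", "Country B", 2010, "census", False),
]
```

Keeping the internal/external distinction as a field on the same record, rather than in two separate catalogs, reflects the slide's split between files on the file server and files for download.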