Managing and Analyzing Global Health Data
Seattle, August 30, 2011
Peter Speyer, Director of Data Development
IHME Background
  • Global institute dedicated to providing independent, rigorous, and scientific measurements and evaluations to accelerate progress on global health
  • Part of the Department of Global Health at the University of Washington
  • Funded by the Bill & Melinda Gates Foundation and the State of Washington (‘core funding’), and other funders through specific research grants
  • Created in 2007
  • 70 researchers, 30 staff
IHME Mission
Our goal is to improve the health of the world’s populations by providing the best information on population health.
Health Data
  • Health data innovation
  • Patient engagement
  • Open data
  • Health apps
Key Health Data Challenges

Find & access data
  • Lack of transparency
  • Timeliness of data
  • Lack of documentation
  • Access vs. privacy

Use data
  • Sheer quantity of data files (30TB, 20K+ source datasets, 40M files)
  • Diverse source data types and formats (pdf, csv, SPSS, CSPro, …)
  • Data quality issues

Disseminate data
  • Make results data engaging
  • Accountability: share results, code, source data
  • Accommodate diverse audiences (expertise, geographies)
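A first step toward handling many files in diverse formats is often a simple inventory of what is on disk. The following is a minimal sketch, not the tooling described in the talk: the function name `inventory_by_extension` and the idea of grouping by file extension are assumptions made for illustration.

```python
from collections import Counter
from pathlib import Path


def inventory_by_extension(root: Path) -> Counter:
    """Count files under `root` (recursively) by lowercase extension.

    Files without an extension are counted under the empty string.
    """
    counts: Counter = Counter()
    for path in root.rglob("*"):
        if path.is_file():
            counts[path.suffix.lower()] += 1
    return counts
```

Calling `inventory_by_extension(Path("some/data/dir")).most_common()` would list extensions such as `.csv` or `.pdf` from most to least frequent. With tens of millions of files, a real inventory would likely need to be parallelized and stored in a database rather than held in memory.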
Example: Global Burden of Disease

Mortality & causes of death
  • Sources: census, surveys, vital registration, verbal autopsy
  • Estimates: covariate models, spatial-temporal regressions; weighted combination of models

Morbidity
  • Sources: literature reviews, surveys, registries, hospital data
  • Disease modeling: compartmental Bayesian model

Health severity weights

Burden of disease
  • DALYnator

300 diseases, 40 risk factors, 21 regions; 1990, 2005, 2010
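The slides do not describe how the DALYnator works internally. As a rough illustration of how mortality, morbidity, and severity weights combine into a burden figure, here is a sketch of the standard definition: DALYs = YLLs (years of life lost to premature death) + YLDs (years lived with disability). The class and function names, the prevalence-based YLD simplification, and all numbers are assumptions for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CauseInputs:
    """Hypothetical inputs for one cause in one population and year."""
    deaths: float                # number of deaths from the cause
    years_lost_per_death: float  # average remaining life expectancy at death
    prevalent_cases: float       # people living with the condition
    disability_weight: float     # severity weight: 0 = full health, 1 = death

    def __post_init__(self) -> None:
        if not 0.0 <= self.disability_weight <= 1.0:
            raise ValueError("disability_weight must be between 0 and 1")


def ylls(c: CauseInputs) -> float:
    """Years of life lost: deaths times average years lost per death."""
    return c.deaths * c.years_lost_per_death


def ylds(c: CauseInputs) -> float:
    """Years lived with disability (prevalence-based simplification)."""
    return c.prevalent_cases * c.disability_weight


def dalys(c: CauseInputs) -> float:
    """Disability-adjusted life years: YLLs plus YLDs."""
    return ylls(c) + ylds(c)


# Made-up example values, not GBD estimates.
example = CauseInputs(deaths=100, years_lost_per_death=30,
                      prevalent_cases=2000, disability_weight=0.25)
```

For the made-up `example`, `dalys(example)` adds 3000 YLLs and 500 YLDs. A real burden calculation would work by age, sex, region, and year and would propagate uncertainty, none of which is shown here.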
[Figure: GBD Country Years, Causes of Death 1950-2009]
Solutions: Computing Infrastructure
  • Analysis with statistical packages; projects with 100K+ lines of code
  • File system: 60TB disk space, redundant backup
  • Cluster with 63 nodes (+300% in 2011), ~2000 cores; runs 24x7, very little downtime
  • Virtual environments to test new applications, serve them to collaborators, etc.
Solutions: Global Health Data Exchange

Objectives
  • Transparency => data catalog
  • Access => data repository
  • Information => data community (future)

Approach
  • One record per dataset
  • Standardized metadata
  • Internal users (10K records): files on file server
  • External users (5K records): files for download

Implementation
  • CMS: Drupal
  • Search: Solr
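The "one record per dataset, standardized metadata" approach can be pictured as a small catalog structure. The sketch below is not the GHDx schema: the `DatasetRecord` fields, the `search` helper, and the `downloadable` flag that separates internal-only records from those offered to external users are all hypothetical, and an in-memory filter stands in for the Drupal and Solr stack named on the slide.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetRecord:
    """One catalog record per dataset, with a fixed set of metadata fields."""
    record_id: int
    title: str
    country: str
    year: int
    data_type: str                # e.g. "census", "survey", "vital registration"
    file_formats: tuple[str, ...] = ()
    downloadable: bool = False    # True if files are offered to external users


def search(records: list[DatasetRecord], *, text: str = "",
           country: str | None = None,
           external_only: bool = False) -> list[DatasetRecord]:
    """Return records whose title contains `text` and match the filters."""
    needle = text.lower()
    return [
        r for r in records
        if needle in r.title.lower()
        and (country is None or r.country == country)
        and (not external_only or r.downloadable)
    ]


# Made-up catalog entries for illustration.
catalog = [
    DatasetRecord(1, "Population Census 2000", "Ghana", 2000, "census",
                  ("csv",), downloadable=True),
    DatasetRecord(2, "Household Health Survey 2005", "Ghana", 2005, "survey",
                  ("SPSS",)),
    DatasetRecord(3, "Vital Registration 2008", "Chile", 2008,
                  "vital registration", ("pdf", "csv"), downloadable=True),
]
```

An external visitor's view corresponds to `search(catalog, external_only=True)`, while internal users would search the full list. Because every record carries the same fields, filters like `country` apply uniformly across all source types.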
Thank you!
speyer@uw.edu
@peterspeyer
www.ghdx.org
Peter Speyer, Director of Data Development

Managing and Analyzing Health Data (VLDB Conference)
