The PDB An Exemplar for Data Science To
Date, But What About the Future?
Philip E. Bourne Ph.D.
Associate Director for Data Science
National Institutes of Health
Background
6/12 2/14 3/14
• Findings:
• Sharing data & software through catalogs
• Support methods and applications development
• Need more training
• Hire CSIO
• Continued support throughout the lifecycle
http://acd.od.nih.gov/diwg.htm
Motivation for This Talk
Source: Michael Bell, http://homepages.cs.ncl.ac.uk/m.j.bell1/blog/?p=830
More Motivation
The way we fund and operate
biomedical databases will not scale.
How do we keep the best features of
today's resources but also respond to
shrinking budgets and changes in the
way we do science?
Let's address this question using the
PDB as an example
Disclaimer: This is NOT a talk about
the PDB per se. It is a talk about data
resources in general, using the PDB as
an example because we are all familiar
with it and most stakeholders consider
it an exemplar
Good News: We Trust the PDB
PDB
Trust in the data
is perhaps the PDB’s
biggest achievement
Good News: Trust
 Trust is like compound interest
 Comes from listening
 Comes from engaging the community in every aspect
of the process
 Comes from data consistency and level of annotation
 Comes from responsiveness
 Comes from the quality of the delivery service
Good News/Bad News Re Data Quality
 Good News:
– If done right in the
beginning, 25% of the
PDB’s budget could have
been saved
– Ontologies can work
– Automation has reduced
cost even as the amount
of data has increased
– Reproducibility is
improved
 Bad News:
– Complex ontologies slow
adoption
– All data are created
equal
– Annotation is limited
Good News/Bad News Re Community
 Good News:
– The community is
engaged
– The community has
driven data sharing
 Bad News:
– The community does not
reduce costs through
active participation
– There is insufficient
reward for being part of
the community e.g. as an
annotator
How we do science is changing. Do
data resources including the PDB best
serve the needs of the user at this
point?
How is Science Changing?
 More interdisciplinary
 More translational
 More access to diverse data types
 More computational
 More collaborative
Good News/Bad News for the PDB in
this Changing Landscape
 Bad News:
– Interface complex and
uni-data oriented
– Data accessible;
methods accessible (sort
of); but not together
– Significant redundancy in
services offered
 Good News:
– Annotation!
– Demand is increasing
– Integrated with other
data types
– RESTful services
General Problem Statement:
How to ensure a high-quality
annotated data source that provides
the optimal environment for
accessibility and analysis by a broad
community of diverse users?
Okay, so what can the funders do to
address a situation where the PDB
is currently the best-case
scenario?
1. Encourage more
understanding of how
existing data are used
[Figure: Structure Summary page activity for H1N1 Influenza related structures, Jan. 2008 to Jul. 2010]
1RUZ: 1918 H1 Hemagglutinin
3B7E: Neuraminidase of A/Brevig Mission/1/1918 H1N1 strain in complex with zanamivir
[Andreas Prlic]
* http://www.cdc.gov/h1n1flu/estimates/April_March_13.htm
We Need to Learn from Industries Whose
Livelihood Addresses the Question of Use
2. Address the Issue that
Scholarship is Broken
 I have a paper with 17,500 citations that no one has
ever read
 I have papers in PLOS ONE that have more citations
than ones in PNAS
 I have data sets I am proud of but few places to put
them
 I edited a journal but it did not count for much
3. Address the Reward System
4. Enable Reproducibility
 Much of the research life cycle is now digital -
encourage the reliability, accessibility, findability,
usability of data, methods, narrative, publications etc.
 How?
 Data sharing plans
 Standards frameworks
 Data and software catalogs
 PubMedCentral
? The Commons – PMC for the complete lifecycle
? Machine readable data sharing plans
? Small funding to communities
? Support for training and best practices in eScholarship
5. Establish The Commons
 Public/private partnership
 Work with ICs, NCBI and CIT to identify and run
pilots – cloud, HPC centers
 Port DbGAP to the cloud
? Experiment with new funding strategies
 Evaluate
Sustainability and Sharing: The Commons
[Diagram labels]
Data: The Long Tail; Core Facilities/HS Centers; Clinical/Patient
The Why: Data Sharing Plans
The How: Data Discovery Index; Sustainable Storage; Quality; Scientific Discovery; Usability; Security/Privacy
The End Game: Knowledge
Commons == Extramural NCBI == Research Object Sandbox == Collaborative Environment
Other labels: The Commons; Government; NIH Awardees; Private Sector; Rest of Academia; Metrics/Standards; Software Standards Index; BD2K Centers; Cloud, Research Objects
What Does the Commons Enable?
 Dropbox-like storage
 The opportunity to apply quality metrics
 Bring compute to the data
 A place to collaborate
 A place to discover
http://100plus.com/wp-content/uploads/Data-Commons-3-1024x825.png
The PDB in the Commons
 Components:
– Annotated collection of data files
– APIs to access these data files
– Example methods using these APIs
 Potential outcomes
– Nothing happens?
– A new breed of developer starts to use PDB data in new
ways?
– The casual user has a broader set of services than
previously?
– Quality declines?
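The three components above (data files, an API to reach them, and an example method built on that API) can be sketched in a few lines of Python. This is only an illustration, not something shown in the talk: the download host `files.rcsb.org` is an assumption about the public RCSB file service, and the helper names `entry_file_url`, `fetch_entry` and `count_atom_records` are hypothetical.

```python
from urllib.request import urlopen

# Assumption: RCSB's public file-download host; the talk does not name an endpoint.
DOWNLOAD_BASE = "https://files.rcsb.org/download"


def entry_file_url(pdb_id: str, fmt: str = "pdb") -> str:
    """API component: build the download URL for one PDB entry file."""
    pdb_id = pdb_id.strip().upper()
    # Classic PDB identifiers are four alphanumeric characters, e.g. 1RUZ.
    if len(pdb_id) != 4 or not pdb_id.isalnum():
        raise ValueError(f"not a 4-character PDB ID: {pdb_id!r}")
    if fmt not in ("pdb", "cif"):
        raise ValueError(f"unsupported format: {fmt!r}")
    return f"{DOWNLOAD_BASE}/{pdb_id}.{fmt}"


def fetch_entry(pdb_id: str, fmt: str = "pdb", timeout: float = 30.0) -> str:
    """Data component: download the entry file as text (needs network access)."""
    with urlopen(entry_file_url(pdb_id, fmt), timeout=timeout) as response:
        return response.read().decode("utf-8")


def count_atom_records(pdb_text: str) -> int:
    """Example method: count ATOM records in PDB-format text."""
    return sum(1 for line in pdb_text.splitlines() if line.startswith("ATOM"))
```

With network access, `count_atom_records(fetch_entry("1RUZ"))` would chain all three pieces for the 1918 H1 hemagglutinin entry mentioned earlier in the talk.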
Some Acknowledgements
 Eric Green & Mark Guyer (NHGRI)
 Jennie Larkin (NHLBI)
 Leigh Finnegan (NHGRI)
 Vivien Bonazzi (NHGRI)
 Michelle Dunn (NCI)
 Mike Huerta (NLM)
 David Lipman (NLM)
 Jim Ostell (NLM)
 Andrea Norris (CIT)
 Peter Lyster (NIGMS)
 All of the more than 100 folks on the BD2K team
NIH…
Turning Discovery Into Health
philip.bourne@nih.gov

Mais conteúdo relacionado

Mais procurados

Will Biomedical Research Fundamentally Change in the Era of Big Data?
Will Biomedical Research Fundamentally Change in the Era of Big Data?Will Biomedical Research Fundamentally Change in the Era of Big Data?
Will Biomedical Research Fundamentally Change in the Era of Big Data?Philip Bourne
 
Understanding the Big Data Enterprise
Understanding the Big Data EnterpriseUnderstanding the Big Data Enterprise
Understanding the Big Data EnterprisePhilip Bourne
 
UKSG Conference 2017 Breakout - How publishers can thrive in an open access m...
UKSG Conference 2017 Breakout - How publishers can thrive in an open access m...UKSG Conference 2017 Breakout - How publishers can thrive in an open access m...
UKSG Conference 2017 Breakout - How publishers can thrive in an open access m...UKSG: connecting the knowledge community
 
The Future of Data Science @ UVA
The Future of Data Science @ UVAThe Future of Data Science @ UVA
The Future of Data Science @ UVAPhilip Bourne
 
Meeting the Research Data Management Challenge - Rachel Bruce, Kevin Ashley, ...
Meeting the Research Data Management Challenge - Rachel Bruce, Kevin Ashley, ...Meeting the Research Data Management Challenge - Rachel Bruce, Kevin Ashley, ...
Meeting the Research Data Management Challenge - Rachel Bruce, Kevin Ashley, ...Jisc
 
Librarians' Involvement with CRIS Developments
Librarians' Involvement with CRIS DevelopmentsLibrarians' Involvement with CRIS Developments
Librarians' Involvement with CRIS Developmentssherif user group
 
RDAP14: Collaboration and tension between institutions and units providing da...
RDAP14: Collaboration and tension between institutions and units providing da...RDAP14: Collaboration and tension between institutions and units providing da...
RDAP14: Collaboration and tension between institutions and units providing da...ASIS&T
 
The Analytics and Data Science Landscape
The Analytics and Data Science LandscapeThe Analytics and Data Science Landscape
The Analytics and Data Science LandscapePhilip Bourne
 
If Data Are The New Oil, How Do We Prevent Global Warming?
If Data Are The New Oil, How Do We Prevent Global Warming?If Data Are The New Oil, How Do We Prevent Global Warming?
If Data Are The New Oil, How Do We Prevent Global Warming?Philip Bourne
 
Credo reference promoting resources workshop edina slides
Credo reference promoting resources workshop   edina slidesCredo reference promoting resources workshop   edina slides
Credo reference promoting resources workshop edina slidesAndrew Bevan
 
Open Book Publishers, Rupert Gatti
Open Book Publishers, Rupert GattiOpen Book Publishers, Rupert Gatti
Open Book Publishers, Rupert GattiOAbooks
 
Library Data Management Services
Library Data Management ServicesLibrary Data Management Services
Library Data Management ServicesKeith Webster
 
Reinforcing the Role of the Library: Communicating Value, Increasing Access a...
Reinforcing the Role of the Library: Communicating Value, Increasing Access a...Reinforcing the Role of the Library: Communicating Value, Increasing Access a...
Reinforcing the Role of the Library: Communicating Value, Increasing Access a...Charleston Conference
 
Curating the Scholarly Record: Data Management and Research Libraries
Curating the Scholarly Record: Data Management and Research LibrariesCurating the Scholarly Record: Data Management and Research Libraries
Curating the Scholarly Record: Data Management and Research LibrariesKeith Webster
 
Bw dave pattern lidp
Bw dave pattern lidpBw dave pattern lidp
Bw dave pattern lidpgregynog
 
2010 SC Environment Presentation
2010 SC Environment Presentation2010 SC Environment Presentation
2010 SC Environment PresentationJulie Schneider
 
Data to Advance Sustainability
Data to Advance SustainabilityData to Advance Sustainability
Data to Advance SustainabilityPhilip Bourne
 

Mais procurados (20)

Will Biomedical Research Fundamentally Change in the Era of Big Data?
Will Biomedical Research Fundamentally Change in the Era of Big Data?Will Biomedical Research Fundamentally Change in the Era of Big Data?
Will Biomedical Research Fundamentally Change in the Era of Big Data?
 
Understanding the Big Data Enterprise
Understanding the Big Data EnterpriseUnderstanding the Big Data Enterprise
Understanding the Big Data Enterprise
 
UKSG Conference 2017 Breakout - How publishers can thrive in an open access m...
UKSG Conference 2017 Breakout - How publishers can thrive in an open access m...UKSG Conference 2017 Breakout - How publishers can thrive in an open access m...
UKSG Conference 2017 Breakout - How publishers can thrive in an open access m...
 
The Future of Data Science @ UVA
The Future of Data Science @ UVAThe Future of Data Science @ UVA
The Future of Data Science @ UVA
 
Meeting the Research Data Management Challenge - Rachel Bruce, Kevin Ashley, ...
Meeting the Research Data Management Challenge - Rachel Bruce, Kevin Ashley, ...Meeting the Research Data Management Challenge - Rachel Bruce, Kevin Ashley, ...
Meeting the Research Data Management Challenge - Rachel Bruce, Kevin Ashley, ...
 
Librarians' Involvement with CRIS Developments
Librarians' Involvement with CRIS DevelopmentsLibrarians' Involvement with CRIS Developments
Librarians' Involvement with CRIS Developments
 
RDAP14: Collaboration and tension between institutions and units providing da...
RDAP14: Collaboration and tension between institutions and units providing da...RDAP14: Collaboration and tension between institutions and units providing da...
RDAP14: Collaboration and tension between institutions and units providing da...
 
The Analytics and Data Science Landscape
The Analytics and Data Science LandscapeThe Analytics and Data Science Landscape
The Analytics and Data Science Landscape
 
Yale Day of Data
Yale Day of Data Yale Day of Data
Yale Day of Data
 
If Data Are The New Oil, How Do We Prevent Global Warming?
If Data Are The New Oil, How Do We Prevent Global Warming?If Data Are The New Oil, How Do We Prevent Global Warming?
If Data Are The New Oil, How Do We Prevent Global Warming?
 
UKSG 2018 Breakout - Setting your cites to open I4OC - Maccallum
UKSG 2018 Breakout - Setting your cites to open I4OC - MaccallumUKSG 2018 Breakout - Setting your cites to open I4OC - Maccallum
UKSG 2018 Breakout - Setting your cites to open I4OC - Maccallum
 
Credo reference promoting resources workshop edina slides
Credo reference promoting resources workshop   edina slidesCredo reference promoting resources workshop   edina slides
Credo reference promoting resources workshop edina slides
 
Open Book Publishers, Rupert Gatti
Open Book Publishers, Rupert GattiOpen Book Publishers, Rupert Gatti
Open Book Publishers, Rupert Gatti
 
Library Data Management Services
Library Data Management ServicesLibrary Data Management Services
Library Data Management Services
 
Reinforcing the Role of the Library: Communicating Value, Increasing Access a...
Reinforcing the Role of the Library: Communicating Value, Increasing Access a...Reinforcing the Role of the Library: Communicating Value, Increasing Access a...
Reinforcing the Role of the Library: Communicating Value, Increasing Access a...
 
Curating the Scholarly Record: Data Management and Research Libraries
Curating the Scholarly Record: Data Management and Research LibrariesCurating the Scholarly Record: Data Management and Research Libraries
Curating the Scholarly Record: Data Management and Research Libraries
 
Bw dave pattern lidp
Bw dave pattern lidpBw dave pattern lidp
Bw dave pattern lidp
 
2010 SC Environment Presentation
2010 SC Environment Presentation2010 SC Environment Presentation
2010 SC Environment Presentation
 
Data to Advance Sustainability
Data to Advance SustainabilityData to Advance Sustainability
Data to Advance Sustainability
 
Using Social Media to Communicate Your Work
Using Social Media to Communicate Your WorkUsing Social Media to Communicate Your Work
Using Social Media to Communicate Your Work
 

Semelhante a The PDB An Exemplar for Data Science To Date, But What About the Future?

Big Data in Biomedicine – An NIH Perspective
Big Data in Biomedicine – An NIH PerspectiveBig Data in Biomedicine – An NIH Perspective
Big Data in Biomedicine – An NIH PerspectivePhilip Bourne
 
Open Data in a Global Ecosystem
Open Data in a Global EcosystemOpen Data in a Global Ecosystem
Open Data in a Global EcosystemPhilip Bourne
 
Ask Not What the NIH Can Do For You; Ask What You Can Do For the NIH
Ask Not What the NIH Can Do For You; Ask What You Can Do For the NIH     Ask Not What the NIH Can Do For You; Ask What You Can Do For the NIH
Ask Not What the NIH Can Do For You; Ask What You Can Do For the NIH Philip Bourne
 
Data as a research output and a research asset: the case for Open Science/Sim...
Data as a research output and a research asset: the case for Open Science/Sim...Data as a research output and a research asset: the case for Open Science/Sim...
Data as a research output and a research asset: the case for Open Science/Sim...African Open Science Platform
 
Workshop intro090314
Workshop intro090314Workshop intro090314
Workshop intro090314Philip Bourne
 
Iain Hrynaszkiewicz - Research Integrity: Integrity of the published record
Iain Hrynaszkiewicz - Research Integrity: Integrity of the published recordIain Hrynaszkiewicz - Research Integrity: Integrity of the published record
Iain Hrynaszkiewicz - Research Integrity: Integrity of the published recordJisc
 
Human Genome and Big Data Challenges
Human Genome and Big Data ChallengesHuman Genome and Big Data Challenges
Human Genome and Big Data ChallengesPhilip Bourne
 
PSB2014 A Vision for Biomedical Research
PSB2014 A Vision for Biomedical ResearchPSB2014 A Vision for Biomedical Research
PSB2014 A Vision for Biomedical ResearchPhilip Bourne
 
The Thinking Behind Big Data at the NIH
The Thinking Behind Big Data at the NIHThe Thinking Behind Big Data at the NIH
The Thinking Behind Big Data at the NIHPhilip Bourne
 
Biomedical Research as an Open Digital Enterprise
Biomedical Research as an Open Digital EnterpriseBiomedical Research as an Open Digital Enterprise
Biomedical Research as an Open Digital EnterprisePhilip Bourne
 
A coordinated framework for open data open science in Botswana/Simon Hodson
A coordinated framework for open data open science in Botswana/Simon HodsonA coordinated framework for open data open science in Botswana/Simon Hodson
A coordinated framework for open data open science in Botswana/Simon HodsonAfrican Open Science Platform
 
Data Science in Biomedicine - Where Are We Headed?
Data Science in Biomedicine - Where Are We Headed?Data Science in Biomedicine - Where Are We Headed?
Data Science in Biomedicine - Where Are We Headed?Philip Bourne
 
Open Science: Where Theory Meets Practice
Open Science: Where Theory Meets PracticeOpen Science: Where Theory Meets Practice
Open Science: Where Theory Meets PracticePhilip Bourne
 
Mind the Gap: Reflections on Data Policies and Practice
Mind the Gap: Reflections on Data Policies and PracticeMind the Gap: Reflections on Data Policies and Practice
Mind the Gap: Reflections on Data Policies and PracticeLizLyon
 
The NIH as a Digital Enterprise: Implications for PAG
The NIH as a Digital Enterprise: Implications for PAGThe NIH as a Digital Enterprise: Implications for PAG
The NIH as a Digital Enterprise: Implications for PAGPhilip Bourne
 

Semelhante a The PDB An Exemplar for Data Science To Date, But What About the Future? (20)

Data!
Data!Data!
Data!
 
AMIA 2014
AMIA 2014AMIA 2014
AMIA 2014
 
Big Data in Biomedicine – An NIH Perspective
Big Data in Biomedicine – An NIH PerspectiveBig Data in Biomedicine – An NIH Perspective
Big Data in Biomedicine – An NIH Perspective
 
Open Data in a Global Ecosystem
Open Data in a Global EcosystemOpen Data in a Global Ecosystem
Open Data in a Global Ecosystem
 
Ask Not What the NIH Can Do For You; Ask What You Can Do For the NIH
Ask Not What the NIH Can Do For You; Ask What You Can Do For the NIH     Ask Not What the NIH Can Do For You; Ask What You Can Do For the NIH
Ask Not What the NIH Can Do For You; Ask What You Can Do For the NIH
 
Data as a research output and a research asset: the case for Open Science/Sim...
Data as a research output and a research asset: the case for Open Science/Sim...Data as a research output and a research asset: the case for Open Science/Sim...
Data as a research output and a research asset: the case for Open Science/Sim...
 
Workshop intro090314
Workshop intro090314Workshop intro090314
Workshop intro090314
 
Data at the NIH
Data at the NIHData at the NIH
Data at the NIH
 
Iain Hrynaszkiewicz - Research Integrity: Integrity of the published record
Iain Hrynaszkiewicz - Research Integrity: Integrity of the published recordIain Hrynaszkiewicz - Research Integrity: Integrity of the published record
Iain Hrynaszkiewicz - Research Integrity: Integrity of the published record
 
Human Genome and Big Data Challenges
Human Genome and Big Data ChallengesHuman Genome and Big Data Challenges
Human Genome and Big Data Challenges
 
INCLUSION OF DATA ARCHIVES IN DATA MANAGEMENT PLAN
INCLUSION OF DATA ARCHIVES IN DATA MANAGEMENT PLANINCLUSION OF DATA ARCHIVES IN DATA MANAGEMENT PLAN
INCLUSION OF DATA ARCHIVES IN DATA MANAGEMENT PLAN
 
PSB2014 A Vision for Biomedical Research
PSB2014 A Vision for Biomedical ResearchPSB2014 A Vision for Biomedical Research
PSB2014 A Vision for Biomedical Research
 
The Thinking Behind Big Data at the NIH
The Thinking Behind Big Data at the NIHThe Thinking Behind Big Data at the NIH
The Thinking Behind Big Data at the NIH
 
Biomedical Research as an Open Digital Enterprise
Biomedical Research as an Open Digital EnterpriseBiomedical Research as an Open Digital Enterprise
Biomedical Research as an Open Digital Enterprise
 
A coordinated framework for open data open science in Botswana/Simon Hodson
A coordinated framework for open data open science in Botswana/Simon HodsonA coordinated framework for open data open science in Botswana/Simon Hodson
A coordinated framework for open data open science in Botswana/Simon Hodson
 
Data Science in Biomedicine - Where Are We Headed?
Data Science in Biomedicine - Where Are We Headed?Data Science in Biomedicine - Where Are We Headed?
Data Science in Biomedicine - Where Are We Headed?
 
Open Science: Where Theory Meets Practice
Open Science: Where Theory Meets PracticeOpen Science: Where Theory Meets Practice
Open Science: Where Theory Meets Practice
 
Mind the Gap: Reflections on Data Policies and Practice
Mind the Gap: Reflections on Data Policies and PracticeMind the Gap: Reflections on Data Policies and Practice
Mind the Gap: Reflections on Data Policies and Practice
 
The NIH as a Digital Enterprise: Implications for PAG
The NIH as a Digital Enterprise: Implications for PAGThe NIH as a Digital Enterprise: Implications for PAG
The NIH as a Digital Enterprise: Implications for PAG
 
Chapter 12
Chapter 12Chapter 12
Chapter 12
 

Mais de Philip Bourne

Data Science and AI in Biomedicine: The World has Changed
Data Science and AI in Biomedicine: The World has ChangedData Science and AI in Biomedicine: The World has Changed
Data Science and AI in Biomedicine: The World has ChangedPhilip Bourne
 
Data Science and AI in Biomedicine: The World has Changed
Data Science and AI in Biomedicine: The World has ChangedData Science and AI in Biomedicine: The World has Changed
Data Science and AI in Biomedicine: The World has ChangedPhilip Bourne
 
AI in Medical Education A Meta View to Start a Conversation
AI in Medical Education A Meta View to Start a ConversationAI in Medical Education A Meta View to Start a Conversation
AI in Medical Education A Meta View to Start a ConversationPhilip Bourne
 
AI+ Now and Then How Did We Get Here And Where Are We Going
AI+ Now and Then How Did We Get Here And Where Are We GoingAI+ Now and Then How Did We Get Here And Where Are We Going
AI+ Now and Then How Did We Get Here And Where Are We GoingPhilip Bourne
 
Thoughts on Biological Data Sustainability
Thoughts on Biological Data SustainabilityThoughts on Biological Data Sustainability
Thoughts on Biological Data SustainabilityPhilip Bourne
 
What is FAIR Data and Who Needs It?
What is FAIR Data and Who Needs It?What is FAIR Data and Who Needs It?
What is FAIR Data and Who Needs It?Philip Bourne
 
Data Science Meets Biomedicine, Does Anything Change
Data Science Meets Biomedicine, Does Anything ChangeData Science Meets Biomedicine, Does Anything Change
Data Science Meets Biomedicine, Does Anything ChangePhilip Bourne
 
Data Science Meets Drug Discovery
Data Science Meets Drug DiscoveryData Science Meets Drug Discovery
Data Science Meets Drug DiscoveryPhilip Bourne
 
Biomedical Data Science: We Are Not Alone
Biomedical Data Science: We Are Not AloneBiomedical Data Science: We Are Not Alone
Biomedical Data Science: We Are Not AlonePhilip Bourne
 
BIMS7100-2023. Social Responsibility in Research
BIMS7100-2023. Social Responsibility in ResearchBIMS7100-2023. Social Responsibility in Research
BIMS7100-2023. Social Responsibility in ResearchPhilip Bourne
 
AI from the Perspective of a School of Data Science
AI from the Perspective of a School of Data ScienceAI from the Perspective of a School of Data Science
AI from the Perspective of a School of Data SciencePhilip Bourne
 
What Data Science Will Mean to You - One Person's View
What Data Science Will Mean to You - One Person's ViewWhat Data Science Will Mean to You - One Person's View
What Data Science Will Mean to You - One Person's ViewPhilip Bourne
 
Novo Nordisk 080522.pptx
Novo Nordisk 080522.pptxNovo Nordisk 080522.pptx
Novo Nordisk 080522.pptxPhilip Bourne
 
Towards a US Open research Commons (ORC)
Towards a US Open research Commons (ORC)Towards a US Open research Commons (ORC)
Towards a US Open research Commons (ORC)Philip Bourne
 
COVID and Precision Education
COVID and Precision EducationCOVID and Precision Education
COVID and Precision EducationPhilip Bourne
 
One View of Data Science
One View of Data ScienceOne View of Data Science
One View of Data SciencePhilip Bourne
 
Cancer Research Meets Data Science — What Can We Do Together?
Cancer Research Meets Data Science — What Can We Do Together?Cancer Research Meets Data Science — What Can We Do Together?
Cancer Research Meets Data Science — What Can We Do Together?Philip Bourne
 
Data Science Meets Open Scholarship – What Comes Next?
Data Science Meets Open Scholarship – What Comes Next?Data Science Meets Open Scholarship – What Comes Next?
Data Science Meets Open Scholarship – What Comes Next?Philip Bourne
 
Frontiers of Computing at the Cellular and Molecular Scales
Frontiers of Computing at the Cellular and Molecular ScalesFrontiers of Computing at the Cellular and Molecular Scales
Frontiers of Computing at the Cellular and Molecular ScalesPhilip Bourne
 
Social Responsibility in Research
Social Responsibility in ResearchSocial Responsibility in Research
Social Responsibility in ResearchPhilip Bourne
 

Mais de Philip Bourne (20)

Data Science and AI in Biomedicine: The World has Changed
Data Science and AI in Biomedicine: The World has ChangedData Science and AI in Biomedicine: The World has Changed
Data Science and AI in Biomedicine: The World has Changed
 
Data Science and AI in Biomedicine: The World has Changed
Data Science and AI in Biomedicine: The World has ChangedData Science and AI in Biomedicine: The World has Changed
Data Science and AI in Biomedicine: The World has Changed
 
AI in Medical Education A Meta View to Start a Conversation
AI in Medical Education A Meta View to Start a ConversationAI in Medical Education A Meta View to Start a Conversation
AI in Medical Education A Meta View to Start a Conversation
 
AI+ Now and Then How Did We Get Here And Where Are We Going
AI+ Now and Then How Did We Get Here And Where Are We GoingAI+ Now and Then How Did We Get Here And Where Are We Going
AI+ Now and Then How Did We Get Here And Where Are We Going
 
Thoughts on Biological Data Sustainability
Thoughts on Biological Data SustainabilityThoughts on Biological Data Sustainability
Thoughts on Biological Data Sustainability
 
What is FAIR Data and Who Needs It?
What is FAIR Data and Who Needs It?What is FAIR Data and Who Needs It?
What is FAIR Data and Who Needs It?
 
Data Science Meets Biomedicine, Does Anything Change
Data Science Meets Biomedicine, Does Anything ChangeData Science Meets Biomedicine, Does Anything Change
Data Science Meets Biomedicine, Does Anything Change
 
Data Science Meets Drug Discovery
Data Science Meets Drug DiscoveryData Science Meets Drug Discovery
Data Science Meets Drug Discovery
 
Biomedical Data Science: We Are Not Alone
Biomedical Data Science: We Are Not AloneBiomedical Data Science: We Are Not Alone
Biomedical Data Science: We Are Not Alone
 
BIMS7100-2023. Social Responsibility in Research
BIMS7100-2023. Social Responsibility in ResearchBIMS7100-2023. Social Responsibility in Research
BIMS7100-2023. Social Responsibility in Research
 
AI from the Perspective of a School of Data Science
AI from the Perspective of a School of Data ScienceAI from the Perspective of a School of Data Science
AI from the Perspective of a School of Data Science
 
What Data Science Will Mean to You - One Person's View
What Data Science Will Mean to You - One Person's ViewWhat Data Science Will Mean to You - One Person's View
What Data Science Will Mean to You - One Person's View
 
Novo Nordisk 080522.pptx
Novo Nordisk 080522.pptxNovo Nordisk 080522.pptx
Novo Nordisk 080522.pptx
 
Towards a US Open research Commons (ORC)
Towards a US Open research Commons (ORC)Towards a US Open research Commons (ORC)
Towards a US Open research Commons (ORC)
 
COVID and Precision Education
COVID and Precision EducationCOVID and Precision Education
COVID and Precision Education
 
One View of Data Science
One View of Data ScienceOne View of Data Science
One View of Data Science
 
Cancer Research Meets Data Science — What Can We Do Together?
Cancer Research Meets Data Science — What Can We Do Together?Cancer Research Meets Data Science — What Can We Do Together?
Cancer Research Meets Data Science — What Can We Do Together?
 
Data Science Meets Open Scholarship – What Comes Next?
Data Science Meets Open Scholarship – What Comes Next?Data Science Meets Open Scholarship – What Comes Next?
Data Science Meets Open Scholarship – What Comes Next?
 
Frontiers of Computing at the Cellular and Molecular Scales
Frontiers of Computing at the Cellular and Molecular ScalesFrontiers of Computing at the Cellular and Molecular Scales
Frontiers of Computing at the Cellular and Molecular Scales
 
Social Responsibility in Research
Social Responsibility in ResearchSocial Responsibility in Research
Social Responsibility in Research
 

Último

Beyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactBeyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactPECB
 
9548086042 for call girls in Indira Nagar with room service
9548086042  for call girls in Indira Nagar  with room service9548086042  for call girls in Indira Nagar  with room service
9548086042 for call girls in Indira Nagar with room servicediscovermytutordmt
 
Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..Disha Kariya
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introductionMaksud Ahmed
 
social pharmacy d-pharm 1st year by Pragati K. Mahajan
social pharmacy d-pharm 1st year by Pragati K. Mahajansocial pharmacy d-pharm 1st year by Pragati K. Mahajan
social pharmacy d-pharm 1st year by Pragati K. Mahajanpragatimahajan3
 
APM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across SectorsAPM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across SectorsAssociation for Project Management
 
Separation of Lanthanides/ Lanthanides and Actinides


The PDB An Exemplar for Data Science To Date, But What About the Future?

  • 1. The PDB An Exemplar for Data Science To Date, But What About the Future? Philip E. Bourne Ph.D. Associate Director for Data Science National Institutes of Health
  • 2. Background 6/12 2/14 3/14 • Findings: • Sharing data & software through catalogs • Support methods and applications development • Need more training • Hire CSIO • Continued support throughout the lifecycle http://acd.od.nih.gov/diwg.htm
  • 3. Motivation for This Talk Source Michael Bell http://homepages.cs.ncl.ac.uk/m.j.bell1/blog/?p=830
  • 5. The way we fund and operate biomedical databases will not scale. How do we keep the best features of today's resources but also respond to shrinking budgets and changes in the way we do science? Let's address this question using the PDB as an example.
  • 6. Disclaimer: This is NOT a talk about the PDB per se but a talk about data resources in general, using the PDB as an example since we are all familiar with it and it is considered an exemplar by most stakeholders.
  • 7. Good News: We Trust the PDB. Trust in the data is perhaps the PDB’s biggest achievement.
  • 8. Good News: Trust • Trust is like compound interest • Comes from listening • Comes from engaging the community in every aspect of the process • Comes from data consistency and level of annotation • Comes from responsiveness • Comes from the quality of the delivery service
  • 9. Good News/Bad News Re Data Quality • Good News: – If done right in the beginning 25% of the PDB’s budget could have been saved – Ontologies can work – Automation has reduced cost even as the amount of data has increased – Reproducibility is improved • Bad News: – Complex ontologies slow adoption – All data are created equal – Annotation is limited
  • 10. Good News/Bad News Re Community • Good News: – The community is engaged – The community has driven data sharing • Bad News: – The community does not reduce costs through active participation – There is insufficient reward for being part of the community e.g. as an annotator
  • 11. How we do science is changing. Do data resources including the PDB best serve the needs of the user at this point?
  • 12. How is Science Changing? • More interdisciplinary • More translational • More access to diverse data types • More computational • More collaborative
  • 13. Good News/Bad News for the PDB in this Changing Landscape • Bad News: – Interface complex and uni-data oriented – Data accessible; methods accessible (sort of); but not together – Significant redundancy in services offered • Good News: – Annotation! – Demand is increasing – Integrated with other data types – RESTful services
  • 14. General Problem Statement: How to ensure a high quality annotated data source that provides the optimal environment for accessibility and analysis by a broad community of diverse users?
  • 15. Okay, so what can the funders do to address a situation where the PDB is really the current best-case scenario?
  • 16. 1. Encourage more understanding of how existing data are used [Plot: Structure Summary page activity for H1N1 Influenza related structures, time axis Jan. 2008 to Jul. 2010. Series: 1RUZ: 1918 H1 Hemagglutinin; 3B7E: Neuraminidase of A/Brevig Mission/1/1918 H1N1 strain in complex with zanamivir. * http://www.cdc.gov/h1n1flu/estimates/April_March_13.htm] [Andreas Prlic]
  • 17. We Need to Learn from Industries Whose Livelihood Addresses the Question of Use
  • 18. 2. Address the Issue that Scholarship is Broken • I have a paper with 17,500 citations that no one has ever read • I have papers in PLOS ONE that have more citations than ones in PNAS • I have data sets I am proud of but few places to put them • I edited a journal but it did not count for much
  • 19. 3. Address the Reward System
  • 20. 4. Enable Reproducibility • Much of the research life cycle is now digital - encourage the reliability, accessibility, findability, usability of data, methods, narrative, publications etc. • How? • Data sharing plans • Standards frameworks • Data and software catalogs • PubMedCentral ? The Commons – PMC for the complete lifecycle ? Machine readable data sharing plans ? Small funding to communities ? Support for training and best practices in eScholarship
  • 21. 5. Establish The Commons • Public/private partnership • Work with ICs, NCBI and CIT to identify and run pilots – cloud, HPC centers • Port dbGaP to the cloud ? Experiment with new funding strategies • Evaluate
  • 22. Sustainability and Sharing: The Commons [Diagram; labels include: Data (The Long Tail; Core Facilities/HS Centers; Clinical/Patient); The Why: Data Sharing Plans; The How: Data Discovery Index, Sustainable Storage, Quality, Scientific Discovery, Usability, Security/Privacy; Commons == Extramural NCBI == Research Object Sandbox == Collaborative Environment; The End Game: Knowledge; Government; NIH Awardees; Private Sector; Rest of Academia; Metrics/Standards; Software Standards Index; BD2K Centers; Cloud, Research Objects]
  • 23. What Does the Commons Enable? • Dropbox like storage • The opportunity to apply quality metrics • Bring compute to the data • A place to collaborate • A place to discover http://100plus.com/wp-content/uploads/Data-Commons-3-1024x825.png
  • 24. The PDB in the Commons • Components: – Annotated collection of data files – APIs to access these data files – Example methods using these APIs • Potential outcomes – Nothing happens? – A new breed of developer starts to use PDB data in new ways? – The casual user has a broader set of services than previously? – Quality declines?
  • 25. Some Acknowledgements • Eric Green & Mark Guyer (NHGRI) • Jennie Larkin (NHLBI) • Leigh Finnegan (NHGRI) • Vivien Bonazzi (NHGRI) • Michelle Dunn (NCI) • Mike Huerta (NLM) • David Lipman (NLM) • Jim Ostell (NLM) • Andrea Norris (CIT) • Peter Lyster (NIGMS) • All of the more than 100 folks on the BD2K team
  • 26. NIH… Turning Discovery Into Health philip.bourne@nih.gov