Data Science &
BD2K Update
Philip Bourne, PhD, FACMI
Associate Director for Data Science
Advisory Committee to the NIH Director
June 10, 2016
http://datascience.nih.gov
Slides: http://www.slideshare.net/pebourne
Data Science Agenda
• What problems are we trying to solve?
• What are the solutions we are exploring?
• How does BD2K facilitate those solutions?
What Problems Are We
Trying to Solve?
• Data are extensive, complex and growing
• Data are in silos while science transcends
those silos
• Data are expensive to maintain and share
while demands for sharing are increasing
• There is an insufficient workforce with the
needed data analytical skills
• A collective (trans-NIH) solution is needed
Quantifying the Problem
• Big Data
– Total data from NIH-funded research currently
estimated at 650 PB*
– 20 PB of that is in NCBI/NLM (3%) and it is expected
to grow by 10 PB this year
• Dark Data
– Only 12% of data described in published papers is in
recognized archives – 88% is dark data^
• Cost
– 2007-2014: NIH spent ~$1.2Bn extramurally on
maintaining data archives
* In 2012 Library of Congress was 3 PB
^ http://www.ncbi.nlm.nih.gov/pubmed/26207759
*Note: Award data confirmed as of 03/2016. Some repositories are funded by hybrid mechanisms (e.g., grants-contracts, IAA-contracts, etc.)
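The figures on this slide can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers quoted above; the variable names are illustrative and do not come from the original deck.

```python
# Figures quoted on the slide (petabytes unless noted).
total_pb = 650          # estimated total data from NIH-funded research
ncbi_pb = 20            # portion held at NCBI/NLM
ncbi_growth_pb = 10     # expected NCBI/NLM growth this year
loc_2012_pb = 3         # Library of Congress in 2012

ncbi_share = ncbi_pb / total_pb              # fraction of the total at NCBI/NLM
ncbi_growth_rate = ncbi_growth_pb / ncbi_pb  # relative growth over one year
loc_multiple = total_pb / loc_2012_pb        # total vs. Library of Congress (2012)

archived_pct = 12                  # % of published data in recognized archives
dark_pct = 100 - archived_pct      # the remainder is "dark data"

print(f"NCBI/NLM share of total: {ncbi_share:.1%}")
print(f"NCBI/NLM expected growth: {ncbi_growth_rate:.0%}")
print(f"Multiple of Library of Congress (2012): {loc_multiple:.0f}x")
print(f"Dark data: {dark_pct}%")
```

The NCBI/NLM share comes out at roughly 3%, consistent with the slide, and the expected 10 PB increase would be a 50% rise on the current 20 PB.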
Biomedical Digital Data
Repository Survey by Institute
and Center (IC)
• Leadership meeting late in 2015 requested a
survey of IC approaches and plans for data
repositories
• Responses received from 18 ICs
• Clear challenges were identified
The Major Challenge
Encountered When Considering
Repository Funding
• Cost (6 ICs)
• Lack of Expertise (Within and Outside NIH) (3 ICs)
• Utility (3 ICs)
• Lack of Trans-NIH Guidance or Best Practices (2 ICs)
• Sustainability (6 ICs)
• Redundancy (4 ICs)
* Some ICs identified multiple challenges equally
What Solutions Are We
Exploring?
The Commons is one solution that
draws on experience with
cloud-based computing and is
being enabled by BD2K research
Examples of Cloud-Based
Initiatives
5 PB
40TB AWS
The Commons – The Internet of
Data
• Findable
• Accessible
• Interoperable
• Reusable
* http://www.ncbi.nlm.nih.gov/pubmed/26978244
The Commons offers a path forward to integrate
these discrete cloud-based initiatives using BD2K
developments to make data FAIR*
The internet started as discrete networks that
merged - the same could happen with data
Use Case:
Aggregate integrated data offers
the potential for new insights into
rare diseases …
As we get more precise, every disease becomes a rare disease
Diffuse Intrinsic Pontine
Gliomas (DIPG): In need of a
new data-driven approach
• Occur in 1 in 100,000
individuals
• Peak incidence 6-8 years
of age
• Median survival 9-12
months
• Surgery is not an option
• Chemotherapy ineffective
and radiotherapy benefit only
transient
From Adam Resnick
Timeline of Genomic Studies
in DIPG
• Landmark studies
identify histone mutations
as recurrent driver
mutations in DIPG ~2012
• Almost 3 years later, in
largely the same
datasets, only partially
expanded, the same two
groups and 2 others
identify ACVR1
mutations as a
secondary, co-occurring
mutation
From Adam Resnick
Hypothesis: The Commons
would have revealed ACVR1
• ACVR1 is a targetable kinase
• Inhibition of ACVR1 inhibited tumor
progression in vitro
• ~300 DIPG patients a year
• ~60 are predicted to have ACVR1
• If only large-scale data sets had been
integrated with TCGA and/or rare
disease data in 2012, ACVR1 mutations
would have been identified
• 60 patients/year × 3 years = 180
children’s lives (who likely succumbed to
the disease during that time) could have
been impacted if only data were FAIR
From Adam Resnick
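The estimate on this slide is simple arithmetic, reproduced below as a minimal sketch. It uses only the approximate figures quoted above; the variable names are illustrative.

```python
# Approximate figures quoted on the slide.
dipg_patients_per_year = 300    # ~300 DIPG patients a year
acvr1_patients_per_year = 60    # ~60 predicted to carry ACVR1 mutations
delay_years = 3                 # gap between the histone and ACVR1 findings

# Share of DIPG patients predicted to carry ACVR1 mutations.
acvr1_fraction = acvr1_patients_per_year / dipg_patients_per_year

# Patients diagnosed during the delay who might have been affected.
patients_affected = acvr1_patients_per_year * delay_years

print(f"ACVR1 share of DIPG patients: {acvr1_fraction:.0%}")
print(f"Patients over the {delay_years}-year delay: {patients_affected}")
```

Both inputs are rough estimates, so the result of 180 should be read as an order-of-magnitude figure rather than a precise count.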
A 3-Year BD2K-Sponsored
Commons Pilot Is Under Way
– Questions to be addressed:
• Does the ability to compute across very large
datasets lead to new discoveries?
• Are data and analytics more easily located and
shared and does this improve productivity?
• Is there an advantage to having the results of those
large calculations also available?
• Is research more reproducible?
• Is this environment more cost-effective than what
we do now?
Another use case…
Let’s review the Commons pilot
using the Model Organism
Databases (MODs) as an
example …
Example of the Problem:
The Model Organism Databases (MODs)
• Highly curated and
valuable data
• Siloed / Not
interoperable
• Cumbersome to
compute over all the
data
• Costly to maintain as
individual resources
NHGRI & NHLBI
Shared Research Objects
NCBI Intramural
Hybrid
Extramural
NCBI/NLM, Existing Coordinating Centers, CIT, ICs
Step 1: Data & Analytics
Moved to the Commons
• Moved as Commons-compliant shared
research objects, including:
– Identifiers
– Minimal metadata standards
CF
MOD Data
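One way to picture a shared research object that carries an identifier and minimal metadata is sketched below. This is an illustration only: the class, the field names, and the list of required metadata keys are assumptions made for the example, since the slides do not spell out the actual Commons conformance requirements.

```python
from dataclasses import dataclass, field


@dataclass
class SharedResearchObject:
    """Hypothetical record for a Commons-compliant object (illustrative only)."""
    identifier: str                  # persistent identifier for the object
    title: str
    source: str                      # e.g. the originating MOD
    metadata: dict = field(default_factory=dict)


# Assumed minimal metadata keys; the real standards are not listed in the slides.
REQUIRED_METADATA = ("description", "license", "format")


def missing_fields(obj: SharedResearchObject) -> list:
    """Return the names of required fields that are absent or empty."""
    missing = []
    if not obj.identifier.strip():
        missing.append("identifier")
    for key in REQUIRED_METADATA:
        if not obj.metadata.get(key):
            missing.append(key)
    return missing


example = SharedResearchObject(
    identifier="example:mod-0001",   # made-up identifier
    title="Curated gene annotations",
    source="hypothetical MOD",
    metadata={"description": "Curated annotations", "format": "GFF3"},
)
print(missing_fields(example))
```

A check like this could run when data are moved into the Commons, so that objects without an identifier or the agreed metadata are flagged before they are indexed.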
BD2K is Providing Those
Metadata Standards
NCBI/NLM, Existing Coordinating Centers, CIT, ICs
Services: APIs, Containers, Indexing
Software: Services & Tools
Scientific analysis tools/workflows
App store/User Interface
Step 2: Layers of Software &
Services Added
Shared Research Objects
NCBI Intramural
Hybrid
Extramural
DataMed is a “Find” Service
Developed by BD2K
MOD Data indexed
NCBI/NLM, Existing Coordinating Centers, CIT, ICs
Services: APIs, Containers, Indexing
Software: Services & Tools
Scientific analysis tools/workflows
App store/User Interface
Step 3: Commons Content Shared
While Maintaining Autonomous
Views
CF
IC IC IC CF
Shared Research Objects
NCBI Intramural
Hybrid
Extramural
MOD Data
View
BD2K Commons Pilot
Timeline
Project Year 1
FY 2015
Oct 2015 – Sep 2016
Project Year 2
FY 2016
Oct 2016 – Sep 2017
Project Year 3
FY 2017
Oct 2017 – Sep 2018
Step 0 Step 1 Step 2 Step 3
Step 0: Initiation
• Finalize conformance requirements
• Arrange initial providers
Step 1: ~5 initial projects, including Common Fund (we are here)
Step 2: ~50 projects
Step 3: Evaluation & next steps
The Major Challenge
Encountered When Considering
Repository Funding
• Cost (6 ICs)
• Lack of Expertise (Within and Outside NIH) (3 ICs)
• Utility (3 ICs)
• Lack of Trans-NIH Guidance or Best Practices (2 ICs)
• Sustainability (6 ICs)
• Redundancy (4 ICs)
* Some ICs identified multiple challenges equally
16 T32/T15 Predoctoral Training Programs
21 Postdoctoral and Faculty Career Awards
Enhancing Diversity
• Focus on low-resourced institutions
– Supports curriculum and faculty development
– Supports research experiences for
undergraduates
• Builds partnerships with BD2K Centers
Improving Data Science
Skills Among all Biomedical
Scientists
24 awards
1 award
The Role of BD2K
1. Commons
– Resource Indexing
– Standards
– Cloud & HPC
– Sustainability
2. Data Science Research
– Centers
– Software Analysis & Methods
3. Training & Workforce Development
NIH… Turning Discovery Into Health
philip.bourne@nih.gov
https://datascience.nih.gov/
Data Science Events at NIH
• Pi Day
– 2016 Lecture by Carlos Bustamante
– Poster Session with Pies
– PiCo Lightning Talks
– Pi Day Scholars: outreach to high schools
• Workshop: Reproducible Research
• Lecture Series: Distinguished and Frontiers in Data Science
• Data Science Courses
– Machine learning
• Hackathons

 
Choosing the Right CBSE School A Comprehensive Guide for Parents
Choosing the Right CBSE School A Comprehensive Guide for ParentsChoosing the Right CBSE School A Comprehensive Guide for Parents
Choosing the Right CBSE School A Comprehensive Guide for Parents
 
Barangay Council for the Protection of Children (BCPC) Orientation.pptx
Barangay Council for the Protection of Children (BCPC) Orientation.pptxBarangay Council for the Protection of Children (BCPC) Orientation.pptx
Barangay Council for the Protection of Children (BCPC) Orientation.pptx
 
Procuring digital preservation CAN be quick and painless with our new dynamic...
Procuring digital preservation CAN be quick and painless with our new dynamic...Procuring digital preservation CAN be quick and painless with our new dynamic...
Procuring digital preservation CAN be quick and painless with our new dynamic...
 
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdfLike-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
 
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
 
Global Lehigh Strategic Initiatives (without descriptions)
Global Lehigh Strategic Initiatives (without descriptions)Global Lehigh Strategic Initiatives (without descriptions)
Global Lehigh Strategic Initiatives (without descriptions)
 
How to Add Barcode on PDF Report in Odoo 17
How to Add Barcode on PDF Report in Odoo 17How to Add Barcode on PDF Report in Odoo 17
How to Add Barcode on PDF Report in Odoo 17
 
Proudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxProudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptx
 
AUDIENCE THEORY -CULTIVATION THEORY - GERBNER.pptx
AUDIENCE THEORY -CULTIVATION THEORY -  GERBNER.pptxAUDIENCE THEORY -CULTIVATION THEORY -  GERBNER.pptx
AUDIENCE THEORY -CULTIVATION THEORY - GERBNER.pptx
 
FILIPINO PSYCHology sikolohiyang pilipino
FILIPINO PSYCHology sikolohiyang pilipinoFILIPINO PSYCHology sikolohiyang pilipino
FILIPINO PSYCHology sikolohiyang pilipino
 
ENGLISH6-Q4-W3.pptxqurter our high choom
ENGLISH6-Q4-W3.pptxqurter our high choomENGLISH6-Q4-W3.pptxqurter our high choom
ENGLISH6-Q4-W3.pptxqurter our high choom
 
Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)
 
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
 
Judging the Relevance and worth of ideas part 2.pptx
Judging the Relevance  and worth of ideas part 2.pptxJudging the Relevance  and worth of ideas part 2.pptx
Judging the Relevance and worth of ideas part 2.pptx
 
ISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITY
ISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITYISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITY
ISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITY
 
4.16.24 21st Century Movements for Black Lives.pptx
4.16.24 21st Century Movements for Black Lives.pptx4.16.24 21st Century Movements for Black Lives.pptx
4.16.24 21st Century Movements for Black Lives.pptx
 
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptxINTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
 

Data Science BD2K Update for NIH

  • 1. Data Science & BD2K Update Philip Bourne, PhD, FACMI Associate Director for Data Science Advisory Committee to the NIH Director June 10, 2016 http://datascience.nih.gov Slides: http://www.slideshare.net/pebourne
  • 2. Data Science Agenda • What problems are we trying to solve? • What are the solutions we are exploring? • How does BD2K facilitate those solutions?
  • 3. What Problems Are We Trying to Solve? • Data are extensive, complex and growing • Data are in silos while science transcends those silos • Data are expensive to maintain and share while demands for sharing are increasing • There is an insufficient workforce with the needed data analytical skills • A collective (trans NIH) solution
  • 4. Quantifying the Problem • Big Data – Total data from NIH-funded research currently estimated at 650 PB* – 20 PB of that is in NCBI/NLM (3%) and it is expected to grow by 10 PB this year • Dark Data – Only 12% of data described in published papers is in recognized archives – 88% is dark data^ • Cost – 2007-2014: NIH spent ~$1.2Bn extramurally on maintaining data archives * In 2012 Library of Congress was 3 PB ^ http://www.ncbi.nlm.nih.gov/pubmed/26207759
  • 5. *Note: Award data confirmed as of 03/2016. Some repositories funded by hybrid mechanisms (e.g., grants-contracts, IAA-contracts, etc.)
  • 6. Biomedical Digital Data Repository Survey by Institute and Center (IC) • Leadership meeting late in 2015 requested a survey of IC approaches and plans for data repositories • Responses received from 18 ICs • Clear challenges were identified
  • 7. The Major Challenge Encountered When Considering Repository Funding • Cost (6 ICs) • Lack of Expertise (Within and Outside NIH) (3 ICs) • Utility (3 ICs) • Lack of Trans-NIH Guidance or Best Practices (2 ICs) • Sustainability (6 ICs) • Redundancy (4 ICs) * Some ICs identified multiple challenges equally
  • 8. What Solutions Are We Exploring? The Commons is one solution that leverages the experiences in cloud-based computing and is being enabled by BD2K research
  • 9. Examples of Cloud Based Initiatives 5 PB 40TB AWS
  • 10. The Commons – The Internet of Data • Findable • Accessible • Interoperable • Reusable * http://www.ncbi.nlm.nih.gov/pubmed/26978244 The Commons offers a path forward to integrate these discrete cloud-based initiatives using BD2K developments to make data FAIR* The internet started as discrete networks that merged - the same could happen with data
  • 11. Use Case: Aggregate integrated data offers the potential for new insights into rare diseases … As we get more precise every disease becomes a rare disease
  • 12. Diffuse Intrinsic Pontine Gliomas (DIPG): In need of a new data-driven approach • Occur in 1:100,000 individuals • Peak incidence 6-8 years of age • Median survival 9-12 months • Surgery is not an option • Chemotherapy ineffective and radiotherapy only transient From Adam Resnick
  • 13. Timeline of Genomic Studies in DIPG • Landmark studies identify histone mutations as recurrent driver mutations in DIPG ~2012 • Almost 3 years later, in largely the same datasets, but partially expanded, the same two groups and 2 others identify ACVR1 mutations as a secondary, co-occurring mutation From Adam Resnick
  • 14. Hypothesis: The Commons would have revealed ACVR1 • ACVR1 is a targetable kinase • Inhibition of ACVR1 inhibited tumor progression in vitro • ~300 DIPG patients a year • ~60 are predicted to have ACVR1 • If large scale data sets were only integrated with TCGA and/or rare disease data in 2012, ACVR1 mutations would have been identified • 60 patients/year X 3 years = 180 children’s lives (who likely succumbed to the disease during that time) could have been impacted if only data were FAIR From Adam Resnick
  • 15. A 3 Year BD2K Sponsored Commons Pilot is Under Way – Questions to be addressed: • Does the ability to compute across very large datasets lead to new discoveries? • Are data and analytics more easily located and shared and does this improve productivity? • Is there an advantage to have the results of those large calculations also available? • Is research more reproducible? • Is this environment more cost-effective than what we do now?
  • 16. Another use case… Let’s review the Commons pilot using the Model Organism Databases (MODs) as an example …
  • 17. Example of the Problem: The Model Organism Databases (MODs) • Highly curated and valuable data • Siloed / Not interoperable • Cumbersome to compute over all the data • Costly to maintain as individual resources NHGRI & NHLBI
  • 18. Step 1: Data & Analytics Moved to the Commons • Moved as Commons compliant shared research objects, including: – Identifiers – Minimal metadata standards [Diagram labels: Shared Research Objects; NCBI; Intramural; Hybrid; Extramural; NCBI/NLM; Existing Coordinating Centers; CIT; ICs; CF; MOD Data]
  • 19. BD2K is Providing Those Metadata Standards
  • 20. Step 2: Layers of Software & Services Added [Diagram labels: Services: APIs, Containers, Indexing; Software: Services & Tools; scientific analysis tools/workflows; App store/User Interface; Shared Research Objects; NCBI/NLM; Existing Coordinating Centers; CIT; ICs; NCBI; Intramural; Hybrid; Extramural]
  • 21. DataMed is a “Find” Service Developed by BD2K MOD Data indexed
  • 22. Step 3: Commons Content Shared While Maintaining Autonomous Views [Diagram labels: Services: APIs, Containers, Indexing; Software: Services & Tools; scientific analysis tools/workflows; App store/User Interface; Shared Research Objects; NCBI/NLM; Existing Coordinating Centers; CIT; ICs; CF; IC; NCBI; Intramural; Hybrid; Extramural; MOD Data View]
  • 23. BD2K Commons Pilot Timeline • Project Year 1: FY 2015, Oct 2015 – Sep 2016 • Project Year 2: FY 2016, Oct 2016 – Sep 2017 • Project Year 3: FY 2017, Oct 2017 – Sep 2018 • Step 0: Initiation – Finalize conformance requirements – Arrange initial providers • Step 1: ~5 Initial Projects including Common Fund • We Are Here • Step 2: ~50 projects • Step 3: Evaluation & Next Steps
  • 24. The Major Challenge Encountered When Considering Repository Funding • Cost (6 ICs) • Lack of Expertise (Within and Outside NIH) (3 ICs) • Utility (3 ICs) • Lack of Trans-NIH Guidance or Best Practices (2 ICs) • Sustainability (6 ICs) • Redundancy (4 ICs) * Some ICs identified multiple challenges equally
  • 26. Enhancing Diversity • Focus on low-resourced institutions – Supports curriculum and faculty development – Supports research experiences for undergraduates • Builds partnerships with BD2K Centers
  • 27. Improving Data Science Skills Among all Biomedical Scientists 24 awards 1 award
  • 28. The Role of BD2K 1. Commons – Resource Indexing – Standards – Cloud & HPC – Sustainability 2. Data Science Research – Centers – Software Analysis & Methods 3. Training & Workforce Development
  • 29. NIH… Turning Discovery Into Health philip.bourne@nih.gov https://datascience.nih.gov/
  • 30. Data Science Events at NIH • Pi Day • 2016 Lecture by Carlos Bustamante • Poster Session with Pies • PiCo Lightning Talks • Pi Day Scholars: outreach to high schools • Workshop: Reproducible Research • Lecture Series: Distinguished and Frontiers in Data Science • Data Science Courses • Machine learning • Hackathons
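
The percentages and totals quoted on slides 4 and 14 follow from simple arithmetic. The sketch below re-derives them in Python. The input figures are taken from the slides; the variable names are illustrative only, and the 20% ACVR1 fraction is not stated on the slides but is implied by the ~60 of ~300 figure.

```python
# Figures quoted on the slides; variable names are illustrative assumptions.
total_nih_data_pb = 650        # slide 4: estimated total data from NIH-funded research (PB)
ncbi_nlm_data_pb = 20          # slide 4: portion held at NCBI/NLM (PB)
dipg_patients_per_year = 300   # slide 14: approximate DIPG patients per year
acvr1_patients_per_year = 60   # slide 14: patients predicted to have ACVR1 mutations
years_between_studies = 3      # slides 13-14: gap between the two sets of studies

# Share of NIH-funded data held at NCBI/NLM (slide 4 rounds this to 3%).
ncbi_share_pct = 100 * ncbi_nlm_data_pb / total_nih_data_pb

# Fraction of DIPG patients predicted to carry ACVR1 mutations (implied, not quoted).
acvr1_fraction_pct = 100 * acvr1_patients_per_year / dipg_patients_per_year

# Slide 14: 60 patients/year x 3 years.
children_potentially_impacted = acvr1_patients_per_year * years_between_studies

print(f"NCBI/NLM share of NIH data: {ncbi_share_pct:.1f}%")
print(f"DIPG patients with ACVR1 (implied): {acvr1_fraction_pct:.0f}%")
print(f"Children potentially impacted over {years_between_studies} years: "
      f"{children_potentially_impacted}")
```

The inputs are the rounded estimates from the slides, so the results are order-of-magnitude checks rather than precise counts.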

Editor's Notes

  1. Holdren memo GDS policy
  2. $1.25bn per year to capture all data. After a significant effort at reduction, intramural data are still spread across more than 60 data centers; imagine the extramural situation.
  3. 30-40 of the 777 obligated awards were co-funded, as opposed to funded solely by one IC. Supplemental slides provide a further breakdown. Say ‘cannot be business as usual and that we must transition to a new model’.
  4. Utility – see as a value proposition.
  5. Goal is to be able to easily draw upon data across these initiatives. Internet of data will see these individual initiatives merge.
  6. Other federal agencies are adopting this model and the EU is investing 6 bn Euros based on this model.
  7. We are currently in the pilot phase with small amounts of data/analytics being migrated.
  8. CEDAR is developing metadata templates. SCC coordinates standards to address such questions as “Is there a standard for data type x?” Working groups are spontaneous efforts between the centers to develop and share developments, including standards.
  9. The Commons enables data sharing across ICs but also enables them to have an autonomous view on the data and services they provide.
  10. Utility – see as a value proposition.
  11. The biomedical data science pipeline is centered around the T32/T15 Predoctoral Training Programs. We are funding 16, each with 6 trainees, across the country. Over a quarter of the training budget is going to this program. Postdocs and beyond specialize in biomedical data science through mentored protected time for career development.
  12. Educational resource discovery index uses data science to help s learn about data science. Automatically discovers training resources using information extraction (this is an international collaboration). Organizes training resources through manual curation and resource modeling. Personalizes recommendations through predictive modeling (future work).