A Successful Academic Medical Center Must be a Truly Digital Enterprise
Philip E. Bourne, PhD, FACMI
Associate Director for Data Science
National Institutes of Health
Nina Matheson Lecture
AAMC November 7, 2015
Available on Slideshare
What is My Job?
Change the Culture of NIH
What Do I Do Next Week?
The NIH Data Timeline
[Timeline figure; marked dates: 6/12, 2/14, 3/14, 11/15]
• Recommendations:
• Sharing data & software through catalogs
• Support methods and applications development
• Need more training
• Need campus-wide IT strategy
• Hire CSIO
• Continued support throughout the lifecycle
A Question I Ask Myself a Lot…
Are we at a point of deception, soon to
see a major disruption to our
institutions?
Some Folks Think So…
Evidence:
– Google car
– 3D printers
– Waze
– Robotics
– Sensors
From: The Second Machine Age: Work, Progress,
and Prosperity in a Time of Brilliant Technologies
by Erik Brynjolfsson & Andrew McAfee
We Are At a Point of Deception
The 6D Exponential Framework
1. Digitization: basic & clinical research & EHRs
2. Deception: we are here
3. Disruption
4. Demonetization
5. Dematerialization
6. Democratization: open science, patient centered health care
For Academic Medical Centers, What
Are the Implications of Such a Future?
Opportunities exist to improve the efficiency and
value of the enterprise
Open, collaborative science becomes increasingly
important
Data and associated analytics become increasingly
valuable to scholarship
Current training content and modalities will not match
supply to demand
Balancing accessibility vs. security becomes more
important yet more complex
Hypothetical Example of That Value
Jane scores extremely well in parts of her graduate on-line neurology class.
Neurology professors, whose research profiles are on-line and well described, are
automatically notified of Jane’s potential based on a computer analysis of her scores
against the background interests of the neuroscience professors. Consequently,
Professor Smith interviews Jane and offers her a research rotation. During the
rotation she enters details of her experiments related to understanding a widespread
neurodegenerative disease in an on-line laboratory notebook kept in a shared on-line
research space – an institutional resource where stakeholders provide metadata,
including access rights and provenance beyond that available in a commercial
offering. According to Jane’s preferences, the underlying computer system may
automatically bring to Jane’s attention Jack, a graduate student in the chemistry
department whose notebook reveals he is working on using bacteria for purposes of
toxic waste cleanup. Why the connection? They reference the same gene a number
of times in their notes, which is of interest to two very different disciplines – neurology
and environmental sciences. In the analog academic health center they would never
have discovered each other, but thanks to the Digital Enterprise, pooled knowledge
can lead to a distinct advantage. The collaboration results in the discovery of a
homologous human gene product as a putative target in treating the
neurodegenerative disorder. A new chemical entity is developed and patented.
Accordingly, by automatically matching details of the innovation with biotech
companies worldwide that might have potential interest, a licensee is found. The
licensee hires Jack to continue working on the project. Jane joins Joe’s laboratory,
and he hires another student using the revenue from the license. The research
continues and leads to a federal grant award. The students are employed, further
research is supported and in time societal benefit arises from the technology.
From: What Big Data Means to Me. JAMIA 2014;21:194
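The matching step in the scenario above (two notebooks flagged because they reference the same gene a number of times) can be sketched as follows. This is a minimal illustration, not a description of any existing system: the notebook texts, the author names beyond Jane and Jack, the gene symbols (GENEA, GENEB, GENEC), the function names, and the `min_mentions` threshold are all hypothetical.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical controlled vocabulary of gene symbols (illustrative only).
GENE_VOCABULARY = {"GENEA", "GENEB", "GENEC"}

# Hypothetical notebook entries keyed by author (illustrative only).
NOTEBOOKS = {
    "Jane": "GENEA knockdown reduced aggregation. Repeated the GENEA assay; GENEB used as control.",
    "Jack": "Strain expressing GENEA degrades the solvent. GENEA homolog cloned; GENEC is inactive.",
    "Ana": "GENEC expression in yeast. GENEC again.",
}


def gene_counts(text, vocabulary):
    """Count how often each vocabulary gene symbol appears in the text."""
    tokens = re.findall(r"[A-Za-z0-9]+", text)
    return Counter(token for token in tokens if token in vocabulary)


def suggest_connections(notebooks, vocabulary, min_mentions=2):
    """Return (author_a, author_b, shared_genes) for every pair of notebooks
    that each mention the same gene at least min_mentions times."""
    counts = {author: gene_counts(text, vocabulary) for author, text in notebooks.items()}
    suggestions = []
    for a, b in combinations(notebooks, 2):
        shared = sorted(
            gene
            for gene in vocabulary
            if counts[a][gene] >= min_mentions and counts[b][gene] >= min_mentions
        )
        if shared:
            suggestions.append((a, b, shared))
    return suggestions


connections = suggest_connections(NOTEBOOKS, GENE_VOCABULARY)
print(connections)
```

With the sample data, only Jane and Jack are paired, because both mention GENEA twice. A real system would also need to respect the access rights and preferences the scenario describes before surfacing a match.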
How to Get There?
Recognize an institution's
assets are increasingly digital
Recognize the value of those
assets
Recognize that those assets
are siloed
Put in place a governance,
financial and infrastructure
model that breaks down
those silos while maintaining
community trust
That is, protect the integrity
of the assets
NIH Genomic Data Sharing (GDS)
Policy
Purpose
– Sets forth expectations and responsibilities that ensure broad,
responsible sharing of genomic research data in a timely
manner
Scope
– All NIH-funded research generating large-scale human or
non-human genomic data – and their use for subsequent
research
• Data to be submitted to NIH-designated data repositories
(e.g., dbGaP, GEO, GenBank, WormBase, FlyBase, Rat
Genome Database)
– Applies to all funding mechanisms (grants, contracts,
intramural support) with no minimum threshold for cost
Released August 2014; effective January 25, 2015
gds.nih.gov
Other Areas I Hope the SDC Will
Address
Sharing of other data types
Machine readable data sharing plans
Data citation
“The HGP changed the norms around data sharing
in biomedical research.”
Data Sharing Goes Global: GA4GH
Global Alliance for Genomics and
Health
Accelerating the potential of genomic medicine to
advance human health, by:
– Establishing a common framework of approaches to enable
effective, responsible sharing of genomic and clinical data
– Catalyzing data sharing projects that drive and demonstrate
value of data sharing
Alliance*: >350 leading institutions (healthcare, research,
advocacy, life science, IT) representing 35 countries
Working groups (Clinical, Data, Security, Regulatory &
Ethics) assess, prioritize needs
– Form task teams to produce tools, solutions, demonstration
projects
*Statistics as of October 5, 2015
A Culture of Sharing
[Timeline figure]
1999: Research Tools Policy
2003: NIH Data Sharing Policy
2004: Model Organism Policy
2007: Genome-wide Association (GWAS) Policy
2008: NIH Public Access Policy (Publications)
2012: Big Data to Knowledge (BD2K) Initiative
2013: White House Initiative (“Holdren Memo”)
2014: Genomic Data Sharing (GDS) Policy; Modernization of NIH Clinical Trials
Guiding Principle of NIH GWAS Policy
The greatest public benefit will be
realized if data from GWAS are made
available, under terms and conditions
consistent with the informed consent
provided by individual participants, in a
timely manner to the largest possible
number of investigators.
NIH expected that data would be shared through the
NIH database of Genotypes and Phenotypes (dbGaP)
NIH Public Access Policy for Publications
Ensures public access to published results of all
research funded by NIH since 2008
– Recipients of NIH funds required to submit final peer-
reviewed journal manuscripts to PubMed Central (PMC)
upon acceptance for publication
– Papers must be accessible to the public on PMC no later
than 12 months after publication
Harnessing Data to Improve Health:
BD2K (Big Data to Knowledge)
NIH’s 6-year initiative to use data science to foster an
open digital ecosystem that will accelerate efficient,
cost-effective biomedical research to enhance health,
lengthen life, and reduce illness and disability
Programs and activities:
Advance discovery for biomedical research
Facilitate use and re-use of biomedical data
Develop analytical methods and software
Enhance biomedical data science training
The Commons
Digital Object Compliance: FAIR
Attributes of digital objects in the Commons
– Initial Phase
• Unique digital object identifiers of some type
• A minimal set of searchable metadata
• Physically available in a cloud-based Commons provider
• Clear access rules (especially important for human subjects data)
• An entry (with metadata) in one or more indices
– Future Phases
• Standard, community-based unique digital object identifiers
• Conform to community-approved standard metadata for enhanced
searching
• Digital objects accessible via open standard APIs
• Are physically and logically available to the Commons
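The initial-phase attributes listed above can be read as a simple checklist applied to each digital object. The sketch below shows one way such a check might look. It is an illustration only: the `DigitalObject` class, its field names, and the example values are assumptions and do not reflect any actual Commons schema or API.

```python
from dataclasses import dataclass, field


@dataclass
class DigitalObject:
    """Hypothetical record for a digital object; all field names are assumed."""
    identifier: str = ""                                # unique identifier of some type
    metadata: dict = field(default_factory=dict)        # minimal searchable metadata
    cloud_location: str = ""                            # where the object physically lives
    access_rule: str = ""                               # e.g. "open" or "controlled"
    index_entries: list = field(default_factory=list)   # indices that list the object


def missing_initial_phase_attributes(obj):
    """Return the names of the initial-phase attributes the object lacks."""
    checks = {
        "unique identifier": bool(obj.identifier),
        "searchable metadata": bool(obj.metadata),
        "cloud location": bool(obj.cloud_location),
        "access rule": bool(obj.access_rule),
        "index entry": bool(obj.index_entries),
    }
    return [name for name, present in checks.items() if not present]


# Illustrative object that has not yet been placed in the cloud or indexed.
example = DigitalObject(
    identifier="example-id-0001",
    metadata={"title": "Example dataset"},
    access_rule="controlled",
)
missing = missing_initial_phase_attributes(example)
print(missing)
```

For the example object, the check reports the cloud location and index entry as missing. The future-phase attributes (community-standard identifiers and metadata, open APIs) would need checks against external standards and are not covered here.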
BD2K and Clinical Data Science Research
BD2K Centers of Excellence for Big Data
Computing
BD2K Targeted Software Topics
Challenges and Prizes
1. NIH-NSF IDEAS Lab
• Promotes New Collaborations
• Round 1 on Precision Medicine (August 2015), round 2 in
planning.
2. BD2K-Wellcome Trust-HHMI Open Science Prize
• Prize competition announced October 20, 2015.
• Supports development of technology platforms and tools that
make open biomedical data more discoverable, accessible,
analyzable, and citable
BD2K Targeted Software Topics
Supports innovative analytical methods and software tools
that address critical current and emerging needs of
biomedical research
2015 Topics (18 awards, U01s)
– Data Compression
– Data Provenance
– Data Visualization
– Data Wrangling
2016 Topics (U01s, under review)
– Data Privacy
– Data Repurposing
– Applying Metadata
– 2016: Crowdsourcing and interactive Digital Media
(UH2)
The Problem Statement
Govern when, how, and by whom
digital research objects may be
accessed, in accordance with the
wishes of the owner and/or the laws
and policies that define accessibility
The Landscape
The Holdren Memo
Revisions to the Common Rule
Meaningful Use
Centralized IRBs
….
“And that’s why we’re here today. Because something
called precision medicine … gives us one of the greatest
opportunities for new medical breakthroughs that we
have ever seen.”
President Barack Obama
January 30, 2015
An Example of That Promise:
Comorbidity Network for 6.2M Danes
Over 14.9 Years
Jensen et al 2014 Nat Comm 5:4022
I not only use all the brains
I have, but all I can borrow.
– Woodrow Wilson
NIH…
Turning Discovery Into Health
philip.bourne@nih.gov
https://datascience.nih.gov/
http://www.ncbi.nlm.nih.gov/research/staff/bourne/
Modernizing NIH Clinical Trials
Activities:
The Need
NIH-Funded trials published within 100 months of
completion
Less than 50% published within 30 months of completion
BMJ 2012;344:d7292
Increasing Clinical Trial Transparency
Proposed November 2014; Final Spring 2016 (est.)
Notice of Proposed Rulemaking: Clinical Trials
Registration and Results Submission (FDAAA, Section
801)
– Further implements statutory requirements on private and
public sponsors to register and report results for phase 2, 3,
and 4 trials
– Includes drugs, biologics, and devices (except small
feasibility studies)
Draft NIH Policy on Clinical Trial Information
Dissemination
– Extends Section 801 requirements to all NIH-funded clinical
trials
– Includes phase 1 trials and trials of non-FDA regulated
interventions such as behavioral trials
Why Revisions to the Common Rule
The current rule:
– is not sufficiently risk-based, resulting in both over- and
under-regulation of research activities;
– is not tailored to new and emerging areas of research,
including social and behavioral research and research
involving the collection and use of genetic information;
– may not effectively inform subjects of psychological,
informational, or privacy risks;
– does not adequately account for the needs of a “learning”
health-care system for continual quality improvement;
and
– provides insufficient mechanisms to ensure the
consistency, quality, and accountability of IRB decision-making.
Reference: Infectious Diseases Society of America. Grinding to a halt:
The effects of the increasing regulatory burden on
research and quality improvement efforts.
Editor's Notes
“As biology’s first large-scale project, the HGP paved the way for numerous consortium-based research ventures. The NHGRI alone has been involved in launching more than 25 such projects since 2000. These have presented new challenges to biomedical research — demanding, for instance, that diverse groups from different countries and disciplines come together to share and analyse vast data sets.”
“The HGP changed the norms around data sharing in biomedical research.”
2013 White House Initiative: “Increasing Access to the Results of Federally Funded Scientific Research”
Updated to include numbers through September 2015.
From Dina Paltoo [10/6/15]: “The data in the first slide is for all of dbGaP 2007-2014. The information came from a version of what is on the GDS website (https://gds.nih.gov/19dataaccesscommitteereview_dbGaP.html) and in a Nature Genetics paper (http://www.nature.com/ng/journal/v46/n9/full/ng.3062.html), but results from information that we receive from NCBI.”
The NIH Public Access Policy implements Division F Section 217 of PL 111-8 (Omnibus Appropriations Act, 2009).
http://publicaccess.nih.gov/policy.htm
OSP’s summary:
The NIH Public Access Policy for publications has been a requirement for all recipients of NIH funds since 2008. It implements Division G, Title II, Section 218 of PL 110-161 (Consolidated Appropriations Act, 2008). The NIH Public Access Policy ensures that the public has access to the published results of NIH-funded research. It requires scientists to submit final peer-reviewed journal manuscripts that arise from NIH funds to the digital archive PubMed Central (PMC) upon acceptance for publication. Scientists can also deposit papers through partnerships NIH has established with publishers. To help advance science and improve human health, the Policy requires that NIH-supported papers be accessible to the public on PMC no later than 12 months after publication.
Updated by ADDS group 8/25/15
Short term: produce a searchable catalog of physical and virtual courses; fund diversity awards to work with BD2K Centers; expand IRP training started Jan 2015, e.g. Software Carpentry and train-the-trainers
Long term: evaluation
Photos: FC tweet; RK screen grab
16 million hospital inpatient events (24.5% of total), 35 million outpatient clinic events (53.6% of total) and 14 million emergency department events (21.9% of total)
Figure 2. Cumulative percentage of studies published in a peer reviewed biomedical journal indexed by Medline during 100 months after trial completion among all NIH funded clinical trials registered within ClinicalTrials.gov
Public benefits to clinical trials data-sharing (OSP):
Inform future research and research funding decisions
Mitigate bias (e.g., non publication of results, especially negative results)
Prevent duplication of unsafe trials
Meet ethical obligation to human subjects (i.e., that results inform science)
Increase access to data about marketed products
All contribute to public trust in clinical research
Source: Ross JS, Tse T, Zarin DA, Xu H, Zhou L, Krumholz HM. Publication of NIH funded trials registered in ClinicalTrials.gov: cross-sectional analysis. BMJ 2012;344:d7292.
Text updated by Sarah Carr [10/7/2015] – also changed order to feature NPRM before Draft NIH Policy.
Nearly 900 comments received on the NPRM, many simply stating broad support
Final Rule expected Spring 2016
Section 801 of the Food and Drug Administration Amendments Act (FDAAA)