Drivers for data sharing in the funding of biomedical research. The importance of data sharing for open science, innovation, and reproducibility, as enabled by digital technologies and data science.
1. Data Sharing in the Digital Age
Warren A. Kibbe, Ph.D.
Professor, Biostats & Bioinformatics
Chief Data Officer, Duke Cancer Institute
warren.kibbe@duke.edu
@wakibbe #DataSharing
2. Take homes
• 4th Industrial Revolution is here
• Biomedical research and medicine are
a data enterprise
• Data sharing stimulates insights,
discoveries, translation, therapies
3. Science has evolved
• The role of societies and journals
has slowly evolved during the past
150 years
• Our ability to generate data has
rapidly evolved in the digital age
• Sharing knowledge, primary data,
observations is best done outside the
current publication process
4. How did we get here?
• A slight digression taking a narrow
slice of history, technology, and
science
6. First Industrial Revolution
Humans having access to cheap energy
to do work (steam)
William M. Connolley - Picture of the "Puffing Billy" steam
engine taken in the Science Museum. on 2004/03/13.
https://en.wikipedia.org/wiki/Steam_engine
taken May 2018
7. First Industrial Revolution
In addition to changing manufacturing
and transportation, steam changed
printing
Meggs, Philip B. A History of Graphic Design. John Wiley & Sons, Inc. 1998. (p 132)
8. Scientific Publication 1.0
Journals, primarily
the output of
scientific societies,
began in the 1860s.
Peer review became
the norm after WWII
First issue of Nature, 1869
9. Second Industrial Revolution
Mass production, better materials
(steel) and manufacturing, distribution
of energy using electricity & petroleum
Robert Friedrich Stieler (1847–1908) - alte Postkarte, https://www.basf.com/de/company/about-us/history/1865-1901.html
10. Third Industrial Revolution
The Digital Revolution
2008-03-19 21:41 Transistor from Wikipedia
10 August 2016 Thomas Nguyen - Wikipedia
Mike1024 from wikipedia - University of
Warwick 2006
11. Fourth Industrial Revolution
Data and Connectivity
• Communications
• Connectivity
• Ubiquitous
• Pervasive
• Internet of Things
• Embedded Sensors
• Process Automation
• Cloud Computing
Mass access to data generation, processing, visualization
13. Scientific publication 2.0
• Journals are fully digital
• Workflow automated
• Open dissemination after embargo
• Open Access should be required
• Datasets and analysis must be available
for review and validation
• Data sharing of primary data should be
the norm for precompetitive research
14. Scientific publication 2.0
• Observations (data!) are
accumulating at a rapid pace
• Insights, discoveries, inventions,
information, analytics, and
knowledge all emerge from evidence
(data)
• IMO - We need to separate data
sharing from knowledge sharing
15. Validation and Semantic Harmonization
• Validation and Semantic
Harmonization of primary and
secondary data are crucial to reuse,
reproducibility, and interoperability.
Ideally these are part of the data
sharing process.
16. Data Sharing and the FAIR Principles
FAIR –
Making data
Findable,
Accessible,
Attributable,
Interoperable,
Reusable,
and providing Recognition
Force11 white paper
https://www.force11.org/group/fairgroup/fairprinciples
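One way to make the properties listed on this slide concrete is to check a dataset's metadata record against them. The sketch below is illustrative only: the `DatasetRecord` class, its field names, and the mapping from each property to a single field are assumptions made for this example, not part of the Force11 white paper or of any existing tool.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetRecord:
    """Hypothetical metadata record for a shared dataset."""
    identifier: str = ""    # persistent ID such as a DOI (Findable)
    access_url: str = ""    # where the data can be retrieved (Accessible)
    creators: list = field(default_factory=list)  # who gets credit (Attributable)
    data_format: str = ""   # open, documented format (Interoperable)
    license: str = ""       # terms of reuse (Reusable)


# Assumed mapping from each property on the slide to one supporting field.
CHECKS = {
    "Findable": "identifier",
    "Accessible": "access_url",
    "Attributable": "creators",
    "Interoperable": "data_format",
    "Reusable": "license",
}


def missing_properties(record):
    """Return the properties whose supporting field is empty."""
    return [name for name, attr in CHECKS.items() if not getattr(record, attr)]


# Example: a record with only an identifier and a creator filled in.
record = DatasetRecord(identifier="doi:10.0000/example", creators=["A. Researcher"])
print(missing_properties(record))  # ['Accessible', 'Interoperable', 'Reusable']
```

A real assessment would need richer criteria than "field is non-empty"; the point is only that each property can be tied to something a repository can record and check.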
17. Data Sharing Index
• We need metrics for data, software,
algorithm use, usability, conformance
• Data sharing stimulates science,
innovation, commercialization
• Providing recognition and attribution
to data providers and software &
algorithm builders is critical for a
robust data sharing ecosystem
The printing press enabled social change through widely disseminating print – this increased the value of literacy and made censorship of information harder.
For the first time, people living in a society had access to much more than the work they could do with their own muscles or with domesticated animals. This opened up the possibility of creating machines and machinery at a very different scale. It also started to improve transportation.
Semiconductors, VLSI, miniaturization, hardware and software, transition from analog to digital devices
It is only through data sharing that we unlock the value of data.